Hex-Rays' blog

Finding instructions – Hex Rays

Written by Elias Bachaalany | Sep 21, 2009

Searching for instructions and opcodes is a basic necessity for security researchers. IDA Pro provides many search facilities, among them:

  • Text search: Used to search the listing for text patterns (regular expressions are allowed). For example, one can write a regular expression to find any assignment to the eax register made with the mov instruction.
  • Binary search: Allows you to search for binary patterns with wildcard support. It is also possible to search for strings alongside the binary patterns.
  • Immediate search: Very useful for finding constants and magic numbers used in the program.
  • Please refer to the search menu for other search facilities.
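
To make the text search example concrete, here is a minimal sketch of such a regular expression, written with Python's re module and run against a few made-up listing lines. The pattern, the helper name find_eax_assignments and the sample lines are assumptions for illustration; the regular expression dialect accepted by IDA's text search dialog may differ in details.

```python
import re

# Matches "mov" followed by whitespace and "eax" as the destination operand.
# Illustrative pattern only; IDA's own regex dialect may differ.
MOV_EAX_RE = re.compile(r"\bmov\s+eax\s*,", re.IGNORECASE)


def find_eax_assignments(lines):
    """Return the listing lines that assign to eax with a mov instruction."""
    return [line for line in lines if MOV_EAX_RE.search(line)]


# Hypothetical disassembly lines, not taken from a real database.
sample = [
    "mov     eax, [ebp+8]",
    "mov     ebx, eax",
    "xor     eax, eax",
    "mov     eax, 1",
]

for line in find_eax_assignments(sample):
    print(line)
```

Note that this only catches mov; other ways of setting eax (xor, lea, pop and so on) need their own patterns.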

None of these search facilities lets us search directly for instructions and opcodes. To do that, one has to assemble the instruction in question and then use the binary search to find the resulting byte pattern.

Each processor module in IDA can implement the assemble notification callback:

assemble,               // Assemble an instruction
                        // (display a warning if an error is found)
                        // args:
                        //  ea_t ea -  linear address of instruction
                        //  ea_t cs -  cs of instruction
                        //  ea_t ip -  ip of instruction
                        //  bool use32 - is 32bit segment?
                        //  const char *line - line to assemble
                        //  uchar *bin - pointer to output opcode buffer
                        // returns size of the instruction in bytes

Once a processor module implements this callback, instructions can be assembled by calling ph.notify() with the assemble notification code (see the forum discussion on this topic).
Currently, only the pc processor module implements this callback and provides a very basic assembler.
We wrote a script that lets you search for opcodes and assembly statements. For example, to find “33 c0” (xor eax, eax) followed by “pop ebp” and then “ret”, we could search like this:

find("33 c0;pop ebp;ret")

Here is how the script works, in brief:

  1. Do some initial input validation
  2. Split the patterns
  3. Loop:
    1. Determine if the pattern is an assembly instruction or opcode list (using a simple regular expression)
    2. If pattern is an instruction then assemble it
    3. Accumulate the assembled (or converted opcodes) into a single buffer
  4. Now that we have one single binary buffer we can search for it with FindBinary()
  5. Display the result
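
The steps above can be sketched roughly as follows. This is not the original script: it is a minimal illustration of steps 1 to 3 that runs outside IDA, so the assembler is passed in as a callable. The names build_search_buffer, to_binary_pattern and fake_assemble are hypothetical, and fake_assemble only knows the two instructions from the example query. Inside IDA, the callable would wrap Assemble() and the resulting pattern string would be handed to FindBinary().

```python
import re

# A pattern counts as an opcode list if it is made only of hex byte pairs,
# e.g. "33 c0" or "5D". Anything else is treated as an instruction.
HEX_BYTES_RE = re.compile(r"^[0-9a-fA-F]{2}(\s+[0-9a-fA-F]{2})*$")


def build_search_buffer(query, assemble):
    """Turn a query such as "33 c0;pop ebp;ret" into a single bytes object.

    `assemble` is an assumed callable: it takes one instruction string and
    returns its encoding as bytes.
    """
    # Step 1: basic input validation.
    if not query or not query.strip():
        raise ValueError("empty search pattern")

    buf = bytearray()
    # Step 2: split the patterns on ';'.
    for pattern in query.split(";"):
        pattern = pattern.strip()
        if not pattern:
            continue  # tolerate stray separators such as "ret;"
        # Step 3: opcode list or instruction? Accumulate either way.
        if HEX_BYTES_RE.match(pattern):
            buf += bytes.fromhex("".join(pattern.split()))
        else:
            buf += assemble(pattern)
    return bytes(buf)


def to_binary_pattern(buf):
    """Format bytes as a space-separated hex string for a binary search."""
    return " ".join("%02X" % b for b in buf)


# Stand-in assembler so the sketch runs without IDA (x86 encodings).
_FAKE_OPCODES = {"pop ebp": b"\x5d", "ret": b"\xc3"}


def fake_assemble(line):
    return _FAKE_OPCODES[line.lower()]


buf = build_search_buffer("33 c0;pop ebp;ret", fake_assemble)
print(to_binary_pattern(buf))  # 33 C0 5D C3
```

Taking the assembler as a parameter keeps the splitting and accumulation logic testable on its own; only the last two steps (searching and displaying the result) depend on the IDA API.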



The script uses the Assemble() function (available in IDAPython r233 and above). Comments and suggestions are welcome.