
We covered how to search for things in choosers (list views), but what if you need to look for something elsewhere in IDA?

Text search

When searching for textual content, the same shortcut pair (Alt+T to start, Ctrl+T to continue) works almost anywhere IDA shows text:

  • Disassembly (IDA View)
  • Hex View
  • Decompiler output (Pseudocode)
  • Output window
  • Structures and Enums windows
  • Choosers (list views)

This search matches text anywhere in the current view, for example in both instructions and comments (if present).

For the main windows, the action is also accessible via the Search > Text… menu.

The “(slow!)” notice is there because, for text searching, IDA has to render every text line in the range being searched, which can get quite slow, especially for big binaries. However, if you need features such as regexp matching or searching for text in comments, the wait can be worth it.
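To illustrate why text search is flexible but costly, here is a minimal Python sketch. It is not IDA's implementation: it assumes the view has already been rendered into (address, text) pairs, and the `text_search` helper and the sample lines are hypothetical.

```python
import re


def text_search(rendered_lines, pattern, regex=False):
    """Return the addresses of rendered lines that contain the pattern.

    rendered_lines: iterable of (address, text) pairs (hypothetical model
    of a view's output). Every line has to exist as text before it can be
    matched, which is what makes this kind of search expensive.
    """
    matcher = re.compile(pattern if regex else re.escape(pattern))
    return [addr for addr, text in rendered_lines if matcher.search(text)]


# Hypothetical rendered disassembly lines, including comments.
lines = [
    (0x401000, 'push    offset aError ; "Error"'),
    (0x401005, "call    sub_401200   ; report error to user"),
    (0x40100A, "test    dword ptr [eax], 10000h"),
]

print([hex(a) for a in text_search(lines, "10000h")])
print([hex(a) for a in text_search(lines, r"[Ee]rror", regex=True)])
```

The second call matches both an operand name and a comment, because the search only sees the rendered text and does not distinguish between the two.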

Binary search

Available as the shortcut pair Alt+B/Ctrl+B, or Search > Sequence of bytes…, this feature allows searching for byte sequences (including string literals) and patterns in the database (including process memory during debugging).

The input line accepts the following inputs:

  1. Byte sequence (space-delimited): 01 02 03 04
  2. Byte sequence with wildcard bytes represented by question marks: 68 ? ? ? 0 will match both 68 C4 1A 48 00 and 68 D8 1A 48 00.
  3. One or more numbers in the selected radix (hexadecimal, decimal or octal). Each number is converted to the minimal necessary number of bytes according to the current processor's endianness. For example, 04469E0 is converted to E0 69 44 on x86 (a little-endian processor). This feature is useful for finding values in data areas or embedded in instructions (immediates).
  4. Quoted string literals, for example "Error". The string is converted to bytes using the encoding specified in the encoding selector. If “All Encodings” is selected, the search is performed using all configured encodings.
  5. Wide-character string constants (e.g. L"test"). Only UTF-16 is used to convert such strings to raw bytes.
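The number conversion and wildcard matching described above can be sketched in plain Python. This is only an illustration of the semantics, not IDA's code: the helper names are hypothetical, and the pattern parser assumes that every token is either `?` or a single hexadecimal byte.

```python
def number_to_bytes(value, little_endian=True):
    """Encode a non-negative integer in the fewest bytes that can hold it."""
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "little" if little_endian else "big")


def parse_pattern(text):
    """Turn '68 ? ? ? 0' into [0x68, None, None, None, 0x00] (None = wildcard)."""
    return [None if tok == "?" else int(tok, 16) for tok in text.split()]


def find_pattern(data, pattern, start=0):
    """Return the first offset >= start where the pattern matches, or -1."""
    for off in range(start, len(data) - len(pattern) + 1):
        if all(p is None or data[off + i] == p for i, p in enumerate(pattern)):
            return off
    return -1


# Numbers become minimal little-endian byte sequences.
print(number_to_bytes(0x04469E0).hex(" "))  # e0 69 44
print(number_to_bytes(0x10000).hex(" "))    # 00 00 01

# Made-up bytes containing the two sequences from the wildcard example.
blob = bytes.fromhex("90 68 C4 1A 48 00 90 68 D8 1A 48 00")
pattern = parse_pattern("68 ? ? ? 0")
first = find_pattern(blob, pattern)
second = find_pattern(blob, pattern, first + 1)
print(first, second)  # 1 7
```

Passing `first + 1` as the start offset plays the role of the "continue" shortcut: it resumes the search just past the previous hit.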

Immediate search

As mentioned previously, the same instruction operand can be represented in different ways in IDA. For example, an instruction like

test dword ptr [eax], 10000h

can also be displayed as

test dword ptr [eax], 65536

or even

test dword ptr [eax], AW_HIDE

So if you do a text search for 10000h, IDA will find the first variation but not the other two. On x86, you can use binary search for 10000 hex (it will be converted to the byte sequence 00 00 01), but this will not work for processors whose instruction encodings are not aligned to byte boundaries, and it may give many false positives if unrelated instructions happen to match the byte sequence. This is why immediate search is preferable:

  1. it only checks instructions with numerical operands or data items, improving search speed and reducing false positives;
  2. it compares the numerical value of the operand, so a change in representation does not prevent the match, meaning it will find any of the three variations above.

Immediate search is available as the shortcut pair Alt+I/Ctrl+I, or via Search > Immediate value…

The value can be entered in any numerical base using the C syntax (decimal, hex, octal).
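The difference between matching text and matching values can be shown with a small Python sketch. It is a simplified model, not IDA's implementation: `parse_c_number` is a hypothetical helper for C-style literals, and each operand is modelled as a (displayed text, numeric value) pair.

```python
def parse_c_number(text):
    """Parse a non-negative integer literal written in C syntax."""
    t = text.strip().lower()
    if t.startswith("0x"):
        return int(t[2:], 16)   # hexadecimal: 0x10000
    if len(t) > 1 and t.startswith("0"):
        return int(t[1:], 8)    # octal: 0200000
    return int(t, 10)           # decimal: 65536


# The same instruction with three different operand representations.
operands = [
    ("test dword ptr [eax], 10000h", 0x10000),
    ("test dword ptr [eax], 65536", 0x10000),
    ("test dword ptr [eax], AW_HIDE", 0x10000),
]

wanted = parse_c_number("0x10000")
text_hits = [line for line, _ in operands if "10000h" in line]
value_hits = [line for line, value in operands if value == wanted]

print(len(text_hits), len(value_hits))  # 1 3
```

All three spellings of the search value (0x10000, 65536, 0200000) parse to the same number, so the value comparison finds every variation, while the text comparison finds only the one displayed as 10000h.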

Search direction

By default, all searches are performed “down” from the current position, i.e. toward increasing addresses. You can change this by checking “Search Up” in the individual search dialogs, or beforehand via Search > Search direction. The currently set value is displayed in the menu item as well as in IDA’s status bar.

The “search next” commands and shortcuts (Ctrl+T, Ctrl+B, Ctrl+I) also use this setting.
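As a rough model of how the direction setting affects “search next”, the following Python sketch works on a flat byte buffer. The `search_next` helper is hypothetical, and it assumes that a hit at the current position itself is skipped; IDA's exact behavior may differ.

```python
def search_next(data, needle, pos, search_up=False):
    """Return the offset of the nearest hit after (or before) pos, or -1."""
    if search_up:
        # Limit the window so that only hits starting before pos can match.
        return data.rfind(needle, 0, pos + len(needle) - 1)
    # Searching down: start just past the current position.
    return data.find(needle, pos + 1)


data = b"ab..ab..ab"
print(search_next(data, b"ab", 4))                  # 8, toward higher offsets
print(search_next(data, b"ab", 4, search_up=True))  # 0, toward lower offsets
```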

 

Find all occurrences

This checkbox collects all hits over the whole database or view into a list, which you can then inspect at your leisure instead of stepping through the hits one by one.

Picking the search type

This is not a definitive guide, but here are some suggestions:

  1. text (e.g. prompt or error message) displayed by the program: binary search for the quoted substring (NB: this will not work if the string is not hardcoded but is in an external file or resource stream not loaded by IDA).
  2. magic constant or error code: immediate search (in some cases binary search for the value can work too).
  3. an address to which there are no apparent cross references: binary search for the address value (will only succeed if the reference actually uses the value directly without calculating it in some way).
  4. specific instruction opcode pattern: binary search for byte sequence (possibly with wildcard bytes).
  5. instruction not having a fixed encoding: text search for mnemonic and/or operands (possibly as regexp).

More info: Search submenu