
As we’ve mentioned before, the I in IDA stands for interactive, and we already covered some of the disassembly view’s interactive features like renaming or commenting. However, other changes are possible too. For example, you can change the operand representation (sometimes called operand type in documentation). What is it about?

Most assemblers (and disassemblers) represent machine instructions using a mnemonic (which denotes the basic function of the instruction) and the operands on which it acts (commonly delimited by commas). As an example, let’s consider the most common x86 instruction, mov, which copies data from its second operand to its first. A few examples:

mov rsp, r11 – copy the value of r11 to rsp

mov rcx, [rbx+8] – copy a 64-bit value from the address equal to the value of register rbx plus 8 to rcx (C-like equivalent: rcx = *(int64*)(rbx+8);)

mov [rbp+390h+var_380], 2000000h – copy the value 2000000h (0x2000000 in C notation) to the stack variable var_380

The first example uses two registers as operands; the second uses a register and an indirect memory operand with a base register and displacement; the third uses another memory operand and an immediate (a constant value encoded directly in the instruction).
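To make the second example more concrete, here is a minimal Python sketch of what a base-plus-displacement load does. The memory image, its base address 0x1000, and the helper load_qword are all hypothetical and exist only for illustration; they are not part of IDA.

```python
import struct

# Hypothetical memory image: 16 bytes mapped at address 0x1000.
BASE = 0x1000
memory = bytes(range(16))

def load_qword(address: int) -> int:
    """Read a little-endian signed 64-bit value, like *(int64*)address in C."""
    return struct.unpack_from("<q", memory, address - BASE)[0]

rbx = 0x1000               # base register
rcx = load_qword(rbx + 8)  # mov rcx, [rbx+8]
print(hex(rcx))
```

The displacement 8 is just a number stored in the instruction, which is why a disassembler is free to display it in different ways.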

The last two examples are interesting because they involve numbers (displacements and immediates), and the same number can be represented in multiple ways. For example, consider the following instructions:

mov eax, 64h
mov eax, 100
mov eax, 144o
mov eax, 1100100b
mov eax, 'd'
mov eax, offset byte_64
mov eax, mystruct.field_64

All of them have exactly the same byte sequence (machine code) on the binary level: B8 64 00 00 00. So, while picking another operand representation may change the visual aspect, the underlying value and the program behavior do not change. This allows you to choose the variant that best represents the intent behind the code without having to add a long explanation in comments.
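This can be checked outside IDA with a small Python sketch. It relies on the standard x86 encoding of mov eax, imm32 (opcode B8 followed by a little-endian 32-bit immediate); the helper name encode_mov_eax_imm32 is made up for this example. The symbolic forms (offset byte_64, mystruct.field_64) are omitted because they depend on names defined in a particular database.

```python
import struct

def encode_mov_eax_imm32(value: int) -> bytes:
    """Encode `mov eax, imm32`: opcode B8 plus a little-endian 32-bit immediate."""
    return b"\xB8" + struct.pack("<I", value & 0xFFFFFFFF)

# The same number written in the notations used in the listing above.
spellings = {
    "64h": 0x64,
    "100": 100,
    "144o": 0o144,
    "1100100b": 0b1100100,
    "'d'": ord("d"),
}

encodings = {text: encode_mov_eax_imm32(value) for text, value in spellings.items()}

for text, code in encodings.items():
    print(f"mov eax, {text:<9} -> {code.hex(' ').upper()}")
```

Every line printed by the loop shows the same five bytes, since only the spelling of the immediate differs.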

The following representations are available in IDA for numerical operands (some of them may only make sense in specific situations):

  1. Default number representation (aka void): used when no specific override has been applied to the operand (by the user, IDA’s autoanalyzer, or the processor module). The representation actually used depends on the processor module, but the most common fallback is hexadecimal. Uses orange color in the default color scheme. For values which match a printable character in the current encoding, a comment with the character may be displayed (depends on the processor module).
    Hotkey: # (hash sign).
  2. Decimal: shows the operand as a decimal number. Hotkey is H.
  3. Hexadecimal: explicitly show the operand as hexadecimal. Hotkey is Q.
  4. Binary: shows the operand as a binary number. Hotkey is B.
  5. Octal: shows the operand as an octal number. No default hotkey but can be picked from the context menu or the “Operand type” toolbar.
  6. Character: shows the operand as a character constant if possible. Hotkey: R.
  7. Structure offset: replaces the numerical operand with a reference to a structure member with a matching offset. Hotkey: T.
  8. Enumeration (symbolic constant): the number is replaced by a symbolic constant with the same value. Hotkey: M.
  9. Stack variable: the number is replaced by a symbolic reference into the current function’s stack frame. Usually only makes sense for instructions involving stack pointer or frame pointer. Hotkey: K.
  10. Floating-point constant: only works in some cases and for some processors. For example, 3F000000h (0x3F000000) is the IEEE-754 single-precision encoding of the number 0.5. There is no default hotkey, but the conversion can be performed via the toolbar or main menu.
  11. Offset operand: replaces the number with an expression involving one or more addresses in the program. Hotkeys: O, Ctrl+O or Ctrl+R (for complex offsets).
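The purely numerical representations from the list can be approximated in a few lines of Python. This is only a sketch of the idea, not how IDA implements it: the function names are invented for this example, and the suffix-style formatting (h, o, b) simply imitates the notation used in the listings above.

```python
import struct

def ida_style_hex(value: int) -> str:
    """Hex with an 'h' suffix; a leading 0 is added when the first digit is a letter."""
    digits = f"{value:X}"
    if digits[0].isalpha():
        digits = "0" + digits
    return digits + "h"

def as_float32(value: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", value & 0xFFFFFFFF))[0]

def representations(value: int) -> dict:
    """Render one non-negative integer in several numerical notations."""
    reps = {
        "hexadecimal": ida_style_hex(value),
        "decimal": str(value),
        "octal": f"{value:o}o",
        "binary": f"{value:b}b",
    }
    # A character form only makes sense for printable ASCII values.
    if 0x20 <= value <= 0x7E:
        reps["character"] = f"'{chr(value)}'"
    return reps

for name, text in representations(0x64).items():
    print(f"{name:<12} {text}")
print("float32 of 3F000000h:", as_float32(0x3F000000))
```

The structure offset, enumeration, stack variable, and offset representations are not covered here because they depend on information stored in the database (types, frames, and addresses) rather than on the number alone.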

All hotkeys revert to the default representation if applied twice.

In addition to the hotkeys, the most common conversions can be done via the context menu. The full list is available in the main menu (Edit > Operand Type) as well as on the “Operand Type” toolbar.

Two more transformations can be applied to an operand on top of changing its numerical base:

  1. Negation. Hotkey _ (underscore). Can be used, for example, to show -8 instead of 0FFFFFFF8h (two representations of the same binary value).
  2. Bitwise negation (aka inversion or binary NOT). Hotkey: ~ (tilde). For example, 0FFFFFFF8h is considered to be the same as not 7.
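Both transformations follow from 32-bit two's complement arithmetic, which can be verified with a short Python sketch. The helper names below are illustrative only, and a 32-bit operand size is assumed.

```python
MASK32 = 0xFFFFFFFF

def as_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def bitwise_not32(value: int) -> int:
    """Invert all 32 bits of the value."""
    return ~value & MASK32

raw = 0xFFFFFFF8
print(as_signed32(raw))         # the value shown after negation
print(bitwise_not32(raw))       # the operand of "not" after bitwise negation
print(hex(bitwise_not32(7)))    # back to the original bit pattern
```

In both cases the bytes in the instruction stay the same; only the displayed expression changes.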

Finally, if you want to see something completely custom which is not covered by the existing conversions, you can use a manual operand. This allows you to replace the operand with arbitrary text; it is not checked by IDA, so it’s up to you to ensure that the new representation matches the original value. Hotkey: Alt+F1.