As we’ve mentioned before, the same numerical value can be represented in different ways even though it is the same bit pattern on the binary level. One of the representations used in IDA is the offset.
In IDA, an offset is a numerical value which is used as an address (either directly or as part of an expression) to refer to another location in the program.
The term comes from the keyword used in MASM (Microsoft Assembler) to distinguish an address expression from a variable.
For example:
mov eax, g_var1
Loads the value from the location g_var1 into register eax. In C, this would be equivalent to using the variable’s value.
While
mov eax, offset g_var1
Loads the address of the location g_var1 into eax. In C, this would be equivalent to taking the variable’s address.
On the binary level, the second instruction is equivalent to moving a simple integer, e.g.:
mov eax, 0x40002000
However, during analysis the offset form is preferred, both for readability and because it lets you see cross-references to variables and quickly identify other places where the variable is used.
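The difference between the two forms can be sketched with a toy model. This is only an illustration, not how IDA or a CPU works internally: the symbol table, the memory dictionary, and the stored value 1234 are all assumptions made up for the example; only the address 0x40002000 comes from the text above.

```python
# Toy model of the two MASM forms. SYMBOLS, MEMORY and the value 1234 are
# hypothetical; the address matches the example immediate used in the text.
SYMBOLS = {"g_var1": 0x40002000}   # symbol name -> address
MEMORY = {0x40002000: 1234}        # address -> value stored at that address


def mov_value(name):
    """Model of `mov eax, g_var1`: read the memory at the symbol's address."""
    return MEMORY[SYMBOLS[name]]


def mov_offset(name):
    """Model of `mov eax, offset g_var1`: the address itself is the operand."""
    return SYMBOLS[name]


print(hex(mov_value("g_var1")))    # 0x4d2 (the stored value, 1234)
print(hex(mov_offset("g_var1")))   # 0x40002000 (same as a plain immediate)
```

In the second case the result is indistinguishable from loading the plain integer 0x40002000, which is why a disassembler has to decide whether to display the operand as a number or as an offset.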
In general, distinguishing integer values used in instructions from addresses is impossible without whole-program analysis or runtime tracing, but the majority of cases can be handled by relatively simple heuristics, so IDA is usually able to recover offset expressions and add cross-references. However, in some cases these heuristics may fail or produce false positives, so you may need to convert operands manually.
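To give a feel for what a "relatively simple heuristic" might look like, here is a minimal sketch that treats an immediate as an offset candidate when it falls inside a known segment. This is not IDA's actual algorithm, which is more elaborate; the segment names and ranges are hypothetical.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    name: str
    start: int  # first address of the segment (inclusive)
    end: int    # one past the last address (exclusive)


# Hypothetical memory layout of the analyzed program.
SEGMENTS = [
    Segment(".text", 0x40001000, 0x40002000),
    Segment(".data", 0x40002000, 0x40003000),
]


def find_segment(value: int) -> Optional[Segment]:
    """Return the segment containing `value`, or None if there is none."""
    for seg in SEGMENTS:
        if seg.start <= value < seg.end:
            return seg
    return None


def looks_like_offset(value: int) -> bool:
    """Naive check: any immediate that lands in a segment is a candidate."""
    return find_segment(value) is not None


for imm in (0x40002000, 0x10, 0x40001ABC):
    print(hex(imm), looks_like_offset(imm))
```

A range check like this cannot tell an address from a constant that merely happens to fall in the same range (a bit mask, for instance), which is the kind of false positive that a simple heuristic is prone to.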
All options for converting to offsets are available under Edit > Operand type > Offset:
In most modern, flat-memory model binaries such as ELF, PE, or Mach-O, the first two commands in that submenu are equivalent, so you can usually use the shortcut O or Ctrl-O.
The most common/applicable options are also shown in the context (right-click) menu:
There may be cases when IDA’s heuristics convert a value to an offset when it’s not actually being used as one. One common example is bitwise operations done with values which happen to be in the range of the program’s address space, but it can also happen for data values or simple data movement, as in the screenshot below.
In this example, IDA has converted the second operand of the mov instruction to an offset because it turned out to match a program address. However, we can see that it is being moved into a location returned by the call to the __errno function. This is a common way for compilers to implement setting the errno pseudo-variable (which can be thread-specific instead of a global), so that operand should be a number and not an offset. Besides being a wrong representation, this also leads to bogus cross-references:
You have the following options to fix the false positive:
See also:
IDA Help: Edit|Operand types|Offset submenu
IDA Help: Edit|Operand types|Number submenu