Igor’s Tip of the Week #132: Finding “hidden” cross-references

When analyzing firmware or other binaries without metadata, IDA is not always able to discover and analyze all functions, which means that some cross-references can be missing. Let’s say you found a string in the binary (e.g. in the String list) which has no cross-references, but you’re reasonably sure it’s actually used. How can you discover where?

Finding addresses using binary search

One possibility is that the string is referred to by its address value, either from a pointer somewhere, or as an immediate value embedded directly in the instruction (the latter case is more common for CISC instruction sets such as x86). In such a case, searching for the address value should find the reference.

For example, here’s a string in an ARM firmware which currently has no cross-references:

We can try the following:

  1. Select and copy to clipboard the string’s address (C3E31B49);
  2. Go to the start of the database (Ctrl+PgUp or Home, Home, Home);
  3. Invoke binary search (Search > Sequence of bytes…, or Alt+B);
  4. Paste the address and make sure that Hex is selected. It is also recommended to enable Match case to avoid false positives:
  5. Click OK. IDA will automatically convert the value into a byte sequence corresponding to the processor endianness and look for it in the database:
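
To see what the conversion in step 5 amounts to, here is a minimal Python sketch (not IDA’s actual implementation) that packs the address into bytes in the chosen endianness and scans a raw image for them. The function name `find_address_refs`, the tiny `image` buffer and the `0xC3E00000` load base are made up for illustration; only the address value comes from the example above.

```python
import struct

def find_address_refs(image: bytes, address: int, base: int = 0,
                      little_endian: bool = True) -> list:
    """Return addresses where the 32-bit value `address` is stored as raw bytes."""
    # Pack the value in the byte order the processor uses.
    pattern = struct.pack("<I" if little_endian else ">I", address)
    hits = []
    pos = image.find(pattern)
    while pos != -1:
        hits.append(base + pos)           # file offset -> address
        pos = image.find(pattern, pos + 1)
    return hits

# Hypothetical 16-byte image: the address appears once in little-endian
# form (offset 4) and once in big-endian form (offset 12).
image = bytes.fromhex("00000000" "491be3c3" "ffffffff" "c3e31b49")

print([hex(h) for h in find_address_refs(image, 0xC3E31B49, base=0xC3E00000)])
# prints ['0xc3e00004']
```

Picking the wrong endianness finds only the other copy, which is why it matters that the search uses the byte order of the processor.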

The value may initially be displayed as a raw number or even as separate bytes. To convert it to an offset so that a cross-reference is created, you can usually use the O or Ctrl+O shortcuts, or the context menu:

Now the string has a cross-reference and you can look further at where exactly it is used:

Finding addresses using immediate search

Binary search works for addresses embedded as-is into the binary. However, there may be situations where an address is embedded into an instruction not on a byte boundary, or is split between several instructions. For example, RISC-V usually has to use at least two instructions to load a 32-bit value into a register (high 20 bits and low 12 bits). If these instructions are next to each other, IDA can combine them into a single macroinstruction and calculate the full value, but because the value is split between two instructions, binary search won’t find it. However, immediate search (Search > Immediate value…, or Alt+I) should work. Note that if you copy the address from the listing, you’ll need to add the 0x prefix so that IDA parses it as hexadecimal.

NOTE: this approach will succeed only under the following conditions:

  1. the instruction(s) using the address were actually decoded. You can use the approach described in Tip #04 to disassemble the whole binary before looking for cross-references;
  2. the instructions were actually combined into a macro with the full address. For example, if they are interleaved with unrelated instructions, IDA won’t be able to combine them and you may need to look for each part separately.
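
When the two halves have to be searched for separately, it helps to know which value each instruction carries. The sketch below assumes the common RISC-V `lui` + `addi` pair, where `addi` sign-extends its 12-bit immediate, so the upper part is adjusted by one whenever the low 12 bits are 0x800 or above. The helper names are hypothetical, and the address from the ARM example is reused purely as a number.

```python
def split_hi_lo(address: int) -> tuple:
    """Split a 32-bit value into (hi20 for lui, signed lo12 for addi)."""
    lo12 = address & 0xFFF
    if lo12 >= 0x800:
        lo12 -= 0x1000                    # addi sign-extends its immediate
    hi20 = ((address - lo12) >> 12) & 0xFFFFF
    return hi20, lo12

def combine(hi20: int, lo12: int) -> int:
    """Recompute the full value the instruction pair loads."""
    return ((hi20 << 12) + lo12) & 0xFFFFFFFF

hi20, lo12 = split_hi_lo(0xC3E31B49)
print(hex(hi20), lo12)            # prints 0xc3e32 -1207
print(hex(combine(hi20, lo12)))   # prints 0xc3e31b49
```

Here the upper part is 0xC3E32 rather than 0xC3E31, which is easy to miss when looking for the parts by hand. Neither half appears in the binary as the four address bytes, which is why binary search cannot find such references.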

Unfortunately, even the methods described here are not always enough. For example, self-relative offsets will likely require analyzing the code to figure out what they refer to.
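
To illustrate why self-relative offsets escape both searches, here is a rough sketch of one possible encoding: a signed 32-bit little-endian value that is added to the address where it is stored. This is only an assumption for the sake of the example; real binaries may use other widths, or offsets relative to a table start or to an instruction address, so analyzing the code remains the reliable way. The function name and the sample data are hypothetical.

```python
import struct

def find_self_relative_refs(image: bytes, target: int, base: int = 0,
                            step: int = 4) -> list:
    """Return addresses whose stored signed 32-bit value, added to the
    address itself, equals `target` (assumed encoding, see above)."""
    hits = []
    for pos in range(0, len(image) - 3, step):
        (rel,) = struct.unpack_from("<i", image, pos)
        if base + pos + rel == target:
            hits.append(base + pos)
    return hits

# Hypothetical image loaded at 0x1000: the dword at 0x1004 points forward
# to 0x1010, and the dword at 0x100C points back to 0x1000.
image = struct.pack("<4i", 0x7FFF, 0xC, 0x7FFF, -0xC)

print([hex(h) for h in find_self_relative_refs(image, 0x1010, base=0x1000)])
# prints ['0x1004']
```

The stored values (0xC and -0xC) bear no resemblance to the target addresses, and a scan like this can easily produce false positives on real data, so any hit still needs to be confirmed by reading the surrounding code.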

See also: 

Igor’s tip of the week #95: Offsets

Igor’s Tip of the Week #114: Split offsets