When decompiling code without high-level metadata (especially firmware), you may observe strange-looking address expressions which do not seem to make sense.
What are these, and how can you fix or improve the pseudocode?
Because at the CPU level there is no difference between an address and a simple number, distinguishing addresses from plain numbers is a difficult task which is not solvable in the general case without actually executing the code. IDA uses some heuristics to try to detect when a number looks like an address and convert such numbers to offsets, but these heuristics are not always reliable and may lead to false positives. This can be especially bad when the database has valid addresses around 0, because then many small numbers look like addresses.
The decompiler relies on IDA’s analysis and uses the information provided by it to produce pseudocode which is supposed to faithfully represent the behavior of the machine code. However, this can backfire if the analysis made a mistake. Thankfully, IDA is interactive and allows you to fix almost anything.
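To see why a database with valid addresses around 0 is so prone to false positives, consider a deliberately naive range check. This is only an illustration and not IDA's actual heuristic; the function name `looks_like_address` and both memory layouts are hypothetical.

```python
def looks_like_address(value, segments):
    """Naive check: does `value` fall inside any (start, end) segment range?"""
    return any(start <= value < end for start, end in segments)


# Hypothetical memory layouts, chosen only for illustration.
high_ram = [(0x20000000, 0x20010000)]  # nothing mapped near zero
low_flash = [(0x0, 0x8000)]            # code/data mapped starting at address 0

# Ordinary small constants such as sizes, counters and flags.
constants = [4, 0x10, 0x100, 1000]

print([looks_like_address(c, high_ram) for c in constants])
print([looks_like_address(c, low_flash) for c in constants])
```

With the high mapping none of the constants is mistaken for an address, while with the mapping that starts at 0 every one of them falls inside a valid range.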
In a situation like the one above, the simplest approach is usually as follows:
- position the cursor on the wrong address expression
- press Tab to switch to disassembly. You should land on or close to the wrong offset expression. Note that it does not always match what you see in the pseudocode.
- convert it to a plain number, e.g. by pressing Q (hex), H (decimal) or # (default).
- press Tab to switch back to pseudocode and F5 to refresh it. The wrong expression should be converted to a plain number or another context-dependent expression.
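If the same false offset shows up in many places, the conversion step can also be scripted. The following is a minimal sketch, assuming IDAPython's `ida_bytes.op_hex(ea, n)`, which is the scripted counterpart of pressing Q on an operand. The helper name `operands_as_hex`, its `op_hex` parameter and the addresses in the usage comment are hypothetical; you still need to refresh the pseudocode with F5 afterwards.

```python
try:
    import ida_bytes  # IDAPython module, available only inside IDA
except ImportError:   # lets the file be imported outside IDA
    ida_bytes = None


def operands_as_hex(targets, op_hex=None):
    """Display the given operands as plain hex numbers.

    `targets` is an iterable of (ea, operand_number) pairs.
    Returns the pairs for which the conversion failed.
    """
    if op_hex is None:
        if ida_bytes is None:
            raise RuntimeError("run inside IDA or pass an op_hex callable")
        op_hex = ida_bytes.op_hex
    return [(ea, n) for ea, n in targets if not op_hex(ea, n)]


# Usage inside IDA (hypothetical addresses, operand 1 of each instruction):
# failed = operands_as_hex([(0x08000124, 1), (0x08000130, 1)])
```

The `op_hex` parameter exists only so the helper can be exercised outside IDA with a stand-in function; inside IDA it can be left at its default.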