We covered function chunks last week, and today we’ll show an example of how to use them in practice to handle a common compiler optimization.
When working with some ARM firmware, you may sometimes run into the following situation:
We have the decompilation of sub_8098C, which ends with a strange JUMPOUT statement, and if we look at the disassembly, we can see that it corresponds to a branch to a POP.W instruction in another function (sub_8092C). What happened here?
This is an example of a code size optimization. The POP.W instruction is 4 bytes long, while the B branch is only two, so by reusing it the compiler saves two bytes. It may not sound like much, but such savings can accumulate to something substantial over all functions of the binary. Also, sometimes longer sequences of several instructions may be reused, leading to bigger savings.
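To get a feel for how the savings add up, here is a rough back-of-the-envelope sketch. The instruction sizes come from the text above; the helper name bytes_saved and the assumption that exactly one function keeps the real POP.W while every other function replaces its own copy with a short B are illustrative only.

```python
POP_W_SIZE = 4  # bytes, as stated in the text
B_SIZE = 2      # bytes, as stated in the text


def bytes_saved(sharing_functions):
    """Estimate the bytes saved when several functions share one POP.W tail.

    Assumption: one function keeps the real instruction and each of the
    others replaces its own copy with a 2-byte branch to it.
    """
    if sharing_functions < 1:
        raise ValueError("at least one function must contain the tail")
    return (sharing_functions - 1) * (POP_W_SIZE - B_SIZE)


print(bytes_saved(2))  # two functions sharing one tail
print(bytes_saved(7))  # seven functions sharing one tail
```

Under these assumptions the saving grows linearly with the number of functions sharing the tail, and proportionally more when the shared sequence is longer than a single instruction.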
Can we fix the database to get clean decompilation and get rid of JUMPOUT? Of course, the answer is yes, but the specific steps may not be obvious, so let’s describe some approaches.
First we need to create a chunk for the shared instructions (in our example, the POP.W instruction). A chunk can be created only from instructions which do not yet belong to any function, so the easiest way is to delete the function that currently contains them so that the instructions become “free”. This can be done either from the Functions window, via the Edit > Functions > Delete function menu entry, or from the modal “jump to function” list (Ctrl-P, Del).
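Once deleted, the shared tail instructions can be added as a chunk to the other function. This can be done manually, by appending them as a function tail and choosing the referencing function as the owner (in our example, sub_8098C); normally IDA should suggest it automatically. Or (semi)automatically, starting from the referencing branch (the one named in the CODE XREF: sub_8098C+3E↓j comment). Either solution will create the chunk and mark it as belonging to the referencing function.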
We can check that it is contained in the function graph:
And the pseudocode no longer has a JUMPOUT:
We “solved” the problem for one function, but in the process we’ve destroyed the function which contained the shared tail. If we need to decompile it too, we can try to recreate it:
However, IDA ends it before the chunk, because it’s now a part of another function. And if we decompile it, we get the same JUMPOUT issue:
The solution is simple: as mentioned in the previous post, a chunk may belong to multiple functions, so we just need to attach the chunk to this function (sub_8092C) too. The chunk gains one more owner, appears in the function graph, and the decompilation is fixed:
The above example had a tail shared by two functions, but of course this is not the limit. Consider this example:
Here, the POP.W instruction is shared by seven functions, and two of them also reuse the ADD SP, SP, #0x10 instruction preceding it. There is also a chunk which belongs only to one function but it had to be separated because the function was no longer contiguous. Still, IDA’s approach to fragmented functions was flexible enough to handle it with some manual help and all involved functions have proper control flow graphs and nice decompilation.
To summarize, the suggested algorithm of handling shared tail optimization is as follows: