The Hex-Rays decompiler was originally created to decompile C code, so its pseudocode output uses (mostly) C syntax. However, the input binaries may have been compiled from other languages: C++, Pascal, Basic, Ada, and many others. Code in most of them can be represented in C without real issues, but some have peculiarities which require language extensions or have to be handled with user input. A few languages use approaches so different from standard compiled C code that they need special handling. For example, Go uses a calling convention (stack-based or register-based) so different from standard C calling conventions that custom support for it had to be added to IDA.
Even with custom calling conventions, one fundamental limitation of IDA’s type system remains (as of IDA 8.0): a function may return only a single value. However, even in otherwise C-style programs you may encounter functions which return more than one value. One example is compiler helpers like idivmod/uidivmod, which return the quotient and the remainder of a division operation simultaneously. The decompiler knows about the standard ones (e.g. __aeabi_idivmod for ARM EABI) but you may encounter a non-standard implementation, or an unrelated function using a similar approach (e.g. a function written manually in assembly).
Because the decompiler does not expect a function to return more than one value, you may need to inspect the disassembly or look at the call site to recognize such functions. For example, here’s a fragment of decompiled ARM32 code which seems to use an undefined register value:
The function seems to modify the R1 register, although normally return values (for 32-bit types) are placed in R0. Possibly this is an equivalent of a divmod function which returns the quotient in R0 and the remainder in R1?
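To make the suspected convention concrete, here is a minimal C sketch that models the R1:R0 register pair as one 64-bit value, with the quotient in the low word (R0) and the remainder in the high word (R1). The names model_divmod, low_word and high_word are hypothetical and exist only for illustration; this is a model of the idea, not the code of any real helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a divmod helper: the low 32 bits stand in for R0
   (quotient) and the high 32 bits stand in for R1 (remainder). */
static uint64_t model_divmod(int32_t num, int32_t den)
{
    assert(den != 0);                      /* division by zero is not modeled */
    uint32_t quot = (uint32_t)(num / den);
    uint32_t rem  = (uint32_t)(num % den);
    return ((uint64_t)rem << 32) | quot;
}

/* What a caller that only looks at R0 sees. */
static int32_t low_word(uint64_t pair)
{
    return (int32_t)(uint32_t)pair;
}

/* What a caller has to read from R1 to get the second result. */
static int32_t high_word(uint64_t pair)
{
    return (int32_t)(uint32_t)(pair >> 32);
}
```

If the prototype declares only a 32-bit return value, the decompiler tracks just the low word, so a later read of R1 looks like a use of an undefined register.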
To handle this, we can use an artificial structure together with a custom calling convention which specifies the registers and/or stack locations where the structure is placed. For example, add a struct like this to Local Types:
struct divmod_t { int quot; int rem; };
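and set the function prototype: divmod_t __usercall my_divmod@<R1:R0>(int@<R0>, int@<R1>); Here the return location R1:R0 is a register pair holding the whole 8-byte structure, so the first field (quot) lands in R0 and the second (rem) in R1.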
The decompiler then interprets the register values after the call as if they were structure fields:
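As a rough source-level analogue (plain C written for illustration, not actual decompiler output), the prototype above corresponds to a function that returns the struct by value, with a caller that reads both fields. The body of my_divmod and the digit_sum caller are assumptions made up for this sketch.

```c
#include <assert.h>

/* Same layout as the struct added to Local Types in the text. */
typedef struct divmod_t { int quot; int rem; } divmod_t;

/* Plain C stand-in for the routine; the real one might be hand-written
   assembly that leaves quot in R0 and rem in R1. */
static divmod_t my_divmod(int num, int den)
{
    assert(den != 0);
    divmod_t r;
    r.quot = num / den;
    r.rem  = num % den;
    return r;
}

/* Hypothetical caller: sums the decimal digits of a non-negative value,
   using both results of every call. */
static int digit_sum(int value)
{
    int sum = 0;
    while (value > 0) {
        divmod_t r = my_divmod(value, 10);
        sum += r.rem;      /* second return value (R1 in the text) */
        value = r.quot;    /* first return value (R0 in the text) */
    }
    return sum;
}
```

With the struct-returning prototype applied, accesses like r.quot and r.rem are what replace the seemingly undefined register in the pseudocode.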
A similar approach may be used for languages with native support for functions with multiple return values: Go, Swift, Rust, etc.
See also:
Igor’s tip of the week #51: Custom calling conventions