Igor’s tip of the week #51: Custom calling conventions

The Hex-Rays decompiler was originally created to deal with code produced by standard C compilers. In that world, everything is (mostly) nice and orderly: the calling conventions are known and standardized, and arguments are passed to functions according to the ABI.

However, real life is not that simple: even in code coming from standard compilers there may be helper functions accepting arguments in non-standard locations, code written in assembly, or whole program optimization causing the compiler to use custom calling conventions for often-used functions. And code created with non-C/C++ compilers may use completely different calling conventions (a notable example is Go).

Thus a need arose to describe such conventions so that the decompiler can produce readable output when they’re used. To address it, support for specifying custom calling conventions has been added to IDA and the decompiler.

Usercall

The most commonly used custom calling convention is specified using the keyword __usercall. The basic syntax is as follows:

{return type} __usercall funcname@<return argloc>({type} arg1, {type} arg2@<argloc>, ...);

where argloc is one of the following:

  • a processor register name, e.g. eax, ebx, esi etc. In some cases flag registers (zf, sf, cf etc.) may be accepted too.
  • a register pair delimited with a colon, e.g. <edx:eax>.

The register size should match the size of the argument or return type (if the function returns void, the return argloc must be omitted). Arguments without location specifiers are assumed to be passed on the stack according to the usual rules.
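To make the syntax concrete, here is a minimal Python sketch that assembles a __usercall declaration from its parts. The helper name usercall_decl and the sample functions (sum, mul64, reset) are hypothetical and only serve as illustration; the sketch enforces just the one rule stated above (no return location for void) and does not check register sizes.

```python
from typing import Optional, Sequence, Tuple

# (type, name, location); location is None for ordinary stack arguments.
Arg = Tuple[str, str, Optional[str]]


def usercall_decl(ret_type: str, name: str, ret_loc: Optional[str],
                  args: Sequence[Arg]) -> str:
    """Build a __usercall prototype string (hypothetical helper)."""
    if ret_type == "void" and ret_loc is not None:
        # A void function has no return value, so no return location.
        raise ValueError("void functions must not have a return argloc")
    ret = f"@<{ret_loc}>" if ret_loc else ""
    params = ", ".join(
        f"{typ} {arg}" + (f"@<{loc}>" if loc else "")
        for typ, arg, loc in args
    )
    return f"{ret_type} __usercall {name}{ret}({params});"


# Two register arguments plus one regular stack argument, result in eax.
print(usercall_decl("int", "sum", "eax",
                    [("int", "a", "ebx"), ("int", "b", "ecx"), ("int", "c", None)]))
# A 64-bit result returned in the edx:eax register pair.
print(usercall_decl("__int64", "mul64", "edx:eax",
                    [("int", "a", "eax"), ("int", "b", "ecx")]))
# A void function: the return location is omitted.
print(usercall_decl("void", "reset", None, [("int", "flags", "esi")]))
```

The resulting strings follow the template shown earlier and are meant to be entered wherever IDA accepts a function prototype.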

Scattered argument locations

In complicated situations a large argument (such as a structure instance) may be passed in multiple registers and/or stack slots. In such a case the following descriptors can be used:

  • a partial register location: argoff:register^regoff.size.
  • a partial stack location: argoff:^stkoff.size.
  • a comma-delimited list of partial register and/or stack locations that together cover the whole argument.

Where:

  • argoff – offset within the argument
  • stkoff – offset in the stack frame (the first stack argument is at offset 0)
  • register – register name used to pass part of the argument
  • regoff – offset within the register
  • size – number of bytes for this portion of the argument

regoff and size can be omitted if there is no ambiguity (i.e. the whole register is used).

For example, a 12-byte structure passed in RDI and RSI could be specified like this:

void __usercall myfunc(struc_1 s@<0:rdi, 8:rsi.4>);
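One way to sanity-check such a descriptor before using it is to parse it and verify that the pieces cover the argument without gaps. The Python sketch below does that under several assumptions: offsets and sizes are written in decimal, a piece with an omitted size uses a whole 8-byte register, and the names Piece, parse_scattered and covers are hypothetical. It is not how IDA itself parses the location.

```python
import re
from typing import List, NamedTuple, Optional


class Piece(NamedTuple):
    argoff: int               # offset within the argument
    register: Optional[str]   # None for a stack piece
    offset: int               # regoff or stkoff (0 if omitted)
    size: Optional[int]       # None if omitted in the descriptor


# argoff:register^regoff.size  or  argoff:^stkoff.size
_PIECE = re.compile(r"^(\d+):([A-Za-z_]\w*)?(?:\^(\d+))?(?:\.(\d+))?$")


def parse_scattered(spec: str) -> List[Piece]:
    """Split a scattered argloc (text between @< and >) into pieces."""
    pieces = []
    for item in spec.split(","):
        m = _PIECE.match(item.strip())
        if m is None:
            raise ValueError(f"bad piece: {item!r}")
        argoff, reg, off, size = m.groups()
        if reg is None and off is None:
            raise ValueError(f"piece needs a register or ^stkoff: {item!r}")
        pieces.append(Piece(int(argoff), reg, int(off or 0),
                            int(size) if size else None))
    return pieces


def covers(pieces: List[Piece], total: int, reg_size: int = 8) -> bool:
    """True if the pieces tile bytes [0, total) with no gap or overlap."""
    pos = 0
    for p in sorted(pieces, key=lambda p: p.argoff):
        if p.argoff != pos:
            return False
        # Assumption: an omitted size means the whole register is used.
        pos += p.size if p.size is not None else reg_size
    return pos == total


pieces = parse_scattered("0:rdi, 8:rsi.4")
print(pieces)
print(covers(pieces, 12))
```

For the 12-byte structure above, the first piece supplies bytes 0..7 from rdi and the second supplies bytes 8..11 from the low four bytes of rsi, so the coverage check succeeds.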

Userpurge

The __userpurge calling convention is equivalent to __usercall, except that the callee is assumed to adjust the stack to account for arguments passed on the stack (this is similar to how __stdcall differs from __cdecl on x86).

Spoiled registers

The compiler or OS ABI also usually specifies which registers are caller-saved, i.e. may be spoiled (or clobbered) by a function call. In general, any register which can be used for argument passing or return value is considered potentially spoiled because the called function could in turn call other functions. For example, on x86, EAX, ECX, and EDX are by default considered spoiled and their values after the call are considered undefined by the decompiler. If this is not the case, you can help the decompiler by using the __spoils<{reglist}> specifier. For example, if the function does not clobber any registers, you can use the following prototype:

void __spoils<> func();

If a custom memcpy implementation uses esi and edi without saving and restoring them, you can add them to the spoiled list:

void* __spoils<esi, edi> memcpy(void*, void*, int);

The __spoils attribute can also be combined with __usercall:

int __usercall __spoils<> g@<esi>();

 

See also: Set function/item type and Scattered argument locations in IDA Help.