A binary analysis tool like a decompiler is incomplete without a programming interface.
Sure, decompilers tremendously facilitate binary analysis. You can concentrate
on the program logic expressed in a familiar way. Just add comments and rename variables
and functions to get something very close to the original source code. However, quite often there
is a small ugly detail and the output falls short of being satisfactory.
It can be because of an awkward expression
which could be represented more concisely:
It can also be an inline function
    *(_BYTE *)v17++ = 0;
which could be collapsed:
It can be a while-loop
    v4 = wcstok(&Str, L".");
    if ( v4 )
    {
      do
      {
        v9 = (unsigned __int16)j___wtol(v4) << v7;
        v6 |= v9;
        v5 |= *((_DWORD *)&v9 + 1);
        v4 = wcstok(NULL, L".");
        v7 -= 16;
      }
      while ( v7 >= 0 && v4 );
    }
which could be converted into a for-loop:
          shift >= 0 && ptr;
          ptr=wcstok(NULL, L"."), shift-=16 )
      v6 |= (ushort)wtol(ptr) << shift;
      v5 |= codepage;
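To make the loop conversion concrete, here is a minimal, self-contained sketch rather than the decompiler's actual output. It packs a dotted version string into a 64-bit value twice: once with an if-guarded do-while loop and once with a for-loop. The names `split_on_dots`, `pack_do_while` and `pack_for` are hypothetical, the tokenizer merely stands in for `wcstok`, and the starting shift of 48 is an assumption, since the listing does not show the initial value.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Split "1.2.3.4" on dots, skipping empty tokens.
// Stands in for the wcstok() calls of the listing (assumption).
static std::vector<std::string> split_on_dots(const std::string &s)
{
  std::vector<std::string> parts;
  std::istringstream in(s);
  std::string tok;
  while ( std::getline(in, tok, '.') )
    if ( !tok.empty() )
      parts.push_back(tok);
  return parts;
}

// Shape similar to the decompiler output: an if-guarded do-while loop.
uint64_t pack_do_while(const std::string &s)
{
  std::vector<std::string> parts = split_on_dots(s);
  uint64_t packed = 0;
  size_t i = 0;
  int shift = 48;                       // assumed starting shift
  if ( i < parts.size() )
  {
    do
    {
      packed |= (uint64_t)(uint16_t)std::stol(parts[i]) << shift;
      ++i;
      shift -= 16;
    }
    while ( shift >= 0 && i < parts.size() );
  }
  return packed;
}

// The same computation written as a for-loop.
uint64_t pack_for(const std::string &s)
{
  std::vector<std::string> parts = split_on_dots(s);
  uint64_t packed = 0;
  size_t i = 0;
  for ( int shift = 48; shift >= 0 && i < parts.size(); ++i, shift -= 16 )
    packed |= (uint64_t)(uint16_t)std::stol(parts[i]) << shift;
  return packed;
}
```

Both functions return `0x0001000200030004` for the input `"1.2.3.4"`. They agree only because the starting shift is known to be non-negative: the do-while form runs its body once before testing the shift, while the for-loop tests it first. With a negative starting shift the two forms would behave differently.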
All these transformations improve readability, but the decompiler cannot perform them
automatically: they change the meaning of the program. Only the user, who knows
when these transformations can be safely applied, should activate them.
We could add an extensive set of manual
transformation commands to the decompiler (we might do it one day), but there are really too many of them.
Besides, some transformations can be applied only in particular circumstances, specific to a particular
version of a compiler used with particular command line options.
In short, there is no way we can predict all possible transformations and implement them.
The Hex-Rays SDK allows you to manipulate the decompilation result as you want.
You can work with the output data structure (called ctree): modify it, rename variables, and change their types.
Watch such a plugin in action:
This plugin introduces a new command to swap if branches. I personally prefer to have
the shorter if branch first: shorter means simpler.
Solving the simplest problems first is a good approach in programming: it frees
one’s mind for complex problems and makes the unsolved part of the problem shorter (thus hopefully simpler 😉).
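The idea behind such a command can be sketched on a toy data structure. The code below does not use the Hex-Rays SDK: `if_stmt_t`, `swap_branches`, `put_shorter_branch_first` and `to_text` are hypothetical names, and the structure is a much simplified stand-in for a ctree if-statement, with the condition and statements kept as plain text. Swapping the branches preserves the meaning only if the condition is negated at the same time.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a ctree 'if' statement.
struct if_stmt_t
{
  std::string cond;                       // condition, kept as text
  bool negated = false;                   // true if the condition is inverted
  std::vector<std::string> then_branch;   // statements, one per entry
  std::vector<std::string> else_branch;
};

// Swap the two branches and invert the condition so the meaning is kept.
void swap_branches(if_stmt_t &s)
{
  s.negated = !s.negated;
  std::swap(s.then_branch, s.else_branch);
}

// Put the shorter branch first. Returns true if a swap was performed.
// An 'if' without an else branch is left alone.
bool put_shorter_branch_first(if_stmt_t &s)
{
  if ( s.else_branch.empty() || s.then_branch.size() <= s.else_branch.size() )
    return false;
  swap_branches(s);
  return true;
}

// Render the statement as pseudocode text.
std::string to_text(const if_stmt_t &s)
{
  std::string cond = s.cond;
  if ( s.negated )
    cond = "!(" + s.cond + ")";
  std::string out = "if ( " + cond + " )\n{\n";
  for ( const std::string &line : s.then_branch )
    out += "  " + line + "\n";
  out += "}\n";
  if ( !s.else_branch.empty() )
  {
    out += "else\n{\n";
    for ( const std::string &line : s.else_branch )
      out += "  " + line + "\n";
    out += "}\n";
  }
  return out;
}
```

Keeping the negation as a flag instead of rewriting the condition text means that swapping twice restores the original statement exactly. A real plugin would operate on the ctree items provided by the SDK instead of on strings.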
Other things you can do with the current SDK:
- Decompile any function
- Modify the pseudocode
- Change local variable names and types
- Introduce your own interactive commands
- Install callbacks to react to decompiler events
The above functionality is enough to implement the Inliner, Exporter, Transformer, and Vizier (partially)
plugins mentioned here.
In the future we will add support for other plugin types. The decompiler will handle other target processors,
and data flow analysis functions will be exported. This will allow you to write more
complex analysis and transformation rules.
What about writing your own vulnerability scanner based on Hex-Rays? 😉
It is quite difficult today but will be within reach very soon.