This is a guest entry written by Dennis Elser from Trenchant Advanced Research Center (formerly Azimuth Security). His views and opinions are his own and not those of Hex-Rays. Any technical or maintenance issues regarding the code herein should be directed to the author.
HRDevHelper is a decompiler plugin that takes advantage of the Interactive Disassembler’s built-in graph-rendering engine to visualize the Abstract Syntax Tree (i.e., the ctree) of pseudocode functions generated by the Hex-Rays decompiler. The plugin also comes with a context viewer that displays structural information on the current decompiled function.
HRDevHelper has been developed for the purpose of:
Besides the decompiler’s context menu, the HRDevHelper viewers can be opened with keyboard shortcuts. The plugin provides ctree viewers and a context viewer, whose modes of operation are explained in this guest post.
HRDevHelper’s viewers can be invoked from a pop-up menu or by using keyboard shortcuts
In its default configuration, pressing the key combination Ctrl–. opens a ctree graph view and attaches it to the right-hand side of the current pseudocode window. The pseudocode view and the graph are then contextually linked to each other and allow for interactive exploration of the ctree using the mouse or the keyboard. Placing the cursor on any pseudocode element visually highlights the corresponding ctree nodes and centers the graph on them. If a static view is desired instead, auto-centering can be disabled by pressing the C key while a graph widget has focus.
HRDevHelper establishes context between decompiled code and the graph by highlighting nodes
Likewise, the Ctrl–Shift–. key combination creates an isolated graph from a pseudocode fragment for a more focused overview or later reference. It could be thought of as “cutting off” a branch of a ctree, as can be seen in the image below.
Parts of the graphs can be isolated from the entire ctree for better overview and later reference
Finally, HRDevHelper has a context view that can be opened by pressing the C key from within a decompiler widget. It displays further decompiler internals in a textual representation. The viewer’s .op field, for example, reflects the current ctree item’s code, which is the Hex-Rays representation of the underlying decompiled instruction or set of instructions. It can be either a statement code (cit_...) or an expression code (cot_...).
One example of such a statement code is cit_if, which stands for a C-like if statement, whereas cot_call is an example of an expression code that indicates a function call. A comprehensive list of ctree item codes can be found online as part of the Hex-Rays decompiler SDK reference, or in the header file hexrays.hpp of the Hex-Rays decompiler SDK that comes with a local installation of the decompiler.
The .op field shown by the context viewer refers to the item code pointed to by the screen cursor
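Because item codes are integer values, a reverse lookup table can be handy when printing them. The sketch below assumes that the bindings expose the codes as module-level integer attributes whose names start with cit_ or cot_. build_op_names is a hypothetical helper, not part of HRDevHelper or the SDK, and the stand-in namespace with made-up values only demonstrates the behavior outside of IDA.

```python
from types import SimpleNamespace


def build_op_names(namespace):
    """Return {code: name} for all integer attributes named cit_* or cot_*."""
    names = {}
    for name in sorted(dir(namespace)):
        if not name.startswith(("cit_", "cot_")):
            continue
        value = getattr(namespace, name)
        if isinstance(value, int):
            # if two names alias the same code, keep the alphabetically first
            names.setdefault(value, name)
    return names


# Stand-in for the real module; names are real item codes, values are made up.
fake_hexrays = SimpleNamespace(cot_call=1, cit_if=2, unrelated=3)
op_names = build_op_names(fake_hexrays)
```

Inside IDA, the intended call would be build_op_names(ida_hexrays), after which op_names[item.op] yields a readable name for a ctree item.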
Ctree statement and expression codes are one of the Hex-Rays decompiler’s abstraction layers of machine code that can be accessed programmatically via C++ using the Hex-Rays SDK or via Python by using IDAPython bindings.
The HRDevHelper context viewer takes this concept further by linking individual ctree items using logical AND operations and by creating Python expressions from them.
As shown in the screenshot below, the result is a programmatic description of a pseudo-C code pattern that is not only independent from a processor architecture but also decoupled from non-structural detail.
The generated expression structurally describes the memcpy call that can be seen on the left-hand side of the screenshot. It describes a call (cot_call) to an object (cot_obj, which has a field that contains the target function’s address); the call’s first and second arguments are stack or register variables (cot_var), and its third argument is a number (cot_num).
The HRDevHelper context viewer generates and displays programmatic descriptions of code patterns on-the-fly
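The structural description given above can be written down as a plain Python predicate. The following is a minimal sketch and not the plugin’s literal output: the function name matches_pattern is made up for illustration, and outside of IDA the item codes are replaced by string stand-ins so that the predicate can be tried on mock objects.

```python
from types import SimpleNamespace

try:
    # inside IDA: use the real item codes
    import ida_hexrays as hr
except ImportError:
    # outside IDA: hypothetical stand-ins, NOT the real numeric values
    hr = SimpleNamespace(cot_call="cot_call", cot_obj="cot_obj",
                         cot_var="cot_var", cot_num="cot_num")


def matches_pattern(i):
    """True if item 'i' is a call to an object with arguments (var, var, num)."""
    # 'and' short-circuits, so .x and .a are only touched on call items
    return (i.op == hr.cot_call and
            i.x.op == hr.cot_obj and
            len(i.a) == 3 and
            i.a[0].op == hr.cot_var and
            i.a[1].op == hr.cot_var and
            i.a[2].op == hr.cot_num)
```

Each sub-condition constrains one node of the ctree branch, and the logical AND ties them together into a single pattern. Inside IDA, such a predicate would typically be called from a tree visitor’s visit_expr method.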
One practical use of these expressions is to combine them with the Hex-Rays tree visitor class, which makes it possible to find code patterns in binaries independently of their respective CPU architecture.
Example use cases comprise:
Assuming we were interested in finding calls to the memcpy function but wanted to exclude those calls whose third argument is an explicit fixed number, the IDAPython script below would help us do just that. It is based on the expression shown in the screenshot above, and has been wrapped in a Hex-Rays tree visitor class.
```python
import ida_name, ida_hexrays, ida_funcs


class memcpy_finder_t(ida_hexrays.ctree_visitor_t):
    def __init__(self):
        ida_hexrays.ctree_visitor_t.__init__(self, ida_hexrays.CV_FAST)

    def _process(self, i):
        # find function calls but with the following
        # restrictions in place:
        # - the called function's symbolic name must contain 'memcpy'
        # - the call's number of arguments must be three
        # - the call's 3rd argument must not be an explicit number
        # (item codes are integers, so they are compared with == / !=
        # rather than by identity)
        found = (i.op == ida_hexrays.cot_call and
                 i.x.op == ida_hexrays.cot_obj and
                 "memcpy" in ida_name.get_name(i.x.obj_ea) and
                 len(i.a) == 3 and
                 i.a[2].op != ida_hexrays.cot_num)
        # once found, print the address of the current item
        # (whose item code happens to be cot_call)
        if found:
            print("memcpy() found at %x" % i.ea)
        return 0

    # process expressions
    def visit_expr(self, e):
        return self._process(e)

    # for the sake of completeness, also process statements
    def visit_insn(self, i):
        return self._process(i)


# process all of the IDA database's functions
for i in range(ida_funcs.get_func_qty()):
    # get 'func_t' structure
    f = ida_funcs.getn_func(i)
    if f:
        try:
            # get cfunc_t structure
            cfunc = ida_hexrays.decompile(f)
        except ida_hexrays.DecompilationFailure:
            # skip functions that cannot be decompiled
            continue
        if cfunc:
            # run the visitor class
            mf = memcpy_finder_t()
            mf.apply_to(cfunc.body, None)
```
Naturally, more complex search expressions are possible. Reading both the Hex-Rays Decompiler Primer by Elias Bachaalany and the Hex-Rays SDK will surely help in automating and improving the efficiency of search endeavors!
Alternatively, the above script may be optimized for speed by resolving and decompiling only the memcpy function’s callers, as opposed to decompiling every function of the target binary. This comes with a minor disadvantage, however: the script would miss those memcpy calls that have been inlined by the compiler (e.g., using the rep movsb instruction on x86) but which are successfully identified as memcpy by the decompiler.
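The caller-based variant could look roughly like the sketch below. It assumes that the target is reachable under the exact name memcpy (real binaries may use decorated names or import thunks instead) and that a visitor class such as the one from the script above is passed in as visitor_factory. caller_functions and scan_callers are hypothetical helper names. The IDA modules are imported lazily so that the pure helper can also be exercised outside of IDA.

```python
def caller_functions(xref_addrs, func_start_of):
    """Map call-site addresses to the unique start addresses of their functions."""
    starts, seen = [], set()
    for ea in xref_addrs:
        start = func_start_of(ea)
        # skip call sites outside of any function and already seen callers
        if start is not None and start not in seen:
            seen.add(start)
            starts.append(start)
    return starts


def scan_callers(visitor_factory, name="memcpy"):
    """Decompile only the callers of 'name' and run a visitor on each of them."""
    import ida_funcs, ida_hexrays, ida_idaapi, ida_name, idautils

    target = ida_name.get_name_ea(ida_idaapi.BADADDR, name)
    if target == ida_idaapi.BADADDR:
        return  # assumption: no symbol with that exact name, nothing to do

    def func_start_of(ea):
        f = ida_funcs.get_func(ea)
        return f.start_ea if f else None

    # code cross-references to the target, excluding ordinary flow
    xrefs = idautils.CodeRefsTo(target, False)
    for start in caller_functions(xrefs, func_start_of):
        try:
            cfunc = ida_hexrays.decompile(start)
        except ida_hexrays.DecompilationFailure:
            continue  # skip callers that cannot be decompiled
        visitor_factory().apply_to(cfunc.body, None)
```

With the visitor class from the script above, the call would be scan_callers(memcpy_finder_t). Deduplicating by function start matters because a single function that calls memcpy several times would otherwise be decompiled once per call site.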
HRDevHelper is free and open-source software written in IDAPython. Being a plugin for the Hex-Rays decompiler, it requires valid IDA and Hex-Rays decompiler licenses, along with a Python 3 interpreter that comes with the IDA installer. Other than that, the plugin has no further dependencies and can freely be downloaded or cloned from its official GitHub repository.
As both its code and installation instructions may be subject to change, it is advised to follow the installation instructions found in the HRDevHelper repository.
After running the plugin for the first time, a configuration file named HRDevHelper.cfg is created in the cfg sub-folder of the IDA user directory. The file can be opened and edited with a text editor in order to customize the plugin’s default settings: keyboard shortcuts, color palettes, and the ctree graph’s default docking behavior, to name a few examples.
The author would like to thank Hex-Rays / the IDAPython project for having provided the vds5 sample application on which HRDevHelper is based. The author would also like to thank past and future contributors to the HRDevHelper plugin. The IDA color scheme used in the screenshots of this blog post is based on long_night.