
New features in Hex-Rays Decompiler 1.6

Written by Igor Skochinsky | Oct 9, 2011

Last week we released IDA 6.2 and Hex-Rays Decompiler 1.6. Many of the new IDA features have been described in previous posts, but there have been notable additions to the decompiler as well. They let you make the decompilation cleaner and closer to the original source. However, it may not be obvious how to use some of them, so we describe them in more detail here.

1. Variable mapping

This is probably the simplest new feature and can be used without any extra preparation.

Sometimes the compiler stores the same variable in several places (e.g. a register and a stack slot). While the decompiler often manages to combine such locations, sometimes it cannot prove that they always contain the same value (especially in the presence of calls that take the address of stack variables). In such cases the user can help by performing the merge, or mapping, manually.

Consider the following very common case:

int __stdcall SciFreeFilterInstance(_FILTER_INSTANCE *pFilterInstance)
{
  _FILTER_INSTANCE *v1; // esi@1

  v1 = pFilterInstance;
  if ( pFilterInstance->Signature != 'FrtS' )
    RtlAssert(
      "(pFilterInstance)->Signature==SIGN_FILTER_INSTANCE",
      "d:\\xpsprtm\\drivers\\wdm\\dvd\\class\\codinit.c",
      0x17A2u,
      0);
  StreamClassDebugPrint(2, "Freeing filterinstance %p still open streams\n", v1);

The compiler copied an incoming argument (pFilterInstance) into a register (v1 is held in esi). To get rid of the extra name, right-click the left-hand variable and choose “Map to another variable”, or place the cursor on it and press ‘=’:

Choose the right-hand variable from the list.

Once decompilation is refreshed, both the left-hand variable (v1) and the assignment are gone. Now we have only one variable – the incoming argument.

int __stdcall SciFreeFilterInstance(_FILTER_INSTANCE *pFilterInstance)
{
  if ( pFilterInstance->Signature != 'FrtS' )
    RtlAssert(
      "(pFilterInstance)->Signature==SIGN_FILTER_INSTANCE",
      "d:\\xpsprtm\\drivers\\wdm\\dvd\\class\\codinit.c",
      0x17A2u,
      0);
  StreamClassDebugPrint(2, "Freeing filterinstance %p still open streams\n", pFilterInstance);

You can map several variables to the same name, if necessary.

Made a mistake or mapped too much? It’s simple to fix. Right-click the wrongly mapped name and choose “Unmap variables”. Then choose the variable you want to see again.
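
To illustrate why the decompiler cannot always merge such copies on its own, here is a minimal sketch in C. The functions bump, copy_diverges and copy_stays_equal are hypothetical and written only for this illustration. In the first function a call takes the address of the argument, so the copy and the argument stop agreeing; in the second they always agree, and mapping one onto the other would be safe.

```c
#include <assert.h>

/* Hypothetical callee that modifies its argument through a pointer. */
static void bump(int *value)
{
    *value += 1;
}

/* The copy and the argument hold the same value only until bump() runs. */
int copy_diverges(int arg)
{
    int copy = arg;     /* plays the role of v1 = pFilterInstance */
    bump(&arg);         /* a call that takes the address of the stack variable */
    return arg - copy;  /* nonzero: the two locations no longer agree */
}

/* Nothing can change arg here, so the copy is just a second name for it. */
int copy_stays_equal(int arg)
{
    int copy = arg;
    return arg - copy;  /* always 0: mapping copy onto arg would be safe */
}
```

Before mapping a variable manually, it is worth checking that the code resembles the second function rather than the first.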

2. Union selection

This feature, naturally, only applies to unions. That means that you need to have union types in your database and assign the types to some variables or fields.

Normally the decompiler tries to choose a union field which matches the expression best, but sometimes there are several equally valid matches, and sometimes other types in the expression are wrong. In such cases, you can override the decompiler’s decision. For example, this code is common in Windows drivers:

NTSTATUS __stdcall DispatchDeviceControl(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
  PIO_STACK_LOCATION stacklocation; // ebx@1

  stacklocation = Irp->Tail.Overlay.CurrentStackLocation;
  if ( *&stacklocation->Parameters.Create.FileAttributes == 0x224010 )
  {
    v8 = stacklocation->Parameters.Create.Options == 20;
    if ( !v8 )
      goto LABEL_18;
    if ( stacklocation->Parameters.Create.SecurityContext < 1 )
      goto LABEL_87;
    v23 = Irp->AssociatedIrp.MasterIrp;

Since we know we’re in a DeviceControl handler, it’s likely the code is inspecting the Parameters.DeviceIoControl substructure and not Parameters.Create.

Right-click the field and choose “Select union field”, or place the cursor on it and press Alt-Y.

Choose the Parameters.DeviceIoControl.IoControlCode field.

Other references to Parameters.Create can be fixed the same way. The updated decompilation makes more sense:

NTSTATUS __stdcall DispatchDeviceControl(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
  PIO_STACK_LOCATION stacklocation; // ebx@1

  stacklocation = Irp->Tail.Overlay.CurrentStackLocation;
  if ( stacklocation->Parameters.DeviceIoControl.IoControlCode == 0x224010 )
  {
    v8 = stacklocation->Parameters.DeviceIoControl.InputBufferLength == 20;
    if ( !v8 )
      goto LABEL_18;
    if ( stacklocation->Parameters.DeviceIoControl.OutputBufferLength < 1 )
      goto LABEL_87;
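
The reason both decompilations are equally valid from the decompiler's point of view is that the union members occupy the same bytes. The sketch below shows this with a simplified, hypothetical model of the Parameters union for a 32-bit target. MODEL_PARAMETERS and the helper functions are assumptions made for this illustration, pointers are modeled as 32-bit integers, and only the two sub-structures discussed above are included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of the Parameters union on a 32-bit target. */
typedef union MODEL_PARAMETERS {
    struct {
        uint32_t SecurityContext;    /* a pointer on a 32-bit target */
        uint32_t Options;
        uint16_t FileAttributes;
        uint16_t ShareAccess;
        uint32_t EaLength;
    } Create;
    struct {
        uint32_t OutputBufferLength;
        uint32_t InputBufferLength;
        uint32_t IoControlCode;
        uint32_t Type3InputBuffer;   /* a pointer on a 32-bit target */
    } DeviceIoControl;
} MODEL_PARAMETERS;

/* Returns 1 when each Create field seen in the first listing starts at the
   same offset as the DeviceIoControl field that replaced it. */
int create_and_ioctl_fields_overlap(void)
{
    return offsetof(MODEL_PARAMETERS, Create.FileAttributes) ==
               offsetof(MODEL_PARAMETERS, DeviceIoControl.IoControlCode)
        && offsetof(MODEL_PARAMETERS, Create.Options) ==
               offsetof(MODEL_PARAMETERS, DeviceIoControl.InputBufferLength)
        && offsetof(MODEL_PARAMETERS, Create.SecurityContext) ==
               offsetof(MODEL_PARAMETERS, DeviceIoControl.OutputBufferLength);
}

/* Writes through one view of the union and reads through the other. */
uint32_t options_after_setting_input_length(uint32_t length)
{
    MODEL_PARAMETERS p = {0};
    p.DeviceIoControl.InputBufferLength = length;
    return p.Create.Options;         /* same bytes, different field name */
}
```

Because the bytes are shared, only context (here, the fact that the function is a DeviceControl handler) tells you which name is the meaningful one.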

3. CONTAINING_RECORD macro

This macro is commonly used in Windows drivers to get a pointer to the parent structure when we have a pointer to one of its fields.

For example, consider these two structures, used in a driver:

typedef struct _HW_STREAM_OBJECT {
  ULONG  SizeOfThisPacket;
  ULONG  StreamNumber;
  PVOID  HwStreamExtension;
  ...
} HW_STREAM_OBJECT, *PHW_STREAM_OBJECT;
struct _STREAM_OBJECT
{
  _COMMON_OBJECT ComObj;
  _FILE_OBJECT *FilterFileObject;
  _FILE_OBJECT *FileObject;
  _FILTER_INSTANCE *FilterInstance;
  _HW_STREAM_OBJECT HwStreamObject;
  ...
};

The following function accepts a pointer to _HW_STREAM_OBJECT:

void __cdecl StreamClassStreamNotification(
  int NotificationType,
  _HW_STREAM_OBJECT *StreamObject,
  _HW_STREAM_REQUEST_BLOCK *pSrb,
  _KSEVENT_ENTRY *EventEntry,
  GUID *EventSet,
  ULONG EventId);

However, it immediately converts it into a pointer to the containing _STREAM_OBJECT:

mov     eax, [ebp+StreamObject]
test    eax, eax
push    ebx
push    esi
lea     esi, [eax-_STREAM_OBJECT.HwStreamObject]

Default decompilation doesn’t look great:

  char *v6; // esi@1

  v6 = (char *)&StreamObject[-2] - 36;

There are two ways to make it nicer:

  1. Change the type of v6 to _STREAM_OBJECT*. The decompiler will detect that the expression “lines up” and convert it to use the macro.
  2. Right-click the delta (-36), select “Structure offset”, and choose _STREAM_OBJECT from the list.

In both cases you should get a nice expression:

 v6 = CONTAINING_RECORD(StreamObject, _STREAM_OBJECT, HwStreamObject);

N.B.: currently you need to refresh the decompilation (press F5) to see the changes. We plan to make this happen automatically in a future version.
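
For readers unfamiliar with the macro, the following is a minimal sketch of what CONTAINING_RECORD does, written with the standard offsetof. The MODEL_ structures and the helper functions are hypothetical stand-ins for the driver types above, and the macro definition is an equivalent formulation for illustration rather than a copy of the Windows header.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of the macro using offsetof; guarded in case a real header defines it. */
#ifndef CONTAINING_RECORD
#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))
#endif

/* Hypothetical stand-in for _HW_STREAM_OBJECT. */
typedef struct MODEL_HW_STREAM_OBJECT {
    unsigned long SizeOfThisPacket;
    unsigned long StreamNumber;
    void *HwStreamExtension;
} MODEL_HW_STREAM_OBJECT;

/* Hypothetical stand-in for _STREAM_OBJECT with the field embedded inside. */
typedef struct MODEL_STREAM_OBJECT {
    int ComObj;                 /* placeholder for the leading fields */
    void *FilterInstance;
    MODEL_HW_STREAM_OBJECT HwStreamObject;
} MODEL_STREAM_OBJECT;

/* Recovers the parent from a pointer to its embedded HwStreamObject field. */
MODEL_STREAM_OBJECT *parent_of(MODEL_HW_STREAM_OBJECT *hw)
{
    return CONTAINING_RECORD(hw, MODEL_STREAM_OBJECT, HwStreamObject);
}

/* Returns 1 when the recovered pointer is the address of the parent. */
int containing_record_demo(void)
{
    MODEL_STREAM_OBJECT parent = {0};
    return parent_of(&parent.HwStreamObject) == &parent;
}
```

The subtraction of the field offset is the same operation the lea instruction performs in the disassembly above.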

4. Kernel and user-mode macros involving fs segment access

On Windows, the fs segment is used to store various thread-specific (in user mode) or processor-specific (in kernel mode) data. Hex-Rays Decompiler 1.6 detects the most common ways of accessing this data and converts them to the corresponding macros. However, this functionality requires specific types to be present in the database: the _TEB structure for user mode, and the KPCR structure for kernel mode.

For example, consider the following code:

mov     eax, large fs:18h
mov     eax, [eax+30h]
push    24h
push    8
push    dword ptr [eax+18h]
call    ds:__imp__RtlAllocateHeap@12 ; RtlAllocateHeap(x,x,x)
mov     esi, eax

If you don’t have the _TEB structure in types, this will be decompiled to:

  v5 = RtlAllocateHeap(*(_DWORD *)(*(_DWORD *)(__readfsdword(24) + 48) + 24), 8, 36);

However, if you do add the type, it will look much nicer:

  v5 = RtlAllocateHeap(NtCurrentTeb()->ProcessEnvironmentBlock->ProcessHeap, 8, 36);
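
To make the correspondence between the raw offsets (24, 48, 24) and the field names explicit, here is a small simulation in C. It is only a sketch: MODEL_TEB32 and MODEL_PEB32 are hypothetical, padded models of the 32-bit layouts that name just the fields used above, and the flat byte array, the addresses and the heap handle value are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit layout models; only the fields used above are named. */
typedef struct MODEL_TEB32 {
    uint8_t  Reserved0[0x18];
    uint32_t Self;                      /* NtTib.Self, offset 0x18 */
    uint8_t  Reserved1[0x30 - 0x1C];
    uint32_t ProcessEnvironmentBlock;   /* offset 0x30 */
} MODEL_TEB32;

typedef struct MODEL_PEB32 {
    uint8_t  Reserved0[0x18];
    uint32_t ProcessHeap;               /* offset 0x18 */
} MODEL_PEB32;

/* Made-up addresses inside the simulated memory, and a made-up heap handle. */
enum { TEB_ADDR = 0x40, PEB_ADDR = 0x100, HEAP_HANDLE = 0x00150000 };

static uint32_t load32(const uint8_t *mem, uint32_t addr)
{
    uint32_t value;
    memcpy(&value, mem + addr, sizeof value);
    return value;
}

static void store32(uint8_t *mem, uint32_t addr, uint32_t value)
{
    memcpy(mem + addr, &value, sizeof value);
}

/* Mirrors the raw decompilation: *(*(__readfsdword(24) + 48) + 24). */
static uint32_t heap_via_raw_offsets(const uint8_t *mem, uint32_t fs_base)
{
    return load32(mem, load32(mem, load32(mem, fs_base + 24) + 48) + 24);
}

/* Mirrors NtCurrentTeb()->ProcessEnvironmentBlock->ProcessHeap. */
static uint32_t heap_via_field_names(const uint8_t *mem, uint32_t fs_base)
{
    uint32_t teb = load32(mem, fs_base + (uint32_t)offsetof(MODEL_TEB32, Self));
    uint32_t peb = load32(mem, teb + (uint32_t)offsetof(MODEL_TEB32, ProcessEnvironmentBlock));
    return load32(mem, peb + (uint32_t)offsetof(MODEL_PEB32, ProcessHeap));
}

/* Returns 1 when both walks reach the same (made-up) heap handle. */
int fs_walk_demo(void)
{
    uint8_t mem[0x200] = {0};
    store32(mem, TEB_ADDR + 0x18, TEB_ADDR);     /* TEB.Self points at the TEB */
    store32(mem, TEB_ADDR + 0x30, PEB_ADDR);     /* TEB.ProcessEnvironmentBlock */
    store32(mem, PEB_ADDR + 0x18, HEAP_HANDLE);  /* PEB.ProcessHeap */
    return heap_via_raw_offsets(mem, TEB_ADDR) == HEAP_HANDLE
        && heap_via_field_names(mem, TEB_ADDR) == HEAP_HANDLE;
}
```

The decompiler needs the real _TEB type for the same reason the second function needs the model structures: without the layout, the offsets cannot be turned into names.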

Currently we support the following macros:

Macro                          Required types
NtCurrentTeb                   _TEB
KeGetPcr                       KPCR
KeGetCurrentPrcb               KPCR, KPRCB
KeGetCurrentProcessorNumber    KPCR
KeGetCurrentThread             KPCR, _KTHREAD

Hint: the easiest way to get _TEB or KPCR types into your database is using the PDB plugin. Invoke it from File|Load file|PDB file…, enter a path to kernel32.dll (for user-mode code) or ntoskrnl.exe (for kernel-mode code), and check the “Types only” checkbox.

PDBs for those two files usually contain the necessary OS structures.

We hope you will like these new additions. Note that version 1.6 includes even more improvements and fixes; see the full list of new features and the comparison page.