In the previous post we talked about the basic usage of selection in IDA. This week we’ll describe a few more examples of actions affected by selection.
When disassembling a raw binary, IDA is not always able to detect code fragments, and you may have to resort to trial and error to find the code in the loaded range, which can be time-consuming. In such a situation, a simple approach may work for initial reconnaissance: select the range you want to examine and ask IDA to convert it to code.
IDA will go through the selected range and try to convert any undefined bytes to instructions. If there is indeed valid code in the selected area, you might see functions being added to the Functions window (probably including some false positives).
Another useful application of selection is applying structure offsets to multiple instructions. For example, let’s consider this function from a UEFI module:
.text:0000000000001A64 sub_1A64 proc near ; CODE XREF: sub_15A4+EB↑p
.text:0000000000001A64 ; sub_15A4+10E↑p
.text:0000000000001A64
.text:0000000000001A64 var_28 = qword ptr -28h
.text:0000000000001A64 var_18 = qword ptr -18h
.text:0000000000001A64 arg_20 = qword ptr 28h
.text:0000000000001A64
.text:0000000000001A64 push rbx
.text:0000000000001A66 sub rsp, 40h
.text:0000000000001A6A lea rax, [rsp+48h+var_18]
.text:0000000000001A6F xor r9d, r9d
.text:0000000000001A72 mov rbx, rcx
.text:0000000000001A75 mov [rsp+48h+var_28], rax
.text:0000000000001A7A mov rax, cs:gBS
.text:0000000000001A81 lea edx, [r9+8]
.text:0000000000001A85 mov ecx, 200h
.text:0000000000001A8A call qword ptr [rax+50h]
.text:0000000000001A8D mov rax, cs:gBS
.text:0000000000001A94 mov r8, [rsp+48h+arg_20]
.text:0000000000001A99 mov rdx, [rsp+48h+var_18]
.text:0000000000001A9E mov rcx, rbx
.text:0000000000001AA1 call qword ptr [rax+0A8h]
.text:0000000000001AA7 mov rax, cs:gBS
.text:0000000000001AAE mov rcx, [rsp+48h+var_18]
.text:0000000000001AB3 call qword ptr [rax+68h]
.text:0000000000001AB6 mov rax, [rsp+48h+var_18]
.text:0000000000001ABB add rsp, 40h
.text:0000000000001ABF pop rbx
.text:0000000000001AC0 retn
.text:0000000000001AC0 sub_1A64 endp
If we know that gBS is a pointer to EFI_BOOT_SERVICES, we can convert accesses to it (in the call instructions) to structure offsets. This can be done manually for each access, but that is tedious. In such a situation, selection can help: if we select the instructions accessing the structure and press T (structure offset), a new dialog pops up:
You can select which register is used as the base, which structure to apply and even select which specific instructions you want to convert.
After selecting rax and EFI_BOOT_SERVICES, we get a nice-looking listing:
.text:0000000000001A64 sub_1A64 proc near ; CODE XREF: sub_15A4+EB↑p
.text:0000000000001A64 ; sub_15A4+10E↑p
.text:0000000000001A64
.text:0000000000001A64 Event = qword ptr -28h
.text:0000000000001A64 var_18 = qword ptr -18h
.text:0000000000001A64 Registration = qword ptr 28h
.text:0000000000001A64
.text:0000000000001A64 push rbx
.text:0000000000001A66 sub rsp, 40h
.text:0000000000001A6A lea rax, [rsp+48h+var_18]
.text:0000000000001A6F xor r9d, r9d ; NotifyContext
.text:0000000000001A72 mov rbx, rcx
.text:0000000000001A75 mov [rsp+48h+Event], rax ; Event
.text:0000000000001A7A mov rax, cs:gBS
.text:0000000000001A81 lea edx, [r9+8] ; NotifyTpl
.text:0000000000001A85 mov ecx, 200h ; Type
.text:0000000000001A8A call [rax+EFI_BOOT_SERVICES.CreateEvent]
.text:0000000000001A8D mov rax, cs:gBS
.text:0000000000001A94 mov r8, [rsp+48h+Registration] ; Registration
.text:0000000000001A99 mov rdx, [rsp+48h+var_18] ; Event
.text:0000000000001A9E mov rcx, rbx ; Protocol
.text:0000000000001AA1 call [rax+EFI_BOOT_SERVICES.RegisterProtocolNotify]
.text:0000000000001AA7 mov rax, cs:gBS
.text:0000000000001AAE mov rcx, [rsp+48h+var_18] ; Event
.text:0000000000001AB3 call [rax+EFI_BOOT_SERVICES.SignalEvent]
.text:0000000000001AB6 mov rax, [rsp+48h+var_18]
.text:0000000000001ABB add rsp, 40h
.text:0000000000001ABF pop rbx
.text:0000000000001AC0 retn
.text:0000000000001AC0 sub_1A64 endp
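To see where the member names in the listing come from, here is a minimal Python sketch (not an IDA script) that derives member offsets from declaration order. It assumes an x64 module, so pointers are 8 bytes, and a 24-byte EFI_TABLE_HEADER at the start of the structure; the member list follows the UEFI specification and is cut off after the last member needed here. The helper names `member_offsets` and `describe_call` are made up for this illustration.

```python
# Sketch: map [rax+disp] displacements to EFI_BOOT_SERVICES member names.
# Assumptions: x64 target (8-byte pointers), 24-byte EFI_TABLE_HEADER.
HEADER_SIZE = 24   # UINT64 Signature + 4 x UINT32
POINTER_SIZE = 8

# Members in declaration order, truncated after RegisterProtocolNotify.
MEMBERS = [
    "RaiseTPL", "RestoreTPL",
    "AllocatePages", "FreePages", "GetMemoryMap", "AllocatePool", "FreePool",
    "CreateEvent", "SetTimer", "WaitForEvent", "SignalEvent", "CloseEvent",
    "CheckEvent",
    "InstallProtocolInterface", "ReinstallProtocolInterface",
    "UninstallProtocolInterface", "HandleProtocol", "Reserved",
    "RegisterProtocolNotify",
]


def member_offsets(members=MEMBERS, header_size=HEADER_SIZE,
                   pointer_size=POINTER_SIZE):
    """Return {offset: member name}; every member is one pointer wide."""
    return {header_size + i * pointer_size: name
            for i, name in enumerate(members)}


OFFSETS = member_offsets()


def describe_call(displacement):
    """Render a call operand, using a member name when the offset is known."""
    name = OFFSETS.get(displacement)
    if name is None:
        return "[rax+%Xh]" % displacement
    return "[rax+EFI_BOOT_SERVICES.%s]" % name


if __name__ == "__main__":
    # The three displacements used by the call instructions in sub_1A64.
    for disp in (0x50, 0xA8, 0x68):
        print(describe_call(disp))
```

This is the same lookup IDA performs for every selected instruction once you pick the base register and the structure in the dialog.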
When some code references a string, IDA is usually smart enough to detect it and convert the referenced bytes to a literal item. However, in some cases the automatic conversion does not work, for example when the string starts with a non-ASCII character. A common example is the Linux kernel, which uses a special byte sequence to mark different categories of kernel messages. For example, consider this function from the joydev.ko module:
IDA did not automatically create a string at 1BC8 because it starts with a non-ASCII character. However, if we select the string’s bytes and press A (Convert to string), a string is created anyway:
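The following Python sketch illustrates why such a string can be missed by a simple heuristic. It assumes the marker is the scheme from the kernel's include/linux/kern_levels.h, where a message starts with the byte 0x01 (KERN_SOH) followed by a level digit from '0' to '7'. The sample bytes and the `starts_printable` check are hypothetical; they are neither the actual bytes at 1BC8 nor IDA's real detection logic.

```python
# Sketch: split a printk-style format string into log level and text.
# Assumption: prefix is KERN_SOH (0x01) followed by a level digit.
KERN_SOH = 0x01
LEVELS = {
    "0": "KERN_EMERG", "1": "KERN_ALERT", "2": "KERN_CRIT", "3": "KERN_ERR",
    "4": "KERN_WARNING", "5": "KERN_NOTICE", "6": "KERN_INFO",
    "7": "KERN_DEBUG",
}


def starts_printable(data):
    """Naive heuristic: does the first byte look like printable text?"""
    return len(data) > 0 and 0x20 <= data[0] <= 0x7E


def split_kernel_message(data):
    """Return (level name or None, text) for a NUL-terminated message."""
    end = data.find(b"\x00")
    if end != -1:
        data = data[:end]
    level = None
    if len(data) >= 2 and data[0] == KERN_SOH:
        level = LEVELS.get(chr(data[1]))
        if level is not None:
            data = data[2:]  # drop the two-byte prefix
    return level, data.decode("ascii", errors="replace")


# Hypothetical sample bytes, invented for this illustration.
SAMPLE = b"\x016hypothetical: device attached\n\x00"

if __name__ == "__main__":
    print(starts_printable(SAMPLE))       # the heuristic rejects the string
    print(split_kernel_message(SAMPLE))
```

A check that only looks at the first byte rejects the sample, while the bytes after the two-byte prefix are ordinary text, which is why forcing the conversion on a selection works.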
Creating a structure from selected data is useful when dealing with structured data in binaries. Let’s consider a table with approximately this layout of entries:
struct copyentry {
    void *source;
    void *dest;
    int size;
    void *copyfunc;
};
While such a structure can always be created manually in the Structures window, it is often easier to format the data first and then create a structure that describes it. After creating the four data items, select them and choose “Create struct from selection” from the context menu:
IDA will create a structure representing the selected data items, which can then be used to format other entries in the program or applied in the disassembly to better understand the code working with this data.
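As a rough illustration of what formatting other entries with the structure amounts to, the Python sketch below decodes a table of copyentry records from raw bytes. It assumes a 32-bit little-endian target, so each of the four fields is 4 bytes wide; the source does not state the pointer size, and the sample values are invented.

```python
import struct

# Assumption: 32-bit little-endian target, so source, dest and copyfunc are
# 4-byte pointers and size is a 4-byte signed int (16 bytes per entry).
COPYENTRY = struct.Struct("<IIiI")
FIELDS = ("source", "dest", "size", "copyfunc")


def parse_table(data, count):
    """Decode `count` consecutive copyentry records from `data`."""
    entries = []
    for index in range(count):
        values = COPYENTRY.unpack_from(data, index * COPYENTRY.size)
        entries.append(dict(zip(FIELDS, values)))
    return entries


# Hypothetical two-entry table, invented for this illustration.
SAMPLE = (
    COPYENTRY.pack(0x08001000, 0x20000000, 0x40, 0x08000100)
    + COPYENTRY.pack(0x08001040, 0x20000040, 0x10, 0x08000100)
)

if __name__ == "__main__":
    for entry in parse_table(SAMPLE, 2):
        print(", ".join("%s=0x%X" % (name, entry[name]) for name in FIELDS))
```

Once one entry has been described, the remaining entries are the same layout repeated at a fixed stride, which is why a single structure can format the whole table.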