Igor’s tip of the week #53: Manual switch idioms

IDA supports most of the switch patterns produced by major compilers out of the box, and usually you don’t need to worry about them. Occasionally, however, you may encounter code produced by an unusual or very recent compiler version, or some peculiarity of the code may prevent IDA from recognizing the pattern. In such cases you may need to help IDA by telling it about the switch, so that a proper function graph can be presented and the decompiler can produce nice pseudocode.

Switch pattern components

The common switch pattern is assumed to have the following components:

  1. indirect jump
    The instruction which actually performs the jump to the destination block handling the switch case; it usually involves a register holding the address value.
  2. jump table
    A table of values containing either direct addresses of the destination blocks or other values from which those addresses can be calculated (e.g. offsets from some base address). It has to be of a specific fixed size (number of elements), and the values may be scaled with a shift value. Some switches use two tables, the first containing indexes into the second one, which holds the addresses.
  3. input register
    The register containing the initial value which is used to determine the destination block. Most commonly it is used to index the jump table.

Switch formula

The standard switches are assumed to use the following calculation for the destination address:

target = base +/- (table_element << shift)

base and shift can be set to zero if not used.
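
The formula is easy to check by hand. As a minimal sketch in plain Python (the function name `switch_target` and its parameters are made up for illustration and are not part of any IDA API), it could be written as:

```python
def switch_target(base, table_element, shift=0, subtract=False):
    """Apply target = base +/- (table_element << shift)."""
    delta = table_element << shift
    return base - delta if subtract else base + delta


# A table that stores plain addresses needs neither base nor shift.
assert switch_target(0, 0x401000) == 0x401000

# An offset table: element 3, scaled by 4 (shift of 2), relative to base 0x1000.
assert switch_target(0x1000, 3, shift=2) == 0x100C
```

With base and shift both left at zero the table element is used as the target address directly, which is the "direct addresses" case from the component list.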

Example

Here’s a snippet from an ARM64 firmware.

The indirect jump is highlighted with the red rectangle. Here’s the same code in text format:

__text:FFFFFF8000039F88 STP X22, X21, [SP,#-0x10+var_20]!
__text:FFFFFF8000039F8C STP X20, X19, [SP,#0x20+var_10]
__text:FFFFFF8000039F90 STP X29, X30, [SP,#0x20+var_s0]
__text:FFFFFF8000039F94 ADD X29, SP, #0x20
__text:FFFFFF8000039F98 MOV X19, X0
__text:FFFFFF8000039F9C BL sub_FFFFFF80000415E4
__text:FFFFFF8000039FA0 B.HI loc_FFFFFF800003A01C
__text:FFFFFF8000039FA4 MOV W20, #0
__text:FFFFFF8000039FA8 ADR X9, byte_FFFFFF8000048593
__text:FFFFFF8000039FAC NOP
__text:FFFFFF8000039FB0 ADR X10, loc_FFFFFF8000039FC0
__text:FFFFFF8000039FB4 LDRB W11, [X9,X8]
__text:FFFFFF8000039FB8 ADD X10, X10, X11,LSL#2
__text:FFFFFF8000039FBC BR X10

We can see that the register used in the indirect branch (X10) is the result of a calculation, so this is probably a switch pattern. However, because the code was compiled with size optimization (the range check is moved into a separate function used from several places), IDA was not able to match the pattern automatically. Let’s see if we can identify the components of the standard switch described above.

The formula matches the instruction ADD X10, X10, X11,LSL#2 (in C syntax: X10 = X10+(X11<<2)). We can see that the table element (X11) is shifted left by 2 before being added to the base (X10). The value of X11 comes from the preceding load of W11 using LDRB (load byte) from the table at X9 with index X8. Thus:

  1. indirect jump: yes, the BR X10 instruction at FFFFFF8000039FBC.
  2. jump table: yes, at byte_FFFFFF8000048593. Additionally, we have a base at loc_FFFFFF8000039FC0 and a shift value of 2. The table contains eight elements (this can be checked visually or deduced from the range check, which uses 7 as the maximum allowed value).
  3. input register: yes, X8 is used to index the table (we can also use W8, which is the 32-bit part of X8 and is used by the range check function).
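
To sanity-check these components, the address calculation can be replayed outside IDA. The sketch below is plain Python rather than IDAPython. The base, shift and element count come from the listing above, but the table bytes are placeholder values invented for illustration, since the actual contents of byte_FFFFFF8000048593 are not shown here.

```python
BASE = 0xFFFFFF8000039FC0   # loc_FFFFFF8000039FC0, loaded by ADR X10
SHIFT = 2                   # from ADD X10, X10, X11,LSL#2
NCASES = 8                  # range check allows index values 0..7

# HYPOTHETICAL table contents: replace with the real bytes from the database.
table = bytes([0x00, 0x03, 0x05, 0x08, 0x0A, 0x0D, 0x11, 0x17])


def targets(base, table, shift):
    """Return the branch target for every table element, in index order."""
    # LDRB zero-extends, so each element is an unsigned value 0..255.
    return [base + (elem << shift) for elem in table]


result = targets(BASE, table, SHIFT)
assert len(result) == NCASES
for index, ea in enumerate(result):
    print(f"case {index}: {ea:#x}")
```

Because the elements are single unsigned bytes scaled by 4, every target in this pattern is 4-byte aligned and lies at most 255 << 2 = 1020 bytes past the base, which is a quick plausibility check for the values you type into the dialog.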

Now that we have everything, we can specify the pattern by putting the cursor on the indirect branch and invoking Edit > Other > Specify switch idiom…

The values can be specified in C syntax (0x…) or as labels thanks to the expression evaluation feature. Once the dialog is confirmed, we can observe the switch nicely labeled and the function graph updated to include the newly reachable nodes.

We can also use “List cross-references from…” (Ctrl+J) to see the list of targets from the indirect jump.

Additional options

Our example was pretty straightforward but in some cases you can make use of the additional options in the dialog.

  1. separate value table is present: when a two-level table is used, i.e.:

    table_element = jump_table[value_table[input_register]]; instead of the default table_element = jump_table[input_register];

  2. signed jump table elements: when table elements are loaded using a sign-extension instruction, for example LDRSB or LDRSW on ARM or movsx on x86.
  3. Subtract table elements: if the values are subtracted from the base instead of being added (minus sign is used in the formula).
  4. Table element is insn: the “jump table” contains instructions instead of data values. This is used in some architectures which can perform relative jumps using a delta value from the instruction pointer. For example, the legacy ARM jumps using direct PC manipulation:
                    CMP             R3, #7 ; SWITCH ; switch 8 cases             
                    ADDLS           PC, PC, R3,LSL#2 ; switch jump               
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_6684                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               def_6680 ; jumptable 00006680 default case, c
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_6688                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               loc_66A8 ; jumptable 00006680 case 0         
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_668C                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               loc_6854 ; jumptable 00006680 case 1         
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_6690                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               loc_68CC ; jumptable 00006680 case 2         
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_6694                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               loc_695C ; jumptable 00006680 case 3         
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_6698                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               loc_6990 ; jumptable 00006680 case 4         
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_669C                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               loc_69FC ; jumptable 00006680 case 5         
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_66A0                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               def_6680 ; jumptable 00006680 default case, c
    ; ---------------------------------------------------------------------------
                                                                                 
    loc_66A4                                ; CODE XREF: __pthread_manager+1BC↑j 
                    B               loc_699C ; jumptable 00006680 case 7         
    ; ---------------------------------------------------------------------------
    

    Usually in such situations the table “elements” are fixed-size branches to the actual destinations.
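
The effect of the four options on the address calculation can be sketched in plain Python. This is only an illustration under stated assumptions: the function names, parameters and sample table values are hypothetical and do not correspond to any IDA API. The last helper relies on the fact that in ARM (A32) state reading PC yields the address of the current instruction plus 8.

```python
def sign_extend(value, bits):
    """Interpret an unsigned `bits`-wide value as two's complement."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def resolve(index, jump_table, base=0, shift=0, value_table=None,
            signed_bits=None, subtract=False):
    """Compute a switch target with options 1 to 3 applied."""
    # Option 1: a separate value table maps the input to a jump table index.
    if value_table is not None:
        index = value_table[index]
    element = jump_table[index]
    # Option 2: signed elements (e.g. loaded with LDRSB, LDRSW or movsx).
    if signed_bits is not None:
        element = sign_extend(element, signed_bits)
    # Option 3: subtract the scaled element from the base instead of adding.
    delta = element << shift
    return base - delta if subtract else base + delta


def arm_insn_table_target(jump_ea, index):
    """Option 4: ADDLS PC, PC, Rn,LSL#2 located at jump_ea (ARM state)."""
    # PC reads as jump_ea + 8, so index 0 lands two instructions further on.
    return jump_ea + 8 + (index << 2)


# Two-level table: inputs 1 and 4 share the same destination.
assert resolve(4, [0x1000, 0x1040, 0x1080], value_table=[0, 1, 1, 2, 1]) == 0x1040
# Signed byte 0xFE is -2, so the target lies 8 bytes before the base.
assert resolve(0, [0xFE], base=0x2000, shift=2, signed_bits=8) == 0x1FF8
# Subtracting an unsigned element of 2 gives the same address.
assert resolve(0, [2], base=0x2000, shift=2, subtract=True) == 0x1FF8
# Matches the listing above: case 0 is loc_6688, case 7 is loc_66A4.
assert arm_insn_table_target(0x6680, 0) == 0x6688
assert arm_insn_table_target(0x6680, 7) == 0x66A4
```

The last two assertions also explain why loc_6684 is the default case in the listing: when the LS condition fails, ADDLS is not executed and control simply falls through to the next instruction.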

Optional values

Some values are optional, but you can fill them in for a more complete mapping to the original code:

  1. Input register of switch: can be omitted if you only need cross-references for the proper function flow graph, but it has to be specified if you want the decompiler to properly parse and represent the switch.
  2. First(lowest) input value: the value of the input register corresponding to entry 0 of the jump table. In the example above we can see that the range check calculates W8 = W1 - 1, so we could specify a lowest value of 1 (this would also update the comments at the destination addresses to be 1 to 8 instead of 0 to 7).
  3. default jump address: the address executed when the input range check fails (in our example, the destination of the B.HI instruction). It can make the listing and/or decompilation a little neater but is not strictly required otherwise.
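
How the lowest input value and the default address fit together can be shown with a small model of the range check. This is a plain Python sketch with hypothetical names and placeholder string labels instead of real addresses. It assumes the check behaves like the W8 = W1 - 1 calculation followed by an unsigned comparison against 7, as described for the example.

```python
MASK32 = 0xFFFFFFFF  # W registers are 32 bits wide


def case_labels(ncases, lowest=0):
    """Input values that correspond to jump table entries 0..ncases-1."""
    return [lowest + i for i in range(ncases)]


def dispatch(value, targets, lowest=0, default=None):
    """Pick the destination for an input value, or the default if out of range."""
    index = (value - lowest) & MASK32   # subtraction wraps around in 32 bits
    if index > len(targets) - 1:        # unsigned "higher" check, like B.HI
        return default
    return targets[index]


# Placeholder destinations standing in for the eight real target addresses.
targets = [f"entry_{i}" for i in range(8)]

assert case_labels(8, lowest=1) == [1, 2, 3, 4, 5, 6, 7, 8]
assert dispatch(1, targets, lowest=1, default="default") == "entry_0"
assert dispatch(8, targets, lowest=1, default="default") == "entry_7"
# 0 - 1 wraps to 0xFFFFFFFF, which fails the unsigned check.
assert dispatch(0, targets, lowest=1, default="default") == "default"
assert dispatch(9, targets, lowest=1, default="default") == "default"
```

The wrap-around is why a single unsigned comparison is enough to reject both too-small and too-large inputs, and why the case labels shift from 0 to 7 to 1 to 8 once the lowest value is set to 1.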

For even more detailed info about supported switch patterns, see the switch_info_t structure and the uiswitch plugin source code in the SDK. If you encounter a switch which cannot be handled by the standard formula, you can also look into writing a custom jump table handler.