
The stack frame is the part of the stack that is managed by the current function and contains the data that function uses.

Background

The stack frame usually contains data such as:

  • local and temporary variables;
  • incoming arguments (for calling conventions which use stack for passing arguments);
  • saved volatile registers;
  • other bookkeeping information (e.g. the return address on x86).

Because the stack may change unpredictably during execution, the stack frame and its parts do not have a fixed address. Thus, IDA uses a pseudo structure to represent its layout. This structure is very similar to other structures in the Structures view, with a few differences:

  1. The frame structure has no name and is not included in the global Structures list; it can only be reached from the corresponding function;
  2. Instead of offsets from the structure start, offsets from the frame pointer are shown (both positive and negative);
  3. It may contain special members to represent the saved return address and/or saved register area.

Stack frame view

To open the stack frame view:

  • Edit > Functions > Stack variables… or press Ctrl+K while positioned in a function in disassembly (IDA View);
  • Double-click or press Enter on a stack variable in the disassembly or pseudocode.

In this view, you can perform most of the same operations as in the Structures view: 

  1. Define new or change existing stack variables (D);
  2. Rename variables (N);
  3. Create arrays (*) or structure instances (Alt+Q).

Example

Consider this vulnerable program:

#include <stdio.h>
int main () {
    char username[8];
    int allow = 0;
    printf("Enter your username, please: ");
    gets(username); // user inputs "malicious"
    if (grantAccess(username)) {
        allow = 1;
    }
    if (allow != 0) { // has been overwritten by the overflow of the username.
        privilegedAction();
    }
    return 0;
}

Source: CERN Computer Security

When compiled by an old GCC version, it might produce the following assembly:

.text:0000000000400580 main proc near                          ; DATA XREF: _start+1D↑o
.text:0000000000400580
.text:0000000000400580 var_10= byte ptr -10h
.text:0000000000400580 var_4= dword ptr -4
.text:0000000000400580
.text:0000000000400580 ; __unwind {
.text:0000000000400580     push    rbp
.text:0000000000400581     mov     rbp, rsp
.text:0000000000400584     sub     rsp, 10h
.text:0000000000400588     mov     [rbp+var_4], 0
.text:000000000040058F     mov     edi, offset format          ; "Enter your username, please: "
.text:0000000000400594     mov     eax, 0
.text:0000000000400599     call    _printf
.text:000000000040059E     lea     rax, [rbp+var_10]
.text:00000000004005A2     mov     rdi, rax
.text:00000000004005A5     call    _gets
.text:00000000004005AA     lea     rax, [rbp+var_10]
.text:00000000004005AE     mov     rdi, rax
.text:00000000004005B1     call    grantAccess
.text:00000000004005B6     test    eax, eax
.text:00000000004005B8     jz      short loc_4005C1
.text:00000000004005BA     mov     [rbp+var_4], 1
.text:00000000004005C1
.text:00000000004005C1 loc_4005C1:                             ; CODE XREF: main+38↑j
.text:00000000004005C1     cmp     [rbp+var_4], 0
.text:00000000004005C5     jz      short loc_4005D1
.text:00000000004005C7     mov     eax, 0
.text:00000000004005CC     call    privilegedAction
.text:00000000004005D1
.text:00000000004005D1 loc_4005D1:                             ; CODE XREF: main+45↑j
.text:00000000004005D1     mov     eax, 0
.text:00000000004005D6     leave
.text:00000000004005D7     retn
.text:00000000004005D7 ; } // starts at 400580
.text:00000000004005D7 main endp

Opening the stack frame view for main shows the following layout:

[Figure: stack frame view of main, with var_10 at offset -10h and var_4 at offset -4]

By comparing the source code and the disassembly, we can infer that var_10 is username and var_4 is allow. Because the code only takes the address of the start of the buffer, IDA could not detect its full size and created a single-byte variable. To improve this, press * on var_10 and convert it into an array of 8 bytes. We can also rename the variables to their proper names.

Because IDA shows the stack frame layout in the natural memory order (addresses increase towards the bottom), we can immediately see the problem demonstrated by the vulnerable code: the gets function has no bounds checking, so entering a long string can overflow the username buffer and overwrite the allow variable. Since the code is only checking for a non-zero value, this will bypass the check and result in the execution of the privilegedAction function. 
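To make the layout argument concrete, the following Python sketch models the locals area of main as a 16-byte array and mimics the unchecked copy that gets performs. It is only an illustration: the offsets come from the listing above, while the name simulate_gets, the zero-initialized frame, and the clipping at the end of the locals area are assumptions of this model, not something IDA or the original program provides.

```python
import struct

# Frame layout taken from the listing above (offsets relative to RBP).
USERNAME_OFF = -0x10   # var_10, char username[8]
ALLOW_OFF = -0x4       # var_4, int allow
FRAME_SIZE = 0x10      # sub rsp, 10h


def simulate_gets(user_input: bytes) -> int:
    """Return the value of `allow` after an unchecked copy into `username`.

    Hypothetical helper: it models only the 16 bytes of locals and ignores
    the saved RBP and return address that follow them in memory.
    """
    frame = bytearray(FRAME_SIZE)        # zeroed locals, so allow starts at 0
    data = user_input + b"\x00"          # gets() appends a terminating NUL
    start = USERNAME_OFF + FRAME_SIZE    # index of username[0] in the model
    data = data[: FRAME_SIZE - start]    # clip at the end of the locals area
    frame[start:start + len(data)] = data
    # Read allow back as a little-endian 32-bit signed integer.
    (allow,) = struct.unpack_from("<i", frame, ALLOW_OFF + FRAME_SIZE)
    return allow


if __name__ == "__main__":
    for text in (b"malicious", b"A" * 12, b"A" * 13):
        print(len(text), hex(simulate_gets(text)))
```

In this model, allow sits 12 bytes past the start of username (8 bytes of buffer plus 4 bytes of padding), so the nine-character input from the source comment would stop short of it with this particular layout; an input of at least 13 non-zero characters is needed to make allow non-zero.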

Frame offsets and stack variables

As mentioned above, in the stack frame view structure offsets are shown relative to the frame pointer. In some cases, like in the example above, it is an actual processor register (RBP). For example, the variable allow is placed at offset -4 from the frame pointer, and IDA uses this value in the disassembly listing to show the symbolic name instead of the raw numerical offset:

.text:0000000000400580 allow= dword ptr -4
[...]
.text:0000000000400588 mov [rbp+allow], 0
[...]

By pressing # or K on the instruction, you can ask IDA to show you the instruction’s original form:

.text:0000000000400588 mov dword ptr [rbp-4], 0

Press K again to get back to the stack variable representation.
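The substitution can be pictured as a lookup from frame-pointer offsets to names. The Python sketch below is a simplified illustration, not IDA's implementation: FRAME_VARS, ida_number, and format_operand are hypothetical names, and the number formatting only imitates the style seen in the listings above.

```python
# Stack variables of main after renaming: frame-pointer offset -> name.
FRAME_VARS = {-0x10: "username", -0x4: "allow"}


def ida_number(value: int) -> str:
    """Format a non-negative number roughly the way the listings show it."""
    if value < 10:
        return str(value)
    text = f"{value:X}h"
    # Hex numbers starting with a letter get a leading zero (e.g. 0Ch).
    return "0" + text if text[0] in "ABCDEF" else text


def format_operand(base: str, disp: int, symbolic: bool = True) -> str:
    """Render a [base+disp] operand symbolically or in raw form."""
    name = FRAME_VARS.get(disp)
    if symbolic and name is not None:
        return f"[{base}+{name}]"
    sign = "-" if disp < 0 else "+"
    return f"[{base}{sign}{ida_number(abs(disp))}]"


if __name__ == "__main__":
    print(format_operand("rbp", -4))                  # symbolic form
    print(format_operand("rbp", -4, symbolic=False))  # raw form
```

Flipping the symbolic flag plays the role of pressing K: the displacement stays the same and only its presentation changes.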

In other situations the frame pointer can be just an arbitrary location used for convenience (usually a fixed offset from the stack pointer value at function entry). This is typical of binaries compiled with frame pointer omission, a common optimization technique. In such situations, IDA may use an extra delta to compensate for the stack pointer changes in different parts of the function. For example, consider this function:

.text:10001030 sub_10001030 proc near                  ; DATA XREF: sub_100010B0:loc_100010E7↓o
.text:10001030
.text:10001030 LCData= byte ptr -0Ch
.text:10001030 var_4= dword ptr -4
.text:10001030
.text:10001030     sub     esp, 0Ch
.text:10001033     mov     eax, dword_100B2960
.text:10001038     push    esi
.text:10001039     mov     [esp+10h+var_4], eax
.text:1000103D     xor     esi, esi
.text:1000103F     call    ds:GetThreadLocale
.text:10001045     push    7                           ; cchData
.text:10001047     lea     ecx, [esp+14h+LCData]
.text:1000104B     push    ecx                         ; lpLCData
.text:1000104C     push    1004h                       ; LCType
.text:10001051     push    eax                         ; Locale
.text:10001052     call    ds:GetLocaleInfoA
.text:10001058     test    eax, eax
.text:1000105A     jz      short loc_1000107D
.text:1000105C     mov     al, [esp+10h+LCData]
.text:10001060     test    al, al
.text:10001062     lea     ecx, [esp+10h+LCData]
.text:10001066     jz      short loc_1000107D

Here, the explicit frame pointer (ebp) is not used, and IDA arranges the stack frame so that the return address is placed at offset 0:

-00000010 ; Frame size: 10; Saved regs: 0; Purge: 0
-00000010 ;
-00000010
-00000010     db ? ; undefined
-0000000F     db ? ; undefined
-0000000E     db ? ; undefined
-0000000D     db ? ; undefined
-0000000C LCData db ?
-0000000B     db ? ; undefined
-0000000A     db ? ; undefined
-00000009     db ? ; undefined
-00000008     db ? ; undefined
-00000007     db ? ; undefined
-00000006     db ? ; undefined
-00000005     db ? ; undefined
-00000004 var_4 dd ?
+00000000  r  db 4 dup(?)
+00000004
+00000004 ; end of stack variables

To compensate for the changes of the stack pointer (sub esp, 0Ch and the push instructions), the values 10h or 14h have to be added to the stack variable operands. Thanks to this, we can easily see that the instructions at 10001047 and 1000105C refer to the same variable, even though in raw form they use different offsets ([esp+8] and [esp+4]).
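The delta bookkeeping can be reproduced with a short Python sketch. It is a simplified model under stated assumptions: the instruction list is a hand-written abstraction of the listing above, the names LISTING, sp_deltas, and raw_offset are hypothetical, every push is taken to move ESP by 4 bytes, and the call to GetLocaleInfoA is assumed to remove its four arguments (16 bytes) from the stack, as stdcall functions do.

```python
# (address, kind, argument): "sub" grows the frame by `argument` bytes,
# "push" by 4 bytes, "call" removes `argument` bytes of callee-purged args.
LISTING = [
    (0x10001030, "sub", 0x0C),   # sub esp, 0Ch
    (0x10001033, "other", 0),
    (0x10001038, "push", 0),     # push esi
    (0x10001039, "other", 0),    # mov [esp+10h+var_4], eax
    (0x1000103D, "other", 0),
    (0x1000103F, "call", 0),     # GetThreadLocale: no stack arguments
    (0x10001045, "push", 0),     # push 7
    (0x10001047, "other", 0),    # lea ecx, [esp+14h+LCData]
    (0x1000104B, "push", 0),
    (0x1000104C, "push", 0),
    (0x10001051, "push", 0),
    (0x10001052, "call", 16),    # GetLocaleInfoA: assumed to purge 4 args
    (0x10001058, "other", 0),
    (0x1000105A, "other", 0),
    (0x1000105C, "other", 0),    # mov al, [esp+10h+LCData]
]

LCDATA = -0x0C  # frame offset of LCData from the frame listing


def sp_deltas(instructions):
    """Map each address to the distance from ESP to frame offset 0
    (the return address) just before that instruction executes."""
    delta = 0
    result = {}
    for addr, kind, arg in instructions:
        result[addr] = delta
        if kind == "sub":
            delta += arg
        elif kind == "push":
            delta += 4
        elif kind == "call":
            delta -= arg
    return result


def raw_offset(delta: int, var_offset: int) -> int:
    """Raw ESP-relative displacement for a variable at `var_offset`."""
    return delta + var_offset


if __name__ == "__main__":
    deltas = sp_deltas(LISTING)
    for addr in (0x10001047, 0x1000105C):
        print(hex(addr), hex(deltas[addr]), raw_offset(deltas[addr], LCDATA))
```

Both operands resolve to frame offset -0Ch: the delta differs (14h versus 10h) only because one more value is on the stack at 10001047 than at 1000105C.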

Extra information: IDA Help: Stack Variables Window