We’ve mentioned operand representation before, but today we’ll use a specific one to find the Easter egg hidden in post #85.
More specifically, it was this screenshot:
The function surprise calls printf, but the arguments being passed to it seem to all be numbers. Doesn’t printf() usually work with strings? What’s going on?
As you probably know, computers do not actually distinguish numbers from characters – to them they’re all just a set of bits. So it’s all a matter of interpretation or representation. For example, all of the following are represented by the same bit pattern:
- 65 (decimal number)
- 0x41, 41h, H'41 (hexadecimal number)
- 0101 or 101o (octal number)
- 1000001b or 0b1000001 (binary number)
- 'A' (ASCII character)
- WM_COMPACTING (Win32 API constant)

In fact, the listing in the screenshot has been modified from the defaults to make the Easter egg less obvious. Here’s the original version as text:
.text:00401010 ; int surprise(...)
.text:00401010 _surprise proc near ; CODE XREF: _main↑p
.text:00401010
.text:00401010 var_24= dword ptr -24h
.text:00401010 var_20= dword ptr -20h
.text:00401010 _Format= byte ptr -1Ch
.text:00401010 var_18= dword ptr -18h
.text:00401010 var_14= dword ptr -14h
.text:00401010 var_10= dword ptr -10h
.text:00401010 var_C= dword ptr -0Ch
.text:00401010 var_8= dword ptr -8
.text:00401010 var_4= dword ptr -4
.text:00401010
.text:00401010 sub esp, 24h
.text:00401013 mov eax, ___security_cookie
.text:00401018 xor eax, esp
.text:0040101A mov [esp+24h+var_4], eax
.text:0040101E lea eax, [esp+24h+var_24]
.text:00401021 mov dword ptr [esp+24h+_Format], 70747468h
.text:00401029 push eax
.text:0040102A lea eax, [esp+28h+_Format]
.text:0040102E mov [esp+28h+var_18], 2F2F3A73h
.text:00401036 push eax ; _Format
.text:00401037 mov [esp+2Ch+var_14], 2D786568h
.text:0040103F mov [esp+2Ch+var_10], 73796172h
.text:00401047 mov [esp+2Ch+var_C], 6D6F632Eh
.text:0040104F mov [esp+2Ch+var_8], 73252Fh
.text:00401057 mov [esp+2Ch+var_24], 74736165h
.text:0040105F mov [esp+2Ch+var_20], 7265h
.text:00401067 call _printf
.text:0040106C mov ecx, [esp+2Ch+var_4]
.text:00401070 add esp, 8
.text:00401073 xor ecx, esp ; StackCookie
.text:00401075 xor eax, eax
.text:00401077 call @__security_check_cookie@4 ; __security_check_cookie(x)
.text:0040107C add esp, 24h
.text:0040107F retn
.text:0040107F _surprise endp
In hexadecimal it’s almost immediately obvious: the “numbers” are actually short fragments of ASCII text. The code is building strings on the stack piece by piece. This can be made more explicit by converting numbers to the character operand type (shortcut R).
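To make the "same bits, different representation" idea concrete, here is a minimal Python sketch that renders one integer in the notations mentioned in this post. The helper name `representations` is made up for this illustration and is not an IDA API. The character form is assumed to put the most significant byte first, which is consistent with 70747468h being shown as 'ptth'.

```python
def representations(value: int) -> dict:
    """Render one non-negative integer in several notations."""
    # Number of bytes needed to hold the value (at least one).
    size = max(1, (value.bit_length() + 7) // 8)
    return {
        "decimal": str(value),
        "hex": f"{value:X}h",
        "octal": f"{value:o}o",
        "binary": f"{value:b}b",
        # Most significant byte first, like a multi-character constant.
        "char": value.to_bytes(size, "big").decode("ascii", errors="replace"),
    }


for operand in (65, 0x70747468):
    print(representations(operand))
```

The first operand reproduces the 65 / 41h / 'A' example, and the second is the first immediate stored by the surprise function.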
To help you decide whether such an operand type makes sense, IDA shows a preview in the context menu:
This way it’s pretty clear that the “number” is actually a text fragment. After converting all “numbers” to character constants, a pattern begins to emerge:
Due to the little-endian memory organization of the x86 processor family, the individual fragments have to be read backwards (i.e. character literal 'ptth' corresponds to the string fragment "http").
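The backwards reading can also be done mechanically. The Python sketch below replays the immediate stores from the listing into a small byte buffer, writing each value in little-endian order, and then reads the two C strings back. The offsets and constants are transcribed from the listing; the helper names `rebuild` and `c_string` are invented for this example, and every store is assumed to be 32 bits wide.

```python
# (stack variable offset, immediate) pairs transcribed from the listing.
STORES = [
    (-0x1C, 0x70747468),  # _Format
    (-0x18, 0x2F2F3A73),  # var_18
    (-0x14, 0x2D786568),  # var_14
    (-0x10, 0x73796172),  # var_10
    (-0x0C, 0x6D6F632E),  # var_C
    (-0x08, 0x0073252F),  # var_8
    (-0x24, 0x74736165),  # var_24
    (-0x20, 0x00007265),  # var_20
]


def rebuild(stores):
    """Replay 32-bit stores into a buffer; return (lowest offset, bytes)."""
    base = min(offset for offset, _ in stores)
    size = max(offset for offset, _ in stores) - base + 4
    memory = bytearray(size)
    for offset, immediate in stores:
        start = offset - base
        # x86 stores the least significant byte at the lowest address.
        memory[start:start + 4] = immediate.to_bytes(4, "little")
    return base, bytes(memory)


def c_string(memory, base, offset):
    """Read a NUL-terminated ASCII string starting at the given offset."""
    start = offset - base
    end = memory.index(0, start)
    return memory[start:end].decode("ascii")


base, memory = rebuild(STORES)
fmt = c_string(memory, base, -0x1C)   # buffer passed as the format string
arg = c_string(memory, base, -0x24)   # buffer passed as the second argument
print(fmt)
print(arg)
print(fmt % arg)
```

Python's `%` operator stands in for printf here only because the format string contains a single `%s`.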
Now it’s almost obvious what the result is supposed to be, but there is in fact an even easier way to discover it.
Compilers often process short strings in register-sized chunks to implement common C runtime functions inline instead of calling the library function. The decompiler therefore uses heuristics to detect such code patterns and show them as the equivalent function calls again. If we decompile this function, the decompiler reassembles the strings and shows them whole in the pseudocode:
Malware often uses a similar approach of building strings in small pieces (most often character by character) on the stack, because this way the complete string does not appear in the binary and can’t be discovered by simply searching for it. Thanks to the automatic comments shown by IDA for operands without an explicitly assigned type, such strings are usually obvious in the disassembly:
And the decompiler easily recovers the complete string:
void __noreturn start()
{
  char v0[36]; // [esp+0h] [ebp-28h] BYREF

  qmemcpy(v0, "FLAG{STACK-STRINGS-ARE-BEST-STRINGS}", sizeof(v0));
  [...]
}
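To see why a character-by-character stack string escapes a plain string search, consider the following Python sketch. It is not taken from the sample above: it generates hypothetical machine code made of `mov byte ptr [ebp+disp8], imm8` instructions (encoded as C6 45 disp8 imm8), checks that the text is not contiguous in the bytes, and then recovers it with a naive scan. Both function names are invented for this example, and the scan is a toy heuristic that assumes the stores are the only instructions present.

```python
def emit_stack_string(text: str, first_disp: int = -0x28) -> bytes:
    """Hypothetical code: one 'mov byte ptr [ebp+disp8], imm8' per character."""
    code = bytearray()
    for index, char in enumerate(text.encode("ascii")):
        disp = (first_disp + index) & 0xFF  # disp8 is a signed byte
        code += bytes([0xC6, 0x45, disp, char])
    return bytes(code)


def recover_stack_string(code: bytes) -> str:
    """Collect imm8 values from C6 45 disp8 imm8 stores, ordered by offset."""
    stores = {}
    position = 0
    while position + 4 <= len(code):
        if code[position] == 0xC6 and code[position + 1] == 0x45:
            raw = code[position + 2]
            disp = raw - 0x100 if raw >= 0x80 else raw  # sign-extend disp8
            stores[disp] = code[position + 3]
            position += 4
        else:
            position += 1
    return bytes(stores[disp] for disp in sorted(stores)).decode("ascii")


secret = "FLAG{STACK-STRINGS-ARE-BEST-STRINGS}"
code = emit_stack_string(secret)
print(secret.encode("ascii") in code)  # the string is not contiguous in the code
print(recover_stack_string(code))
```

Real samples mix such stores with other instructions and addressing modes, which is why relying on the disassembler comments or the decompiler is usually more practical than a byte-level scan like this one.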
P.S. If you want to play with the Easter egg binary and reproduce the results in this post, download it here: easter2022.zip