Strings in binaries are very useful for the reverse engineer: they often contain messages shown to the user, or sometimes even internal debugging information (function or variable names), so having them displayed in the decompiled code is very helpful.
However, sometimes you may see named variables in the pseudocode even though the disassembly shows the string nicely. Why does this happen, and how can you fix it?
When the decompiler decides whether to display a string literal inline, the main criterion is the attributes of the memory area the string resides in. If the memory is writable, the string is not really constant and may change at run time, so displaying a variable name is more correct. For example, here’s the default pseudocode of a function from a decompressed Linux kernel:
We can see a string literal is displayed as a variable name (aApicIcrReadRet) even though it is a nice-looking string in the disassembly. The mystery can be cleared up if we jump to its definition (e.g. by double-clicking) and inspect the segment properties (Edit > Segment > Edit Segment…, or Alt-S). We can see that the segment is marked as writable:
Why does .rodata (“read-only data”) have write permissions? We can’t say for sure, but the section does include this flag in the ELF headers (readelf output):
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000940
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         ffffffff81000000  00001000
       0000000000628281  0000000000000000  AX       0     0     4096
  [ 2] .notes            NOTE             ffffffff81628284  00629284
       0000000000000204  0000000000000000  AX       0     0     4
  [ 3] __ex_table        PROGBITS         ffffffff81628490  00629488
       0000000000002cdc  0000000000000000   A       0     0     4
  [ 4] .rodata           PROGBITS         ffffffff81800000  0062d000
       0000000000275332  0000000000000000  WA       0     0     4096
<...skipped...>
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)
One possibility is that the section is actually made read-only later in the boot process.
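The W and A letters in the readelf output correspond to bits in the section header's sh_flags field. As a minimal sketch of how that flag could be checked programmatically, assuming the standard ELF values SHF_WRITE = 0x1, SHF_ALLOC = 0x2 and SHF_EXECINSTR = 0x4 (the helper names flag_letters and is_writable are hypothetical, not part of any tool):

```python
# Standard ELF section flag bits (values from the ELF specification).
SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4

# Letters readelf prints for these three flags.
_FLAG_LETTERS = ((SHF_WRITE, "W"), (SHF_ALLOC, "A"), (SHF_EXECINSTR, "X"))


def flag_letters(sh_flags):
    """Return the readelf-style letters for the W/A/X bits of sh_flags."""
    return "".join(letter for bit, letter in _FLAG_LETTERS if sh_flags & bit)


def is_writable(sh_flags):
    """True if the section header marks the section as writable."""
    return bool(sh_flags & SHF_WRITE)


# .rodata in the readelf output has flags "WA", i.e. SHF_WRITE | SHF_ALLOC.
rodata_flags = SHF_WRITE | SHF_ALLOC
print(flag_letters(rodata_flags), is_writable(rodata_flags))
```

A loader that maps section flags to segment permissions would mark such a section as writable, which is consistent with what the segment dialog shows.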
So one solution for our problem is to make sure that the segment has only Read (and possibly Execute) permissions but not Write. If you do that, the string literals from that segment will be displayed inline:
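The same change can probably be scripted instead of using the dialog. The following is a rough IDAPython sketch, not a definitive recipe: it assumes the idc functions get_segm_start, get_segm_attr and set_segm_attr and the SEGATTR_PERM attribute as found in recent IDA versions, and the helper names without_write and make_segment_readonly are hypothetical. The import is placed inside the function because idc is only available when running inside IDA.

```python
# Segment permission bits as defined in the IDA SDK (segment.hpp).
SEGPERM_EXEC = 1
SEGPERM_WRITE = 2
SEGPERM_READ = 4


def without_write(perm):
    """Return perm with the write bit cleared and all other bits untouched."""
    return perm & ~SEGPERM_WRITE


def make_segment_readonly(ea):
    """Clear the write permission of the segment containing ea (IDA only)."""
    import idc  # only importable inside IDA

    start = idc.get_segm_start(ea)
    perm = idc.get_segm_attr(start, idc.SEGATTR_PERM)
    idc.set_segm_attr(start, idc.SEGATTR_PERM, without_write(perm))
```

Inside IDA, calling make_segment_readonly(idc.here()) with the cursor on the string should be equivalent to unticking Write in the segment dialog. The pseudocode may need to be refreshed (F5) before the change becomes visible.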
While changing segment attributes works, it may not be suitable for all cases. For example, some compilers put string constants in the same section as other writable data, so if you change the segment permissions to read-only, the decompiler could produce wrong output for functions that use the writable variables. You may also have the opposite situation: a string in read-only memory is not actually constant but merely has a default value, so it needs to be treated as a variable. In such cases, you can override the attributes of each string variable using the const or volatile type attributes. For example, instead of changing the whole segment’s permissions, you could edit the type of the aApicIcrReadRet variable by pressing Y (change type) and changing its type to const char aApicIcrReadRet[].
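For more than a handful of strings, the type change could be applied from a script. This is a hedged IDAPython sketch: it assumes idc.SetType accepts the same C declaration you would type in the Y dialog, including the unsized array, which has not been verified here. The helper names const_string_decl and mark_string_const are hypothetical.

```python
def const_string_decl(name):
    """Build the C declaration that marks a char array as constant."""
    return "const char %s[];" % name


def mark_string_const(ea, name):
    """Apply the const declaration to the data item at ea (IDA only)."""
    import idc  # only importable inside IDA

    # SetType takes a C declaration string and reports whether it was applied.
    return idc.SetType(ea, const_string_decl(name))
```

For the example above, the call would look like mark_string_const(ea, "aApicIcrReadRet"), where ea is the address of the string.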
With this option, only the edited string literals will be shown inline, and the others remain as variables.
Yet another possibility is to rely on IDA’s analysis of the disassembly and show every string that is marked as a string literal at the disassembly level, regardless of segment permissions. This can be done in the decompiler options (Edit > Plugins > Hex-Rays Decompiler, Options, Analysis Options 1) by turning off the “Print only constant string literals” option.
To change this option for all future databases, see the HO_CONST_STRINGS option in hexrays.cfg.
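If the option is stored as one bit of a combined options bitmask in hexrays.cfg, turning it off amounts to clearing that bit. The arithmetic can be sketched as follows. Both numbers are assumptions used for illustration only: the bit value given for HO_CONST_STRINGS and the sample mask must be checked against the definitions in your own hexrays.cfg.

```python
# ASSUMPTION: bit value of HO_CONST_STRINGS; verify it against the
# definition in your own hexrays.cfg before relying on it.
HO_CONST_STRINGS = 0x0040


def clear_option(options, bit):
    """Return the options bitmask with the given option bit turned off."""
    return options & ~bit


# Hypothetical options mask, used only to show the arithmetic.
sample_options = 0x831FF
print(hex(clear_option(sample_options, HO_CONST_STRINGS)))
```

The resulting value is what you would write back into the configuration file in place of the original mask.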
For more info see the decompiler manual: