Igor’s tip of the week #109: Hex view text encoding – Hex Rays

Written by Igor Skochinsky | Oct 6, 2022

The Hex view is used to display the contents of the database as a hex dump. It is also used during debugging to display memory contents.

By default, it has a part on the right with the textual representation of the data. Usually the text part shows Latin letters, or dots for unprintable characters, but you may also encounter something unusual:

Why is there Chinese among English? Is it a hidden message and the binary actually comes from China?

In fact, the mystery has a very simple explanation: the hex view decodes text using the database’s default encoding, which is usually UTF-8, so any run of bytes that happens to form a valid UTF-8 sequence may be displayed as Chinese, Japanese, Russian, Korean, or even emoji characters. If you prefer to see only plain ASCII text, you can change the encoding using these simple steps:
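To see why this happens, here is a minimal Python sketch (run outside of IDA, not using any IDA API) that decodes a few byte runs as UTF-8. The sample bytes are made up for illustration; the point is only that ordinary-looking binary data can be a well-formed multi-byte sequence.

```python
# Illustrative byte runs; any data containing such sequences decodes the same way.
samples = {
    "cjk": bytes.fromhex("e4b8ade69687"),   # two 3-byte UTF-8 sequences
    "emoji": bytes.fromhex("f09f9880"),     # one 4-byte UTF-8 sequence
    "invalid": bytes.fromhex("ff00"),       # 0xFF never appears in valid UTF-8
}

decoded = {}
for name, raw in samples.items():
    try:
        decoded[name] = raw.decode("utf-8")
    except UnicodeDecodeError:
        # A viewer would typically fall back to a placeholder such as a dot.
        decoded[name] = None
    print(name, raw.hex(" "), "->", decoded[name])
```

The first two runs decode to non-Latin characters even though nothing about the bytes says "text", which is the effect seen in the hex view.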

From the hex view’s context menu, invoke Text > Add encoding…;
Enter “ascii”;
The new encoding will be added to the list and made default, so any bytes not falling into the ASCII range will be shown as unprintable:
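The effect of switching the encoding can be approximated with a short Python sketch. This is not IDA’s actual rendering code: `render_text_column` is a hypothetical helper that decodes one character at a time and prints a dot for anything the chosen encoding cannot decode or that is not printable.

```python
def render_text_column(row: bytes, encoding: str) -> str:
    """Approximate a hex dump's text column for one row of bytes (hypothetical helper)."""
    out = []
    i = 0
    while i < len(row):
        # Try the longest candidate first; UTF-8 needs at most 4 bytes per character.
        for width in (4, 3, 2, 1):
            chunk = row[i:i + width]
            try:
                text = chunk.decode(encoding)
            except UnicodeDecodeError:
                continue
            if len(text) == 1 and text.isprintable():
                out.append(text)
                i += len(chunk)
                break
        else:
            # Nothing decodable or printable starts here: show a dot, skip one byte.
            out.append(".")
            i += 1
    return "".join(out)


row = b"Hi \xe4\xb8\xad\xff\x00"
print(render_text_column(row, "utf-8"))  # Hi 中..
print(render_text_column(row, "ascii"))  # Hi .....
```

With “ascii”, every byte above 0x7F becomes a dot, so the three-byte sequence that UTF-8 showed as a single character turns into three dots.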

Instead of “ascii” you can use another encoding which matches the type of binary you’re analyzing. For example, if you work with legacy Japanese software, encodings like “Shift-JIS”, “cp932” or “EUC-JP” may help you discover otherwise hidden text.