A few days ago I was working on the x86 IDA module. The goal
was to have it recognize jump tables for 64-bit processors.
This is routine: we have to add new instruction idioms to the
analysis engine from time to time to keep up with new compilers.
I was typing in the patterns and hoping
that the tests would pass on the first run.
But one of the patterns puzzled me. It didn’t look right. Such
code should not run reliably; it would crash at random. The reason was that it seemed to use
a register without fully initializing it. Yet I knew that the code worked since it came
from a real-world application. Besides, the code was compiler-generated, and
such code is usually very robust.
The code was using the movzx instruction to copy a value from one register
to another. Something like this:
movzx eax, bl
Here the value in the bl register (8 bits) is copied to the eax register (32 bits).
The upper 24 bits of the eax register are set to zero during the copy.
After that the code was using the rax register (64 bits):
mov eax, offset[rcx+rax*4]
However, the high 32 bits of the rax register are not initialized and may contain anything!
Code like this is doomed to crash… how come it works?!
I think you guessed it: the movzx instruction initializes the whole rax register.
Its companion instruction, movsx, behaves even more strangely. For example,
if rax=-1 and bl=0x80, after the execution of
movsx eax, bl
rax is equal to 0x00000000FFFFFF80.
Igor Skochinsky solved this mystery for me. It turns out that the results of
all 32-bit computations in 64-bit mode are silently zero-extended to 64 bits.
(Note for the future: always read the manuals from the first to the last page! 😉)
I don’t know why 32-bit destinations are singled out (16-bit and 8-bit results
are not zero-extended), but it is nice to know about this peculiarity of x86 processors.
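
To make the rule concrete, here is a small Python model of what happens to rax when a result is written to one of its sub-registers in 64-bit mode. It is only a sketch: the helper names (write_reg, sign_extend, movzx_eax_bl, movsx_eax_bl) are made up for this illustration, and it models only writes to the low part of the register (al, ax, eax, rax), not ah.

```python
MASK64 = (1 << 64) - 1


def write_reg(old_rax, value, width):
    """Model a write of `value` to the low `width` bits of rax (64-bit mode).

    Hypothetical helper for illustration only; covers al/ax/eax/rax, not ah.
    """
    mask = (1 << width) - 1
    value &= mask
    if width in (32, 64):
        # 64-bit write replaces everything; a 32-bit write also clears
        # the upper 32 bits (the implicit zero extension).
        return value
    # 8- and 16-bit writes leave the remaining upper bits untouched.
    return (old_rax & ~mask & MASK64) | value


def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low `from_bits` of `value` to `to_bits` bits."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    signed = (value ^ sign) - sign
    return signed & ((1 << to_bits) - 1)


def movzx_eax_bl(rax, bl):
    """movzx eax, bl: zero-extend 8 bits to 32, then write eax."""
    return write_reg(rax, bl & 0xFF, 32)


def movsx_eax_bl(rax, bl):
    """movsx eax, bl: sign-extend 8 bits to 32, then write eax."""
    return write_reg(rax, sign_extend(bl, 8, 32), 32)


if __name__ == "__main__":
    rax = MASK64  # rax = -1
    print(hex(movsx_eax_bl(rax, 0x80)))     # 0xffffff80
    print(hex(movzx_eax_bl(rax, 0x80)))     # 0x80
    print(hex(write_reg(rax, 0x1234, 16)))  # 0xffffffffffff1234
    print(hex(write_reg(rax, 0x12, 8)))     # 0xffffffffffffff12
```

The first line of output matches the movsx example above: the sign extension stops at bit 31, and the upper half of rax is cleared anyway. The last two lines show the contrast with 16-bit and 8-bit destinations, where the old upper bits survive.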