Halvar and Dennis Elser recently blogged about a serious vulnerability in the ATL libraries. A few days ago, Microsoft released an emergency “out-of-band” patch. Yes, the bug was that nasty, and since it is in a library, many MS Windows components were affected. Everyone who used the library should review their code and recompile with the corrected version.
Microsoft posted more details about the vulnerability. As it turns out, it was quite difficult to catch the bug. Automatic tools, like source code scanners and fuzzers, could not detect it. The very bug consists of a small typo, and it is understandable that it could be missed during code reviews. Humans tend to skip punctuation marks and focus on “important” stuff. However, everything is important in the source code, even spacing.
Source code analysis tools and the compiler could not catch the bug because of a cast in the source code.
Casts are a way to tell the compiler that an object of one kind should be treated as something different. For example, we have a 4 byte object and we can tell the compiler to treat it as a 2 byte object. The opposite is also possible: a 2 byte object can be treated as a 4 byte object. Naturally, when we use a cast, all responsibility is on us. A cast turns off all compiler checks and leaves us face-to-face with the cruel real world of bits and bytes. If we tell the compiler that the object is bigger than it is, all kinds of odd things can happen, including security breaches.
Since casts are that powerful, they should be used very sparingly. My attitude is to use them only there is no other choice. I know that my position differs from the mainstream. Many textbooks teach to use casts to “document” the code. Like, if a function expects a LONG parameter, they advise cast an uint variable to LONG. This is wrong. Let us check the vulnerable code, it is literally ridden with such casts:
__int64 cbSize; hr = pStream->Read((void*) &cbSize, sizeof(cbSize), NULL); BYTE *pbArray; HRESULT hr = SafeArrayAccessData(psa, reinterpret_cast<LPVOID *>(&pbArray)); hr = pStream->Read((void*)&pbArray, (ULONG)cbSize, NULL);
This is a really bad code snippet. It illustrates how casts are overused nowadays, which is a bad coding practice. 100% of statements above have casts; while the ratio of statements with casts to statements without casts should be 1:100 or less. Here’s the same code, with all casts removed (and the erroneous & removed):
__int64 cbSize; hr = pStream->Read(&cbSize, sizeof(cbSize), NULL); BYTE *pbArray; HRESULT hr = SafeArrayAccessData(psa, &pbArray); hr = pStream->Read(pbArray, cbSize, NULL);
I did not try to compile it but it should pass all compiler checks and behave the same way. For me, the version without casts is shorter, simpler, easier to read. Why use any casts at all?
It is true that removing some casts leads to compiler warnings. Assignments that may truncate values, or comparisons of signed/unsigned integers are marked by many compilers as problematic. I would personally just turn off these warnings instead of damaging the source code with casts.
Casts to void* have no raison d’etre at all. I always wonder why anyone would cast an address expression to void*.
It even gets funny. Some programmers go as far as marking function calls with void:
(void)func();
This is their way of telling the reader that the return value of the function is not used. Well, isn’t the following simpler?
func();
C/C++ are not verbose languages after all.
Well, to keep it short, I think the current practices of using casts should be revised. I would opt for using casts only when it is absolutely necessary. By the way, our decompiler generates only necessary casts, with the exception of floating point-integer casts, which are always generated. In some cases even necessary casts are undesired (they obscure the program logic), so there is a command to temporarily hide all casts.
Keeping the source code simpler and shorter will allow us to avoid bugs like this: https://www.microsoft.com/technet/security/bulletin/ms09-034.mspx. Less casts also mean that source code analysis tools will perform better.
Less casts – better code.