On Wed, Nov 11, 2020 at 09:33:00AM +0100, Stefan Kanthak wrote:
> Ouch: that's but not the point here; what matters is the undefined behaviour
> of
>                       ((u) & 0x000000ff) << 24
>
> 0x000000ff is a signed int, so (u) & 0x000000ff is signed too -- and producing
> a negative value (or overflow) from the left-shift of a signed int, i.e.
> shifting into (or beyond) the sign bit, is undefined behaviour!

Only in some language dialects.  It is caught by -fsanitize=shift.

In C++20, if the shift count is within bounds, all signed as well as
unsigned left shifts are well defined.
In C99/C11 there is one extra rule:
  /* For signed x << y, in C99/C11, the following:
     (unsigned) x >> (uprecm1 - y)
     if non-zero, is undefined.  */
and for C++11 to C++17 another one:
  /* For signed x << y, in C++11 and later, the following:
     x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1
     is undefined.  */

So indeed, 0x80 << 24 is UB in C99/C11 and C++98, unclear in C89 and
well defined in C++11 and later.  I don't know if C2X is considering
mandating two's complement and making it well defined like C++20 did.

Guess we should fix that, though because different languages have
different rules, GCC itself, except for sanitization, doesn't consider it
UB and only treats shifts by a negative value or shifts by bitsize or more
as UB.

	Jakub
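
For illustration, the two rules quoted above can be written as small C
predicates.  This is only a sketch: it assumes uprecm1 means the precision
of int minus one (31 for a 32-bit int), and the names UPRECM1,
shl_undefined_c99, shl_undefined_cxx11 and byte_to_top are made up here,
not taken from GCC.  The last helper shows one possible way to sidestep the
problem by shifting in unsigned arithmetic; it is not necessarily the fix
that was applied.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed meaning of "uprecm1": precision of int minus one
   (31 when int is 32 bits wide).  */
enum { UPRECM1 = sizeof (int) * CHAR_BIT - 1 };

/* C99/C11 rule: signed x << y is undefined if
   (unsigned) x >> (uprecm1 - y) is non-zero.  */
static bool
shl_undefined_c99 (int x, unsigned y)
{
  assert (y <= UPRECM1);  /* out-of-range counts are undefined everywhere */
  return ((unsigned) x >> (UPRECM1 - y)) != 0;
}

/* C++11 to C++17 rule: signed x << y is undefined if
   x < 0 || ((unsigned) x >> (uprecm1 - y)) > 1.  */
static bool
shl_undefined_cxx11 (int x, unsigned y)
{
  assert (y <= UPRECM1);
  return x < 0 || ((unsigned) x >> (UPRECM1 - y)) > 1;
}

/* Hypothetical workaround: mask and shift as unsigned, so no signed
   shift happens in any dialect.  */
static unsigned
byte_to_top (unsigned u)
{
  return (u & 0xffu) << 24;
}
```

With a 32-bit int, shl_undefined_c99 (0x80, 24) is true while
shl_undefined_cxx11 (0x80, 24) is false, matching the summary above.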