http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59873
--- Comment #5 from Wesley J. Landaker <wjl at icecavern dot net> ---
(In reply to Marc Glisse from comment #4)
> Seems to be on purpose, see the comment before _cpp_valid_ucn in
> libcpp/charset.c, and the last instruction in that function.
>
> [lex.charset] is a bit hard to read for me.

If I'm reading that comment right, it sounds like the C++11 standard says that something like U'\u0000' should yield a compiler error, just as U'\ud800' (a surrogate) currently does, instead of silently working in an unexpected manner.

Assuming this line of reasoning is correct, my second test program (char32_literal_test.c++) shows that gcc has a bug: it does not properly *reject* any invalid \uXXXX or \UXXXXXXXX except for surrogates.

(As an aside, if this really does violate the C++11 standard, clang has the same bug -- it just behaved in the way I naively expected it to.)