------- Comment #1 from joseph at codesourcery dot com 2008-04-11 16:58 -------
Subject: Re: New: Dubious charset conversions
On Fri, 11 Apr 2008, neil at gcc dot gnu dot org wrote:

> GCC accepts the following with -ansi -pedantic -Wall without diagnostics
>
> #include <stdlib.h>
> wchar_t z[] = L"a" "\xff";
>
> GCC claims a default execution charset of UTF-8; presumably the default
> execution wide character set is UTF-32. But "\xff" is a two-character narrow
> execution character set string literal, with characters \xff \0, which is
> invalid UTF-8 and so cannot be converted in a meaningful way to the execution
> character set (whatever it is).
>
> I would expect the above code to be rejected, or at least diagnosed.

Accepting it as equivalent to L"a\xff" (generating a wide character L'a'
followed by one with value 0xff) seems in accordance with the principles of
N951, the relevant ones of which are implemented in GCC.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n951.htm
http://gcc.gnu.org/ml/gcc-patches/2003-07/msg00532.html

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35908
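
For anyone who wants to check the equivalence described in the comment, here is a minimal sketch in C. It places the declaration from the report next to the single wide literal L"a\xff" and compares the two arrays element by element. The names `expected`, `same_wide_array` and `report_example_matches` are hypothetical, introduced only for this illustration, and the result assumes a compiler that treats the mixed concatenation the way the comment describes GCC doing.

```c
#include <stddef.h>
#include <wchar.h>

/* The declaration from the report: a wide literal concatenated with a
   narrow literal that contains a hex escape. */
static const wchar_t z[] = L"a" "\xff";

/* The single wide literal the comment says it is equivalent to
   (hypothetical name, not from the report). */
static const wchar_t expected[] = L"a\xff";

/* Hypothetical helper: returns 1 if both arrays have the same number
   of elements and the same element values, 0 otherwise. */
static int same_wide_array(const wchar_t *a, size_t na,
                           const wchar_t *b, size_t nb)
{
    size_t i;

    if (na != nb)
        return 0;
    for (i = 0; i < na; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Hypothetical helper: returns 1 if z matches expected, terminating
   null wide character included. */
static int report_example_matches(void)
{
    return same_wide_array(z, sizeof z / sizeof z[0],
                           expected, sizeof expected / sizeof expected[0]);
}
```

Calling `report_example_matches()` from a small `main` and checking that it returns 1 is enough to confirm that the hex escape was taken as a single wide character value rather than being converted from the narrow execution character set.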