------- Additional Comments From geoffk at gcc dot gnu dot org 2005-01-08 02:20 -------

So, just to be clear on this, the translation unit:

  const char * \u00c5 = "a-ring";
  float \u212b = 1e-10;

1. Is a valid translation unit in C99?
2. Invokes undefined behaviour?
3. Requires a diagnostic?

Logically it can only be one of the three.

I think the standard is pretty clear that it's (1); 6.4.2.1 paragraph 6 says, "Any identifiers that differ in a significant character are different identifiers." U+00C5 and U+212B are different characters, so the two declarations name different identifiers. The standard therefore prohibits a compiler from converting Unicode sequences specified with \u to NFC (or any other normal form), since doing so would turn the two identifiers into one.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9449
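
To make the collision concrete, here is a minimal sketch in C. It is not how GCC or any real compiler handles identifiers. The names toy_nfc, same_as_written and same_after_nfc are hypothetical, and toy_nfc knows only the one mapping relevant here (NFC maps U+212B ANGSTROM SIGN to U+00C5). The fragment has no main; it is meant to be called from a small test driver.

```c
#include <assert.h>

/* Toy stand-in for NFC, for illustration only: it knows the single
   singleton mapping discussed here (U+212B ANGSTROM SIGN -> U+00C5)
   and leaves every other code point alone. Real NFC is far larger. */
static unsigned long toy_nfc(unsigned long cp)
{
    assert(cp <= 0x10FFFFUL);   /* must be within the Unicode code space */
    return cp == 0x212BUL ? 0x00C5UL : cp;
}

/* One-character identifiers compared as spelled in the source. */
static int same_as_written(unsigned long a, unsigned long b)
{
    return a == b;
}

/* The same identifiers compared after (toy) normalization. */
static int same_after_nfc(unsigned long a, unsigned long b)
{
    return toy_nfc(a) == toy_nfc(b);
}
```

For the two identifiers in the translation unit above, same_as_written(0x00C5, 0x212B) is false while same_after_nfc(0x00C5, 0x212B) is true. A compiler that normalized would therefore see two file-scope declarations of one identifier with incompatible types.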