[I apologize for breaking the thread; I am currently stuck using a web-mail client that does not permit manual insertion of References: headers. Please don't take this comment as a sign that I am resuming participation in GCC development in general.]

Joseph Myers:
> There are plenty of spare bits in cpp_token to flag extended identifiers
> and handle them specially (as a slow path, marked as such with
> __builtin_expect). There's one bit in the flags byte, two unused bytes
> after it and a whole word not used in the case of identifiers (identifiers
> use a cpp_hashnode * where strings and numbers use a struct cpp_string
> which is bigger)

Especially for C++ which constructs a cpp_token array (sort of) representing the entire translation unit, it is desirable to make cpp_token *smaller* -- and it would be relatively easy to win back that 'whole word not used in the case of identifiers', so I do not like a solution which starts using that word for identifiers. Note that identifiers and punctuators are vastly more common than numbers or strings, in all C-family languages.

Geoff Keating:
> > Adding salt to the wound, of course, is that for C the only difference
> > between an (A) or (B) and a (C) implementation is that a (C)
> > implementation is less expressive: there are some programs, all of
> > which are erroneous and require a diagnostic, that can't be written.
> > So you lose compiler performance just so users have another bullet
> > to shoot their feet with.

Joseph:
> C++ requires (A) and provides examples of valid programs where it can be
> told whether a normalisation of UCNs is part of the implementation-defined
> phase 1 transformation.

I am with Joe Buck in the opinion that even a 1% performance penalty for implementing (A) [or (B)] would be too much -- I suggest this be fixed by convincing the C++ committee to allow (C) and not just by phase 1 transformations, thus allowing the existing implementation to conform.

zw
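
To make the token-size argument above concrete, here is a minimal sketch in C. The struct names (token_wide, token_narrow, str_payload, hashnode) are hypothetical stand-ins, not the real libcpp definitions, and moving the string payload out of line is only one assumed way the spare word could be reclaimed; the thread does not say how it would actually be done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; NOT libcpp's real types. */
struct hashnode;                 /* opaque identifier node (one pointer to it) */

struct str_payload {             /* plays the role of a string/number payload */
    unsigned int len;
    const unsigned char *text;
};

/* Layout 1: the payload union is as big as its largest member, so an
   identifier token carries an unused word next to its node pointer. */
struct token_wide {
    unsigned int src_loc;
    unsigned char type;
    unsigned char flags;
    union {
        struct hashnode *node;   /* identifiers: one word */
        struct str_payload str;  /* strings and numbers: two words */
    } val;
};

/* Layout 2 (assumed approach): strings and numbers keep their payload
   out of line, so every token kind needs only one pointer-sized slot. */
struct token_narrow {
    unsigned int src_loc;
    unsigned char type;
    unsigned char flags;
    union {
        struct hashnode *node;
        const struct str_payload *str;
    } val;
};

/* Bytes reclaimed per token by switching from layout 1 to layout 2. */
size_t token_bytes_saved(void)
{
    return sizeof(struct token_wide) - sizeof(struct token_narrow);
}

/* Bytes reclaimed across an array of n_tokens tokens, e.g. a whole
   translation unit held in memory at once. */
size_t array_bytes_saved(size_t n_tokens)
{
    return n_tokens * token_bytes_saved();
}

/* Sanity check: an identifier's node pointer fills the narrow slot exactly. */
void check_identifier_slot(void)
{
    struct token_narrow t;
    assert(sizeof t.val == sizeof(struct hashnode *));
}
```

Since identifiers and punctuators dominate, the saving applies to nearly every element of the array, while the extra indirection in this sketch is paid only by the rarer string and number tokens. Spending the spare word on an extended-identifier flag would forfeit that saving.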