------- Additional Comments From joseph at codesourcery dot com  2005-02-21 
19:47 -------
Subject: Re:  UCNs not recognized in identifiers
 (c++/c99)

On Mon, 21 Feb 2005, geoffk at geoffk dot org wrote:

> > * These rules apply to identifiers as preprocessing tokens at any
> > time, including before concatenation.  So it is not the case in C99
> > that splitting an identifier anywhere yields two valid preprocessing
> > tokens: the second half could begin with a UCN for a digit and not be
> > a valid identifier.  (Invalid identifiers in C99 don't require
> > diagnostics, but I don't think we want to use this laxity.)
> 
> The second half would be a pp-number, instead.  It is always true that 
> splitting an identifier between characters yields two valid 
> preprocessing tokens.

It would not be a pp-number: a UCN for a digit is still an 
identifier-nondigit rather than a digit in terms of the syntax, and 
pp-numbers can't start with identifier-nondigits.
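
To make the syntactic point concrete, here is a minimal sketch of how the 
start of a preprocessing token could be classified.  The names 
(pp_start, starts_with_ucn, classify_start) are hypothetical and are not 
cpplib functions, and the sketch assumes the "C" locale and ignores any 
implementation-defined identifier characters.  It only looks at syntax: 
whether an identifier that begins with a UCN for a digit (for example 
\u0661, which I believe falls in the C99 Annex D digit ranges) is *valid* 
is a separate check.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical classification of how a preprocessing token may begin. */
enum pp_start { PP_START_NUMBER, PP_START_IDENT, PP_START_OTHER };

/* Return nonzero if s begins with the spelling of a UCN:
   \u plus 4 hex digits, or \U plus 8 hex digits. */
static int starts_with_ucn(const char *s)
{
    size_t i, n;

    if (s[0] != '\\' || (s[1] != 'u' && s[1] != 'U'))
        return 0;
    n = (s[1] == 'u') ? 4 : 8;
    for (i = 0; i < n; i++)
        if (!isxdigit((unsigned char) s[2 + i]))
            return 0;   /* also stops at the terminating NUL */
    return 1;
}

static enum pp_start classify_start(const char *s)
{
    /* A pp-number starts with a digit, or with '.' followed by a digit. */
    if (isdigit((unsigned char) s[0])
        || (s[0] == '.' && isdigit((unsigned char) s[1])))
        return PP_START_NUMBER;

    /* An identifier-nondigit is '_', a basic letter, or any UCN,
       including a UCN that happens to name a digit character. */
    if (s[0] == '_' || isalpha((unsigned char) s[0]) || starts_with_ucn(s))
        return PP_START_IDENT;

    return PP_START_OTHER;
}
```

With this sketch, classify_start("\\u0661x") yields PP_START_IDENT and 
never PP_START_NUMBER, because the text begins with a backslash rather 
than a digit.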

> > * All uses of identifiers and DECL_ASSEMBLER_NAME in the compiler
> > should be audited to determine what sort of identifier is appropriate
> > in each case.
> 
> I don't understand this sentence.  What different sorts of identifiers 
> are there, and how could they be appropriate or not appropriate?

Identifiers found in input, with input spelling.  (Input includes -D and 
-U options on the command line - in principle the command line should be 
interpreted in the user's locale by default just like source files.)

UTF-8 (or, I suppose, UTF-EBCDIC) internally encoded identifiers.

Identifiers in mangled form in any case where they are mangled for output.

Identifiers in diagnostics (possibly including cases where bits of a 
diagnostic get built up with sprintf), which need converting to the user's 
locale for display or to be displayed using UCNs.
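
As an illustration of the "displayed using UCNs" option for diagnostics, 
here is a rough sketch that respells a UTF-8 identifier with UCNs.  The 
function name ident_to_ucn_spelling is made up for this example and is 
not an existing GCC routine.  It assumes the internal form is UTF-8 and, 
for brevity, does not reject overlong encodings or surrogate code points.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copy the UTF-8 identifier IN into OUT, spelling
   each non-ASCII character as a UCN.  Returns 0 on success, or -1 on
   malformed input or if OUT (of size OUTSZ) is too small. */
static int ident_to_ucn_spelling(const char *in, char *out, size_t outsz)
{
    const unsigned char *p = (const unsigned char *) in;
    size_t used = 0;

    while (*p) {
        unsigned long cp;
        int extra, n;

        /* Decode the lead byte and find how many continuation bytes follow. */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;
        p++;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return -1;              /* missing continuation byte */
            cp = (cp << 6) | (*p++ & 0x3F);
        }

        /* ASCII is copied through; everything else becomes \uXXXX or
           \UXXXXXXXX. */
        if (cp < 0x80)
            n = snprintf(out + used, outsz - used, "%c", (int) cp);
        else if (cp <= 0xFFFF)
            n = snprintf(out + used, outsz - used, "\\u%04lX", cp);
        else
            n = snprintf(out + used, outsz - used, "\\U%08lX", cp);
        if (n < 0 || (size_t) n >= outsz - used)
            return -1;                  /* output would be truncated */
        used += (size_t) n;
    }
    if (outsz == 0)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Converting to the user's locale instead would need iconv or similar, 
which this sketch does not attempt.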

I don't know if collect2 might also need to know something about extended 
identifiers.

The aim is that every data structure holding an identifier should have 
its encoding (input, internal, output, diagnostic) well-defined, and that 
conversions between these should be handled properly.
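
One way to picture "well-defined" is to carry the encoding alongside the 
text, so a mismatch shows up at the point of use.  This is only a sketch 
under that assumption: tagged_ident and ident_text_as are hypothetical 
names, not existing GCC types or functions.

```c
#include <stddef.h>

/* The four encodings named above. */
enum ident_encoding {
    IDENT_INPUT,       /* as spelled in the source or on the command line */
    IDENT_INTERNAL,    /* e.g. UTF-8 inside the compiler */
    IDENT_OUTPUT,      /* mangled form written to the assembler */
    IDENT_DIAGNOSTIC   /* user's locale, or UCNs, for messages */
};

/* Hypothetical wrapper pairing identifier text with its encoding. */
struct tagged_ident {
    enum ident_encoding enc;
    const char *text;
};

/* Hand back the text only if it is in the encoding the caller expects;
   otherwise return NULL so the caller must convert explicitly. */
static const char *ident_text_as(const struct tagged_ident *id,
                                 enum ident_encoding want)
{
    return (id != NULL && id->enc == want) ? id->text : NULL;
}
```

A caller that builds a diagnostic would then ask for IDENT_DIAGNOSTIC and 
get NULL back for an internally encoded name, rather than silently 
printing raw UTF-8.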



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9449
