Command-line options, in particular -D options, should be interpreted in the locale character set (possibly subject to an -finput-charset override). At present, however, the expansion of a -D option is not subject to any character set translation.
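To make the missing translation concrete, here is a minimal sketch of converting ISO-8859-1 bytes to UTF-8. The helper name latin1_to_utf8 is hypothetical and is not part of GCC or cpplib; it only illustrates the byte-level effect that translation from an ISO-8859-1 input character set to a UTF-8 execution character set would have.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC or cpplib: convert LEN bytes of
   ISO-8859-1 text in SRC to UTF-8 in DST.  DST must have room for
   2 * LEN bytes.  Returns the number of bytes written.  */
size_t latin1_to_utf8(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII range: identical in both encodings.  */
            dst[out++] = c;
        } else {
            /* U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.  */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Given the single byte 0xA7 (the section sign in ISO-8859-1), this writes the two bytes 0xC2 0xA7, which is the UTF-8 encoding of the same character.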
Consider the program

    char *s = S;

compiled with the following command with LC_CTYPE=en_GB.ISO-8859-1:

    gcc -S -finput-charset=ISO-8859-1 -fexec-charset=UTF-8 -DS=\"§\" t.c

The string in the output program consists of a single byte (0xA7) rather than being translated to UTF-8. But the similar program, encoded in ISO-8859-1,

    char *s = "§";

compiled with the same options, in the same locale, has a properly translated UTF-8 string in the assembly output.

If we get extended identifiers (bug 9449), then the same will apply to the macro names and parameter names in -D and -U options, not just to their expansions.

I think the -D and -U arguments should simply have the same character set translations applied as are done to source files. For C++, this includes the conversion of extended characters to UCNs in phase 1, once that is implemented for source files.

-- 
Summary: -D option handling doesn't account for character sets
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: preprocessor
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jsm28 at gcc dot gnu dot org
CC: gcc-bugs at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20183