Hello- I thought I might ping this short patch please, just in case it may make sense to include in GCC 10 along with the other UTF-8-related fixes to diagnostics. Thanks!
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00915.html -Lewis On Thu, Dec 12, 2019 at 6:21 PM Lewis Hyatt <lhy...@gmail.com> wrote: > > Hello- > > In the original discussion of implementing UTF-8 identifiers > ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224#c26 ), I pointed out > that colorization would corrupt the appearance of certain diagnostics. For > example, this code, with -std=c99: > > ---------- > int ٩x; > ---------- > > Produces: > > t2.cpp:1:5: error: extended character ٩ is not valid at the start of an > identifier > 1 | int ٩x; > | ^ > > The diagnostic location contains only the first byte of the character, so > when colorization is enabled, the ANSI escapes are inserted in the middle > of the UTF-8 sequence and produce corrupted output on the terminal. > > I feel like there are two separate issues here: > > #1. diagnostic_show_locus() should be sure it will not corrupt output in > this way, regardless of what ranges it is given to work with. > > #2. libcpp should probably generate a range that includes the whole UTF-8 > character. Actually in other ways the range seems not ideal, for example > if an invalid character appears in the middle of the identifier, the > diagnostic still points to the first byte of the identifier. > > The attached patch fixes #1. It's essentially a one-line change, plus a > new selftest. Would you please have a look at it sometime? bootstrap > and testsuite were done on linux x86-64. > > Other questions that I have: > > - I am not quite clear when a selftest is preferred vs a dejagnu test. In > this case I stuck with the selftest because color diagnostics don't seem > to work well with dg-error etc, and it didn't seem worth creating a new > plugin-based test like g++.dg/plugin just for this. (I also considered > using the existing g++.dg plugin, but it seems this test should run for > gcc as well.) > > - I wasn't sure if I should create a PR for an issue such as this, if > there is already a patch readily available. And if I did create a PR, > not sure if it's preferred to post the patch to gcc-patches, or as an > attachment to the PR. > > - Does it seem worth me looking into #2? I think the patch to address #1 is > appropriate in any case, because it handles generically all potential > cases where this may arise, but still perhaps the ranges coming out of > libcpp could be improved? > > Thanks... > > -Lewis