Re: [PATCH][RFC] diagnostics: Add support for Unicode drawing characters

Martin Sebor via Gcc-patches Thu, 23 Jul 2020 13:53:26 -0700

On 7/23/20 10:28 AM, Lewis Hyatt via Gcc-patches wrote:

Hello-


The attached patch is complete, including docs, but I tagged it as RFC
because I am not sure if anyone will like it, or if the general reaction may
be closer to recoiling in horror :). I would appreciate your thoughts,
please...


I don't have much of an opinion on the proposed changes but they
remind me of an enhancement I have been thinking about for a while.
I think it would be a nice touch to more finely differentiate parts
of the message text from the rest than just by highlighting it in
bold.  Specifically, I'm thinking of terms of a language grammar,
but a similar approach could be used for other elements as well.

For example, a number of diagnostic messages refer to the term
constant-expression.  A common convention used by language
standards is to render these terms in italics.  Doing the same
in GCC output would make it clear when a message refers to a term
of the grammar (especially for non-hyphenated terms).  Since some
terminals support italics even without UTF-8, this enhancement
could be made independently.

Other font characteristics could be used to differentiate other
"elements" referenced in the messages, such as numerical constants
from ordinary numbers, highlight especially relevant parts of quoted
text like option arguments (for instance, the 9 in "alignment of
%qD will increase in %<-fabi-version=9%>") or attribute arguments
that cannot be underscored, or even be used in hints (e.g.,
strikethrough to denote deletion and underline for insertion).
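
As a rough sketch of the idea, the standard SGR escape parameters
(3 for italic, 9 for crossed-out, 4 for underline) could be attached
to roles in the message text.  The names text_role and styled below
are hypothetical and are not existing GCC interfaces; whether a given
terminal honors these attributes is also an assumption.

```cpp
#include <string>

// Hypothetical roles for parts of a diagnostic message (not a GCC API).
enum class text_role { grammar_term, deletion, insertion };

// Map a role to its ANSI SGR parameter:
// 3 = italic, 9 = crossed-out, 4 = underline.
const char *sgr_code(text_role role) {
  switch (role) {
    case text_role::grammar_term: return "3";
    case text_role::deletion:     return "9";
    case text_role::insertion:    return "4";
  }
  return "0";  // unreachable for valid roles; 0 resets all attributes
}

// Wrap TEXT in the escape sequence for ROLE, or return it unchanged
// when styling is disabled (for example, when not writing to a terminal).
std::string styled(text_role role, const std::string &text, bool enabled) {
  if (!enabled)
    return text;
  return std::string("\033[") + sgr_code(role) + "m" + text + "\033[0m";
}
```

With this, styled (text_role::grammar_term, "constant-expression", true)
would yield the term wrapped in the italic-on and reset sequences.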

Martin


Currently, if a UTF-8 locale is detected, GCC changes the quote characters
it outputs in diagnostics to Unicode directional quotes. I feel like this is
a nice touch, so I was wondering whether GCC shouldn't do more along these
lines. This patch adds support for using Unicode line drawing characters and
similar things when outputting diagnostics. There is a new option
-fdiagnostics-unicode-drawing=[auto|never|always] to control it, which
defaults to auto. "auto" will enable the feature under the same
circumstances that Unicode quotes get output, namely when the locale is
determined by gcc_init_libintl() to support UTF-8. (The new option does not
affect Unicode quote characters, which currently are not configurable and
are determined solely by the locale.)
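
The way the three option values resolve can be sketched roughly as
follows. This is only an illustration: the type and function names are
made up here rather than taken from the patch, and locale_is_utf8
stands in for whatever gcc_init_libintl() determined.

```cpp
#include <string>

// Hypothetical representation of -fdiagnostics-unicode-drawing=...
enum class unicode_drawing { automatic, never, always };

// Parse the option argument; returns false for an unrecognized value
// and leaves OUT untouched in that case.
bool parse_unicode_drawing(const std::string &arg, unicode_drawing &out) {
  if (arg == "auto")
    out = unicode_drawing::automatic;
  else if (arg == "never")
    out = unicode_drawing::never;
  else if (arg == "always")
    out = unicode_drawing::always;
  else
    return false;
  return true;
}

// Decide whether to emit Unicode drawing characters. Only "auto"
// consults the locale; "never" and "always" ignore it.
bool use_unicode_drawing(unicode_drawing mode, bool locale_is_utf8) {
  switch (mode) {
    case unicode_drawing::never:     return false;
    case unicode_drawing::always:    return true;
    case unicode_drawing::automatic: return locale_is_utf8;
  }
  return false;
}
```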

The elements implemented are:

     * Vertical lines, e.g. those indicating labels and those separating the
       source lines from the line numbers, are changed to line drawing
       characters.

     * The diagnostic paths output by the static analyzer make use of line
       drawing characters to output smooth corners etc.

     * The squiggly underline ~~~~~ used to highlight source locations is
       changed to a double underline ═════. The main reason for this is that
       it enables a seamless "tee" character to connect the underline to a
       label line if one exists.

     * Carets (^) are changed to a slightly different character (∧). I think
       the new one is a little nicer looking, although probably not worth the
       trouble on its own. I wanted to implement the support in this patch
       because carets are harder to change than the rest of the elements
       (front ends have an interface to override them, which Fortran
       currently makes use of), so I thought it worthwhile to get this logic
       in place, so that it can easily be changed to a better character in
       the future if one comes up. It would also be easy enough to leave
       the Unicode support in place for carets, but keep the default set to
       the plain one for now.
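
To make the elements above concrete, here is a minimal sketch of a
per-context character table and of how one underline row could be
composed from it. The struct and function names are hypothetical and
do not come from the patch; the sketch assumes the source file and the
output are both UTF-8.

```cpp
#include <string>

// Hypothetical table with one entry per drawing context, so that a
// glyph can be changed in a single place.
struct drawing_chars {
  const char *vertical;   // label lines and the line-number separator
  const char *underline;  // highlights a source range
  const char *caret;      // marks the primary location
  const char *tee;        // joins an underline to the label line below
};

// Plain ASCII has no tee character, so the underline is reused there.
const drawing_chars ascii_chars   = {"|", "~", "^", "~"};
const drawing_chars unicode_chars = {"│", "═", "∧", "╤"};

const drawing_chars &select_chars(bool use_unicode) {
  return use_unicode ? unicode_chars : ascii_chars;
}

// Build the marker row for one range that is WIDTH columns wide.
// CARET_COL and LABEL_COL are offsets within the range, or -1 when the
// range has no caret or no label. The caret wins if both coincide.
std::string underline_row(const drawing_chars &c, int width,
                          int caret_col, int label_col) {
  std::string row;
  for (int i = 0; i < width; ++i) {
    if (i == caret_col)
      row += c.caret;
    else if (i == label_col)
      row += c.tee;
    else
      row += c.underline;
  }
  return row;
}
```

For instance, underline_row (select_chars (true), 9, -1, 4) gives a
nine-column double underline with the tee in the middle column.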

As an example, this diagnostic from gcc.dg/format/diagnostic-ranges.c:

diagnostic-ranges.c:196:28: warning: field width specifier ‘*’ expects argument of type ‘int’, but argument 3 has type ‘long int’ [-Wformat=]
   196 |   __builtin_sprintf (d, " %*ld ", foo + bar, foo);
       |                           ~^~~    ~~~~~~~~~
       |                            |          |
       |                            int        long int

would become instead:

diagnostic-ranges.c:196:28: warning: field width specifier ‘*’ expects argument of type ‘int’, but argument 3 has type ‘long int’ [-Wformat=]
   196 │   __builtin_sprintf (d, " %*ld ", foo + bar, foo);
       │                           ═∧══    ════╤════
       │                            │          │
       │                            int        long int

Hopefully you are viewing this in a terminal that displays it properly :), in
which case you may find it to be an improvement.

Here is a more involved example from the analyzer:

setjmp-5.c: In function ‘outer’:
setjmp-5.c:21:3: warning: ‘longjmp’ called after enclosing function of ‘setjmp’ has returned [-Wanalyzer-stale-setjmp-buffer]
    21 |   longjmp (env, 42); /* { dg-warning "'longjmp' called after enclosing function of 'setjmp' has returned" } */
       |   ^~~~~~~~~~~~~~~~~
   ‘outer’: events 1-2
     |
     |   15 | void outer (void)
     |      |      ^~~~~
     |      |      |
     |      |      (1) entry to ‘outer’
     |......
     |   19 |   inner ();
     |      |   ~~~~~~~~
     |      |   |
     |      |   (2) calling ‘inner’ from ‘outer’
     |
     +--> ‘inner’: event 3
            |
            |   10 | static void inner (void)
            |      |             ^~~~~
            |      |             |
            |      |             (3) entry to ‘inner’
            |
          ‘inner’: event 4
            |
            |   12 |   SETJMP (env);
            |      |   ^~~~~~
            |      |   |
            |      |   (4) ‘setjmp’ called here
            |
     <------+
     |
   ‘outer’: events 5-6
     |
     |   19 |   inner ();
     |      |   ^~~~~~~~
     |      |   |
     |      |   (5) returning to ‘outer’ from ‘inner’
     |   20 |
     |   21 |   longjmp (env, 42); /* { dg-warning "'longjmp' called after enclosing function of 'setjmp' has returned" } */
     |      |   ~~~~~~~~~~~~~~~~~
     |      |   |
     |      |   (6) here
     |

would become instead:

setjmp-5.c: In function ‘outer’:
setjmp-5.c:21:3: warning: ‘longjmp’ called after enclosing function of ‘setjmp’ has returned [-Wanalyzer-stale-setjmp-buffer]
    21 │   longjmp (env, 42); /* { dg-warning "'longjmp' called after enclosing function of 'setjmp' has returned" } */
       │   ∧════════════════
   ‘outer’: events 1-2
     │
     │   15 │ void outer (void)
     │      │      ∧════
     │      │      │
     │      │      (1) entry to ‘outer’
     │......
     │   19 │   inner ();
     │      │   ╤═══════
     │      │   │
     │      │   (2) calling ‘inner’ from ‘outer’
     │
     └──> ‘inner’: event 3
            │
            │   10 │ static void inner (void)
            │      │             ∧════
            │      │             │
            │      │             (3) entry to ‘inner’
            │
          ‘inner’: event 4
            │
            │   12 │   SETJMP (env);
            │      │   ∧═════
            │      │   │
            │      │   (4) ‘setjmp’ called here
            │
     ┌<─────┘
     │
   ‘outer’: events 5-6
     │
     │   19 │   inner ();
     │      │   ∧═══════
     │      │   │
     │      │   (5) returning to ‘outer’ from ‘inner’
     │   20 │
     │   21 │   longjmp (env, 42); /* { dg-warning "'longjmp' called after enclosing function of 'setjmp' has returned" } */
     │      │   ╤════════════════
     │      │   │
     │      │   (6) here
     │
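
The call and return arrows in the two analyzer listings above differ
only in their glyphs, which can be sketched as below. The helper names
are hypothetical and not from the patch, and the widths are simply
read off the example output.

```cpp
#include <string>

// Repeat a glyph (possibly a multi-byte UTF-8 sequence) N times.
std::string repeat(const std::string &glyph, int n) {
  std::string out;
  for (int i = 0; i < n; ++i)
    out += glyph;
  return out;
}

// Arrow leading from the caller's frame into the callee:
// "+-->" in ASCII, a smooth corner followed by a line in Unicode.
std::string call_edge(bool use_unicode) {
  if (use_unicode)
    return "└" + repeat("─", 2) + ">";
  return "+" + repeat("-", 2) + ">";
}

// Arrow leading back to the caller, WIDTH columns wide (WIDTH >= 3).
// The Unicode form spends its first column on a corner, so it has one
// fewer horizontal segment than the ASCII form of the same width.
std::string return_edge(bool use_unicode, int width) {
  if (use_unicode)
    return "┌<" + repeat("─", width - 3) + "┘";
  return "<" + repeat("-", width - 2) + "+";
}
```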


Although probably premature, bootstrap and regtest were done on x86-64
Linux, with all existing tests the same before/after and the new tests
passing:

                 before    after
FAIL                 96       96
PASS             479090   479239
UNSUPPORTED       11946    11946
UNTESTED            194      194
XFAIL              1839     1839
XPASS                36       36

I tried to set this up as a general framework; at the least, the characters
used for the various contexts can be changed in one place, so that if people
like the general idea but not some of the specifics, the patch is easily
modified, now or in the future. Thanks for any feedback!

-Lewis
