Martin: on the way back from Cauldron I had a go at implementing new output formats for the warnings from gimple-ssa-sprintf.c, to try to better indicate to the end user what the problem is. My plan was to implement some of the ASCII art ideas we've been discussing. However, to try to understand what the pass is doing, I tried visualizing the existing data structures, and I ended up liking the output of that so much, that I think that it is what we ought to show the end-user.
Consider the examples from PR middle-end/77696: char d[4]; typedef __SIZE_TYPE__ size_t; extern int sprintf (char*, const char*, ...); extern int snprintf (char*, size_t, const char*, ...); void f (int i) { // bounded, definite truncation in a directve snprintf (d, sizeof d, "%i", 1235); // bounded, definite truncation copying format string snprintf (d, sizeof d, "%iAB", 123); // unbounded, definite overflow in a directve sprintf (d, "%i", 1235); // unbounded, definite overflow copying format string sprintf (d, "%iAB", 123); // bounded, possible truncation a directve snprintf (d, sizeof d, "%i", i); // bounded, possible overflow copying format string snprintf (d, sizeof d, "%iAB", i); // unbounded, possible overflow in a directve sprintf (d, "%i", i); // unbounded, possible overflow copying format string sprintf (d, "%iAB", 123); // unbounded, possible overflow copying format string const char *s = i ? "123" : "1234"; sprintf (d, "%sAB", s); } extern char buf_10[80]; extern char tmpdir[80]; extern char fname[8]; void g (int num) { sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num); } trunk currently emits: zz.c: In function ‘f’: zzz.c:12:29: warning: ‘snprintf’ output truncated before the last format character [-Wformat-truncation=] 12 | snprintf (d, sizeof d, "%i", 1235); | ^ zzz.c:12:3: note: ‘snprintf’ output 5 bytes into a destination of size 4 12 | snprintf (d, sizeof d, "%i", 1235); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ zzz.c:15:30: warning: ‘AB’ directive output truncated writing 2 bytes into a region of size 1 [-Wformat-truncation=] 15 | snprintf (d, sizeof d, "%iAB", 123); | ~^ zzz.c:15:3: note: ‘snprintf’ output 6 bytes into a destination of size 4 15 | snprintf (d, sizeof d, "%iAB", 123); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ zzz.c:18:18: warning: ‘sprintf’ writing a terminating nul past the end of the destination [-Wformat-overflow=] 18 | sprintf (d, "%i", 1235); | ^ zzz.c:18:3: note: ‘sprintf’ output 5 bytes into a destination of size 4 18 | sprintf (d, "%i", 1235); | ^~~~~~~~~~~~~~~~~~~~~~~ zzz.c:21:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1 [-Wformat-overflow=] 21 | sprintf (d, "%iAB", 123); | ~^ zzz.c:21:3: note: ‘sprintf’ output 6 bytes into a destination of size 4 21 | sprintf (d, "%iAB", 123); | ^~~~~~~~~~~~~~~~~~~~~~~~ zzz.c:33:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1 [-Wformat-overflow=] 33 | sprintf (d, "%iAB", 123); | ~^ zzz.c:33:3: note: ‘sprintf’ output 6 bytes into a destination of size 4 33 | sprintf (d, "%iAB", 123); | ^~~~~~~~~~~~~~~~~~~~~~~~ zzz.c:37:19: warning: ‘AB’ directive writing 2 bytes into a region of size between 0 and 1 [-Wformat-overflow=] 37 | sprintf (d, "%sAB", s); | ~^ zzz.c:37:3: note: ‘sprintf’ output between 6 and 7 bytes into a destination of size 4 37 | sprintf (d, "%sAB", s); | ^~~~~~~~~~~~~~~~~~~~~~ zzz.c: In function ‘g’: zzz.c:46:24: warning: ‘/’ directive writing 1 byte into a region of size between 0 and 79 [-Wformat-overflow=] 46 | sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num); | ^ zzz.c:46:3: note: ‘sprintf’ output between 9 and 105 bytes into a destination of size 80 46 | sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ With this patch kit, gcc emits: zzz.c: In function ‘f’: zzz.c:12:13: warning: ‘snprintf’ will truncate the output (5 bytes) to size 4 [-Wformat-truncation=] 12 | snprintf (d, sizeof d, "%i", 1235); | ^ ^~^ | | | | | | | 1 byte (for NUL terminator) | | 4 bytes | capacity: 4 bytes zzz.c:1:6: note: destination declared here 1 | char d[4]; | ^ zzz.c:15:13: warning: ‘snprintf’ will truncate the output (6 bytes) to size 4 [-Wformat-truncation=] 15 | snprintf (d, sizeof d, "%iAB", 123); | ^ ^~^~^ | | | | | | | | | 1 byte (for NUL terminator) | | | 2 bytes | | 3 bytes | capacity: 4 bytes zzz.c:1:6: note: destination declared here 1 | char d[4]; | ^ zzz.c:18:12: warning: buffer overflow: ‘sprintf’ will write 5 bytes into a destination of size 4 [-Wformat-overflow=] 18 | sprintf (d, "%i", 1235); | ^ ^~^ | | | | | | | 1 byte (for NUL terminator) | | 4 bytes | capacity: 4 bytes zzz.c:1:6: note: destination declared here 1 | char d[4]; | ^ zzz.c:21:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a destination of size 4 [-Wformat-overflow=] 21 | sprintf (d, "%iAB", 123); | ^ ^~^~^ | | | | | | | | | 1 byte (for NUL terminator) | | | 2 bytes | | 3 bytes | capacity: 4 bytes zzz.c:1:6: note: destination declared here 1 | char d[4]; | ^ zzz.c:33:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a destination of size 4 [-Wformat-overflow=] 33 | sprintf (d, "%iAB", 123); | ^ ^~^~^ | | | | | | | | | 1 byte (for NUL terminator) | | | 2 bytes | | 3 bytes | capacity: 4 bytes zzz.c:1:6: note: destination declared here 1 | char d[4]; | ^ zzz.c:37:12: warning: buffer overflow: ‘sprintf’ will write between 6 and 7 bytes into a destination of size 4 [-Wformat-overflow=] 37 | sprintf (d, "%sAB", s); | ^ ^~^~^ | | | | | | | | | 1 byte (for NUL terminator) | | | 2 bytes | | 3...4 bytes | capacity: 4 bytes zzz.c:1:6: note: destination declared here 1 | char d[4]; | ^ zzz.c: In function ‘g’: zzz.c:46:12: warning: buffer overflow: ‘sprintf’ will write between 9 and 105 bytes into a destination of size 80 [-Wformat-overflow=] 46 | sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num); | ^~~~~~ ^^~^^~^^~^~~~^ | | || || || | | | | || || || | 1 byte (for NUL terminator) | | || || || 4 bytes | | || || |1...11 bytes | | || || 1 byte | | || |0...7 bytes | | || 1 byte | | |0...79 bytes | | 1 byte | capacity: 80 bytes zzz.c:40:13: note: destination declared here 40 | extern char buf_10[80]; | ^~~~~~ The above is missing colorization, which can be seen at: https://dmalcolm.fedorapeople.org/gcc/2018-09-14/zzz.html where the secondary locations within the format string alternate colors between odd and even secondary locations. The above output format is rather verbose, but I think that it is appropriate for this warning. Rationale: What is the action that an end-user will want to take when encountering a warning? I think they will want to: (a) review it and decide: (a.1) "Is this a "true" result?" (a.2) "Do I care?" (b) if they decide it's genuine: "How do I fix this?" (i.e. we should aim for our diagnostics to be "actionable") Consider the sprintf/snprintf warnings. For the user to easily take the above actions, the most helpful thing we can do is to provide the user with clear information on how much data the format directives will write when the code runs, and to compare that with the size of the destination buffer. I think the typical fix the user will apply will be to increase the size of the buffer, or to convert sprintf to snprintf: a buffer overflow is typically very bad, whereas truncating a string can often be much less serious. A less common fix will be to add or tweak width specifiers to specific format directives. To help the end-user to make these kinds of decisions, these warnings need to clearly summarize the sizes of *all* of the various writes, and compare them to the size of the destination buffer. As I understand it, the current implementation iterates through the format directives, tracking the possible ranges of data that could have been written so far. It emits a warning at the first directive that can have an overflow, and then stops analyzing the directives at that callsite. The existing approach thus seems to focus on where the first overflow can happen. My proposed new approach analyzes all of the directives up-front, and if an overflow can occur, then the warning that's emitted shows all of the various sizes of writes that will happen. Hence the reimplementation doesn't bother showing the point where the overflow happens; instead it summarizes all of what will be written, to help the user in deciding how to fix their code. (Doing so reduces some of the combinatorial explosion in possible output messages). The patch also puts the words "buffer overflow" at the front of the message where appropriate (to better grab the user's attention). It also calls out NUL terminators, since people tend to make mistakes with them. There's a lot of FIXMEs in the resulting patch, but I wanted to run the idea past you before fixing all the FIXMEs. In particular, I haven't attempted to bootstrap and regression-test the new implementation. (I suspect many of the existing tests are broken by this...). Consider this a prototype. [the interesting part of the patch kit is in patch 5/5] Thoughts? Dave David Malcolm (5): substring-locations: add class format_string_diagnostic_t Add range_idx param to range_label::get_text gimple-ssa-sprintf.c: move struct call_info out of the dom_walker subclass Add pp_humanized_limit and pp_humanized_range gimple-ssa-sprintf.c: rewrite overflow/truncation diagnostic (PR middle-end/77696) gcc/c-family/c-format.c | 32 +- gcc/c/c-objc-common.c | 2 +- gcc/c/c-typeck.c | 4 +- gcc/cp/error.c | 2 +- gcc/diagnostic-show-locus.c | 19 +- gcc/gcc-rich-location.h | 4 +- gcc/gimple-ssa-sprintf.c | 670 ++++++++++++--------------- gcc/pretty-print.c | 181 ++++++++ gcc/pretty-print.h | 4 + gcc/substring-locations.c | 113 +++-- gcc/substring-locations.h | 74 +-- gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c | 252 ++++++++++ gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c | 57 +++ libcpp/include/line-map.h | 6 +- 14 files changed, 927 insertions(+), 493 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-1.c create mode 100644 gcc/testsuite/gcc.dg/sprintf-diagnostics-2.c -- 1.8.5.3