Martin: on the way back from Cauldron I had a go at implementing new
output formats for the warnings from gimple-ssa-sprintf.c, to try to
better indicate to the end user what the problem is. My plan was to
implement some of the ASCII art ideas we've been discussing. However,
to try to understand what the pass is doing, I tried visualizing the
existing data structures, and I ended up liking the output of that so much,
that I think that it is what we ought to show the end-user.
Consider the examples from PR middle-end/77696:
char d[4];
typedef __SIZE_TYPE__ size_t;
extern int sprintf (char*, const char*, ...);
extern int snprintf (char*, size_t, const char*, ...);
void f (int i)
{
// bounded, definite truncation in a directve
snprintf (d, sizeof d, "%i", 1235);
// bounded, definite truncation copying format string
snprintf (d, sizeof d, "%iAB", 123);
// unbounded, definite overflow in a directve
sprintf (d, "%i", 1235);
// unbounded, definite overflow copying format string
sprintf (d, "%iAB", 123);
// bounded, possible truncation a directve
snprintf (d, sizeof d, "%i", i);
// bounded, possible overflow copying format string
snprintf (d, sizeof d, "%iAB", i);
// unbounded, possible overflow in a directve
sprintf (d, "%i", i);
// unbounded, possible overflow copying format string
sprintf (d, "%iAB", 123);
// unbounded, possible overflow copying format string
const char *s = i ? "123" : "1234";
sprintf (d, "%sAB", s);
}
extern char buf_10[80];
extern char tmpdir[80];
extern char fname[8];
void g (int num)
{
sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
}
trunk currently emits:
zz.c: In function ‘f’:
zzz.c:12:29: warning: ‘snprintf’ output truncated before the last format
character [-Wformat-truncation=]
12 | snprintf (d, sizeof d, "%i", 1235);
| ^
zzz.c:12:3: note: ‘snprintf’ output 5 bytes into a destination of size 4
12 | snprintf (d, sizeof d, "%i", 1235);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:15:30: warning: ‘AB’ directive output truncated writing 2 bytes into a
region of size 1 [-Wformat-truncation=]
15 | snprintf (d, sizeof d, "%iAB", 123);
| ~^
zzz.c:15:3: note: ‘snprintf’ output 6 bytes into a destination of size 4
15 | snprintf (d, sizeof d, "%iAB", 123);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:18:18: warning: ‘sprintf’ writing a terminating nul past the end of the
destination [-Wformat-overflow=]
18 | sprintf (d, "%i", 1235);
| ^
zzz.c:18:3: note: ‘sprintf’ output 5 bytes into a destination of size 4
18 | sprintf (d, "%i", 1235);
| ^~~~~~~~~~~~~~~~~~~~~~~
zzz.c:21:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1
[-Wformat-overflow=]
21 | sprintf (d, "%iAB", 123);
| ~^
zzz.c:21:3: note: ‘sprintf’ output 6 bytes into a destination of size 4
21 | sprintf (d, "%iAB", 123);
| ^~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:33:19: warning: ‘AB’ directive writing 2 bytes into a region of size 1
[-Wformat-overflow=]
33 | sprintf (d, "%iAB", 123);
| ~^
zzz.c:33:3: note: ‘sprintf’ output 6 bytes into a destination of size 4
33 | sprintf (d, "%iAB", 123);
| ^~~~~~~~~~~~~~~~~~~~~~~~
zzz.c:37:19: warning: ‘AB’ directive writing 2 bytes into a region of size
between 0 and 1 [-Wformat-overflow=]
37 | sprintf (d, "%sAB", s);
| ~^
zzz.c:37:3: note: ‘sprintf’ output between 6 and 7 bytes into a destination of
size 4
37 | sprintf (d, "%sAB", s);
| ^~~~~~~~~~~~~~~~~~~~~~
zzz.c: In function ‘g’:
zzz.c:46:24: warning: ‘/’ directive writing 1 byte into a region of size
between 0 and 79 [-Wformat-overflow=]
46 | sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
| ^
zzz.c:46:3: note: ‘sprintf’ output between 9 and 105 bytes into a destination
of size 80
46 | sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With this patch kit, gcc emits:
zzz.c: In function ‘f’:
zzz.c:12:13: warning: ‘snprintf’ will truncate the output (5 bytes) to size 4
[-Wformat-truncation=]
12 | snprintf (d, sizeof d, "%i", 1235);
| ^ ^~^
| | | |
| | | 1 byte (for NUL terminator)
| | 4 bytes
| capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
| ^
zzz.c:15:13: warning: ‘snprintf’ will truncate the output (6 bytes) to size 4
[-Wformat-truncation=]
15 | snprintf (d, sizeof d, "%iAB", 123);
| ^ ^~^~^
| | | | |
| | | | 1 byte (for NUL terminator)
| | | 2 bytes
| | 3 bytes
| capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
| ^
zzz.c:18:12: warning: buffer overflow: ‘sprintf’ will write 5 bytes into a
destination of size 4 [-Wformat-overflow=]
18 | sprintf (d, "%i", 1235);
| ^ ^~^
| | | |
| | | 1 byte (for NUL terminator)
| | 4 bytes
| capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
| ^
zzz.c:21:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a
destination of size 4 [-Wformat-overflow=]
21 | sprintf (d, "%iAB", 123);
| ^ ^~^~^
| | | | |
| | | | 1 byte (for NUL terminator)
| | | 2 bytes
| | 3 bytes
| capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
| ^
zzz.c:33:12: warning: buffer overflow: ‘sprintf’ will write 6 bytes into a
destination of size 4 [-Wformat-overflow=]
33 | sprintf (d, "%iAB", 123);
| ^ ^~^~^
| | | | |
| | | | 1 byte (for NUL terminator)
| | | 2 bytes
| | 3 bytes
| capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
| ^
zzz.c:37:12: warning: buffer overflow: ‘sprintf’ will write between 6 and 7
bytes into a destination of size 4 [-Wformat-overflow=]
37 | sprintf (d, "%sAB", s);
| ^ ^~^~^
| | | | |
| | | | 1 byte (for NUL terminator)
| | | 2 bytes
| | 3...4 bytes
| capacity: 4 bytes
zzz.c:1:6: note: destination declared here
1 | char d[4];
| ^
zzz.c: In function ‘g’:
zzz.c:46:12: warning: buffer overflow: ‘sprintf’ will write between 9 and 105
bytes into a destination of size 80 [-Wformat-overflow=]
46 | sprintf (buf_10, "/%s/%s-%i.tmp", tmpdir, fname, num);
| ^~~~~~ ^^~^^~^^~^~~~^
| | || || || | |
| | || || || | 1 byte (for NUL terminator)
| | || || || 4 bytes
| | || || |1...11 bytes
| | || || 1 byte
| | || |0...7 bytes
| | || 1 byte
| | |0...79 bytes
| | 1 byte
| capacity: 80 bytes
zzz.c:40:13: note: destination declared here
40 | extern char buf_10[80];
| ^~~~~~
The above is missing colorization, which can be seen at:
https://dmalcolm.fedorapeople.org/gcc/2018-09-14/zzz.html
where the secondary locations within the format string alternate
colors between odd and even secondary locations.
The above output format is rather verbose, but I think that it is
appropriate for this warning.
Rationale: What is the action that an end-user will want to take when
encountering a warning?
I think they will want to:
(a) review it and decide:
(a.1) "Is this a "true" result?"
(a.2) "Do I care?"
(b) if they decide it's genuine: "How do I fix this?"
(i.e. we should aim for our diagnostics to be "actionable")
Consider the sprintf/snprintf warnings. For the user to easily take the
above actions, the most helpful thing we can do is to provide the user
with clear information on how much data the format directives will write
when the code runs, and to compare that with the size of the destination buffer.
I think the typical fix the user will apply will be to increase the size
of the buffer, or to convert sprintf to snprintf: a buffer overflow is
typically very bad, whereas truncating a string can often be much less
serious. A less common fix will be to add or tweak width specifiers to
specific format directives.
To help the end-user to make these kinds of decisions, these warnings
need to clearly summarize the sizes of *all* of the various writes,
and compare them to the size of the destination buffer.
As I understand it, the current implementation iterates through the
format directives, tracking the possible ranges of data that could have
been written so far. It emits a warning at the first directive that can
have an overflow, and then stops analyzing the directives at that callsite.
The existing approach thus seems to focus on where the first overflow
can happen.
My proposed new approach analyzes all of the directives up-front, and if
an overflow can occur, then the warning that's emitted shows all of the
various sizes of writes that will happen.
Hence the reimplementation doesn't bother showing the point where the
overflow happens; instead it summarizes all of what will be written, to
help the user in deciding how to fix their code.
(Doing so reduces some of the combinatorial explosion in possible output
messages).
The patch also puts the words "buffer overflow" at the front of the
message where appropriate (to better grab the user's attention).
It also calls out NUL terminators, since people tend to make mistakes
with them.
There's a lot of FIXMEs in the resulting patch, but I wanted to run the
idea past you before fixing all the FIXMEs. In particular, I haven't
attempted to bootstrap and regression-test the new implementation.
(I suspect many of the existing tests are broken by this...). Consider
this a prototype.
[the interesting part of the patch kit is in patch 5/5]
Thoughts?