On 3/9/2022 5:39 PM, Siddhesh Poyarekar wrote:
The size argument larger than size of SRC for strnlen and strndup is
problematic only if SRC is not NULL terminated, which invokes undefined
behaviour. In all other cases, as long as SRC is large enough to have a
NULL char (i.e. size 1 or more), a larger N should not invoke a warning
during compilation.
Such a warning may be a suitable check for the static analyzer instead
with slightly different wording suggesting that choice of size argument
makes the function call equivalent to strlen/strdup.
This change results in the following code going through without a
warning:
------------------
char *buf;
char *
foo (void)
{
buf = __builtin_malloc (4);
__builtin_memset (buf, 'A', 4);
return __builtin_strndup (buf, 5);
}
int
main ()
{
__builtin_printf ("%s\n", foo ());
}
------------------
but the problem above is a missing NULL, not N being larger than the
size of SRC and the overread warning in this context is confusing at
best and misleading (and hinting at the wrong solution) in the worst
case.
gcc/ChangeLog:
middle-end/104854
* gimple-ssa-warn-access.cc (check_access<GimpleOrTree>):
New parameter. Skip warning if in read-only mode, source string
is NULL terminated and has non-zero object size.
(check_access<gimple *>): New parameter.
(check_access<tree>): Adjust.
(check_read_access): New parameter. Adjust for check_access
change.
(pass_waccess::check_builtin): Adjust check_read_access call for
memcmp, memchr.
(pass_waccess::maybe_check_access_sizes): Likewise.
gcc/testsuite/ChangeLog:
middle-end/104854
* gcc.dg/Wstringop-overread.c
(test_strnlen_array, test_strndup_array): Don't expect warning
for non-zero source sizes.
* gcc.dg/attr-nonstring-4.c (strnlen_range): Likewise.
* gcc.dg/pr78902.c: Likewise.
* gcc.dg/warn-strnlen-no-nul.c: Likewise.
I know this is old and the BZ has been set as CLOSED/INVALID. But it
was in my TODO list, and I've got thoughts here so I might as well
chime in ;-)
The potential overread warning for that code seems quite reasonable to
me. Yes it is the case that the length argument is sometimes
unrelated to the source string. But even then where's the line for when
we should and should not warn?
In my mind these middle end warnings can not and will not ever be 100%
accurate. Their goal is to say "hey, can someone please look at this
code very closely because it might be wrong" -- and that's probably
about the best we can do since many of these are trying to infer intent
of the programmer.
It almost feels like we want not just to have finer grained control over
this particular class of wrnings, but even within the class we may want
levels (and just to be clear, I'm not generally a fan of leveled
warnings as the higher levels rarely get used in practice). The
different levels could correspond to our ability to discern the source
of the length operand -- ie, it is related to the source operand, some
other operand or a constant and we could potentially treat each of those
cases differently.
Jeff