On Wed, Mar 16, 2022 at 06:54:51AM +0530, Siddhesh Poyarekar wrote:
> On 16/03/2022 02:06, Martin Sebor wrote:
> > The intended use of the strncmp bound is to limit the comparison to
> > at most the size of the arrays or (in a subset of cases) the length
> > of an initial substring. Providing an arbitrary bound that's not
> > related to the sizes as you describe sounds very much like a misuse.
>
> Nothing in the standard says that the bound is related to the sizes of input
> buffers. I don't think deducing that intent makes sense either, nor
> concluding that any other use case is misuse.
>
> > As a historical note, strncmp was first introduced in UNIX v7 where
> > its purpose, alongside strncpy, was to manipulate (potentially)
> > unterminated character arrays like file names stored in fixed size
> > arrays (typically 14 bytes). Strncpy would fill the buffers with
> > ASCII data up to their size and pad the rest with nuls only if there
> > was room.
> >
> > Strncmp was then used to compare these potentially unterminated
> > character arrays (e.g., archive headers in ld and ranlib). The bound
> > was the size of the fixed size array. Its other use case was to compare
> > leading portions of strings (e.g, when looking for an environment
> > variable or when stripping "./" from path names).
>
> Thanks for sharing the historical perspective.
>
> > Since the early UNIX days, both strncpy and to a lesser extent strncmp
> > have been widely misused and, along with many other functions in
> > <string.h>, a frequent source of bugs due to common misunderstanding
> > of their intended purpose. The aim of these warnings is to detect
> > the common (and sometimes less common) misuses and bugs.
>
> They're all valid uses however since they do not violate the standard. If we
> find at compile time that the strings don't terminate at the bounds,
> emitting the warning is OK but the more pessimistic check seems like
> overkill.
I think I agree, I've tried
#include <string.h>
char a[] = "abc";
char b[] = "abcd";
int f (void)
{
return strncmp (a, b, 8);
}
where I get
t.c:7:10: warning: ‘strncmp’ specified bound 8 exceeds source size 5
[-Wstringop-overread]
7 | return strncmp (a, b, 8); // -Wstringop-overread
| ^~~~~~~~~~~~~~~~~
even without -Wall. strncmp sees that a[3] is '\0' so it stops comparing
and there's no UB.
GCC 11 didn't emit that, so +1 for dialing this warning down.
> > I haven't seen these so I can't very well comment on them. But I can
> > assure you that warning for the code above is intentional. Whether
> > or not the arrays are nul-terminated, the expected way to call
> > the function is with a bound no greater than their size (some coding
> > guidelines are explicit about this; see for example the CERT C Secure
> > Coding standard rule ARR38-C).
> >
> > (Granted, the manual makes it sound like -Wstringop-overread only
> > detects provable past-the-end reads. That's a mistake in
> > the documentation that should be fixed. The warning was never quite
> > so limited, nor was it intended to be.)
>
> The contention is not that it's not provable, it's more that it's doesn't
> even pass the "based on available information this is definitely buggy"
> assertion, making it more a strong suggestion than a warning that something
> is definitely amiss. Which is why IMO it is more suitable as an analyzer
> check than a warning.
>
> Thanks,
> Siddhesh
>
Marek