https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86259

--- Comment #18 from Martin Sebor <msebor at gcc dot gnu.org> ---
> Similarly for (char *) conversions as performed by str* calls,
> 6.3.2.3/7 applies which says "When a pointer to _an object_ is
> converted to a pointer to a character type, the result points
> to the lowest addressed byte of the object.  Successive increments
> of the result, up to the size of the object, yield pointers to the
> remaining bytes of the object."

The intent of this text is to require implementations to store objects at
consecutive memory locations (as opposed to, for instance, interleaving them
somehow).  It's not a license for programs to disregard restrictions on
accessing objects and their subobjects outlined later in the text, such as
those in 6.5.6 for additive operators.

> As a general comment I find it disturbing that the user
> is required to write (char *)&s2 + offsetof(S, a) instead
> of plain &s2.a (whatever offset a actually has, thus considering
> my second example as well) when calling str* or mem* routines.

It's a simple rule: to access a part/member of an object via a pointer, the
pointer must be derived from the address of the object (and not from one of its
subobjects).  This applies recursively to subobjects.
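
To make the rule concrete, here is a minimal sketch.  The struct layout, the
member names, and the copy_from_a() helper are made up for illustration and do
not come from the bug report.  The pointer handed to memcpy() is derived from
the address of the whole object, so under the reading described above it may
range over every byte of *p, including the bytes that follow member a.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration only.  */
struct S {
  char a[4];
  char b[4];
};

/* Copy N bytes starting at member A into DST.  The source pointer is
   derived from the enclosing object P, not from P->a, so it is not
   limited to the four bytes of A.  */
void copy_from_a (char *dst, const struct S *p, size_t n)
{
  /* The copy must still stay within the enclosing object.  */
  assert (n <= sizeof *p - offsetof (struct S, a));
  memcpy (dst, (const char *) p + offsetof (struct S, a), n);
}
```

By contrast, a pointer obtained from p->a would, under this reading, be limited
to the four bytes of a, and using it to reach into b is the case the optimizer
assumes cannot happen.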

> That said, as of QOI I seriously doubt taking advantage of
> the above for optimization purposes is a good idea given
> there isn't even a flag like -fwrapv or -fno-strict-aliasing
> to disable it!

Many micro-optimizations in GCC (such as those in gimple-fold) rely on the
absence of undefined behavior in a program and cannot be disabled by an option.
I have put a lot of effort into detecting and diagnosing undefined behavior
where possible, but more often than not optimization is given preference over
detecting undefined behavior, even if it means generating invalid/undefined
code.  For example, even at -O0 GCC emits code that writes past the end of the
array below even though the buffer overflow is trivially detectable.  There is
no option to avoid it.  It wasn't until GCC 8 and -Wrestrict that this was
diagnosed (only with optimization), and more as a happy accident than as a
result of a conscious effort.  If GCC avoided folding the invalid memcpy() call
into a MEM_REF, the overflow would be detected and prevented by
_FORTIFY_SOURCE.

  char a[1];

  void f (const void *s)
  {
    __builtin_memcpy (a, s, 4);
  }

A different example that, as a result of folding, isn't diagnosed even today is
the one below.  GCC issues no warning here because the call to strcpy() is
folded into memcpy(), which is allowed to cross subobject boundaries.  If the
folder instead took member sizes into consideration and avoided folding
strcpy() into memcpy(), the bug would be detected.  Despite that, there is no
option to disable this obviously questionable choice.

  struct S {
    char d[4];
    void (*f)();
  } s;

  void f (void)
  {
    __builtin_strcpy (s.d, "1234567");
  }
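
For comparison, here is a rough sketch of what taking the member size into
consideration could look like when done by hand in the program itself.  The
set_d() helper and its return convention are hypothetical; this is not what
GCC or _FORTIFY_SOURCE does internally, only an illustration of a copy that is
bounded by the size of the destination member rather than by the size of the
enclosing struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct S {
  char d[4];
  void (*f) (void);
};

/* Hypothetical helper: copy SRC, including its terminating NUL, into
   P->d only if it fits in that member.  Return 0 on success and -1 if
   SRC is too long, in which case *P is left unchanged.  */
int set_d (struct S *p, const char *src)
{
  size_t n = strlen (src) + 1;

  /* Bound the copy by the member, not by the whole struct.  */
  if (n > sizeof p->d)
    return -1;

  memcpy (p->d, src, n);
  return 0;
}
```

With this check the eight bytes of "1234567" are rejected instead of spilling
over into the function pointer that follows d.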

These examples aren't meant as an argument against changing things to detect
(and prevent) these kinds of bugs.  The point is to illustrate that the
strlen() case we have been discussing is by no means unique.  The big
difference between these examples and the strlen() case is that the former are
trivial to do something about, while the latter will be much harder.
