On 08/01/2018 01:19 AM, Richard Biener wrote:
> On Tue, 31 Jul 2018, Martin Sebor wrote:
>
>> On 07/31/2018 09:48 AM, Jakub Jelinek wrote:
>>> On Tue, Jul 31, 2018 at 09:17:52AM -0600, Martin Sebor wrote:
>>>> On 07/31/2018 12:38 AM, Jakub Jelinek wrote:
>>>>> On Mon, Jul 30, 2018 at 09:45:49PM -0600, Martin Sebor wrote:
>>>>>> Even without _FORTIFY_SOURCE GCC diagnoses (some) writes past
>>>>>> the end of subobjects by string functions. With _FORTIFY_SOURCE=2
>>>>>> it calls abort. This is the default on popular distributions,
>>>>>
>>>>> Note that _FORTIFY_SOURCE=2 is the mode that goes beyond what the
>>>>> standard requires, imposes extra requirements. So from what this
>>>>> mode accepts or rejects we shouldn't determine what is or isn't
>>>>> considered valid.
>>>>
>>>> I'm not sure what the additional requirements are but the ones
>>>> I am referring to are the enforcing of struct member boundaries.
>>>> This is in line with the standard requirements of not accessing
>>>> [sub]objects via pointers derived from other [sub]objects.
>>>
>>> In the middle-end the distinction between what was originally a reference
>>> to subobjects and what was a reference to objects is quickly lost
>>> (whether through SCCVN or other optimizations).
>>> We've run into this many times with the __builtin_object_size already.
>>> So, if e.g.
>>> struct S { char a[3]; char b[5]; } s = { "abc", "defg" };
>>> ...
>>> strlen ((char *) &s) is well defined but
>>> strlen (s.a) is not in C, for the middle-end you might not figure out which
>>> one is which.
>>
>> Yes, I'm aware of the middle-end transformation to MEM_REF
>> -- it's one of the reasons why detecting invalid accesses
>> by the middle end warnings, including -Warray-bounds,
>> -Wformat-overflow, -Wsprintf-overflow, and even -Wrestrict,
>> is less than perfect.
>>
>> But is strlen(s.a) also meant to be well-defined in the middle
>> end (with the semantics of computing the length or "abcdefg"?)
>
> Yes.
>
>> And if so, what makes it well defined?
>
> The fact that strlen takes a char * argument and thus inline-expansion
> of a trivial implementation like
[ ... ]

And ISTM again the key here is the type of the object that actually gets
passed to strlen at the gimple level. If it's a char *, then the type
does not constrain the return value in any way shape or form.
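
To make the quoted example concrete, here is a minimal C sketch of the
whole-object case, i.e. the one described above as well defined. The
names trivial_strlen and whole_object_length are hypothetical and only
illustrate the point; this is not the elided implementation from the
quoted message, and it assumes the two members are laid out without
padding (checked by the static_assert).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* The struct from the quoted example.  In C (not C++), "abc" exactly
   fills a[3], so a holds no terminating NUL; the first NUL is in b. */
struct S { char a[3]; char b[5]; };

/* Assumption made explicit: a and b are contiguous, no padding. */
static_assert(sizeof(struct S) == 8, "expected a and b to be contiguous");

static struct S s = { "abc", "defg" };

/* Hypothetical trivial strlen.  The parameter is a plain char pointer,
   so nothing in its type records which (sub)object it was derived from. */
size_t trivial_strlen(const char *p)
{
    size_t n = 0;
    while (p[n] != '\0')
        n++;
    return n;
}

/* Whole-object access: the pointer is derived from &s, so the walk
   covers a and then b and stops at the NUL stored in b. */
size_t whole_object_length(void)
{
    return trivial_strlen((const char *)&s);
}
```

The sketch deliberately leaves out the strlen (s.a) call, since that is
the form the thread describes as not defined in C. Once both forms have
been lowered to a char * access, the callee sees the same kind of
argument in either case.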
Jeff