On Wed, 25 Jul 2018, Martin Sebor wrote:

> > BUT - for the string_constant and c_strlen functions we are,
> > in all cases we return something interesting, able to look
> > at an initializer which then determines that type. Hopefully.
> > I think the strlen() folding code when it sets SSA ranges
> > now looks at types ...?
> >
> > Consider
> >
> > struct X { int i; char c[4]; int j;};
> > struct Y { char c[16]; };
> >
> > void foo (struct X *p, struct Y *q)
> > {
> >   memcpy (p, q, sizeof (struct Y));
> >   if (strlen ((char *)(struct Y *)p + 4) < 7)
> >     abort ();
> > }
> >
> > here the GIMPLE IL looks like
> >
> >   const char * _1;
> >
> >   <bb 2> [local count: 1073741825]:
> >   _5 = MEM[(char * {ref-all})q_4(D)];
> >   MEM[(char * {ref-all})p_6(D)] = _5;
> >   _1 = p_6(D) + 4;
> >   _2 = __builtin_strlen (_1);
> >
> > and I guess Martin would argue that since p is of type struct X
> > + 4 gets you to c[4] and thus strlen of that cannot be larger
> > than 3. But of course the middle-end doesn't work like that
> > and luckily we do not try to draw such conclusions or we
> > are somehow lucky that for the testcase as written above we do not
> > (I'm not sure whether Martins changes in this area would derive
> > such conclusions in principle).
>
> Only if the strlen argument were p->c.
>
> > NOTE - we do not know the dynamic type here since we do not know
> > the dynamic type of the memory pointed-to by q! We can only
> > derive that at q+4 there must be some object that we can
> > validly call strlen on (where Martin again thinks strlen
> > imposes constrains that memchr does not - sth I do not agree
> > with from a QOI perspective)
>
> The dynamic type is a murky area.

It's well-specified in the middle-end. A store changes the
dynamic type of the stored-to object. If that type is
compatible with the surrounding object's dynamic type, that one
is not affected; if not, the surrounding object's dynamic
type becomes unspecified. There is TYPE_TYPELESS_STORAGE
to somewhat control "compatibility" of subobjects.

> As you said, above we don't
> know whether *p is an allocated object or not. Strictly speaking,
> we would need to treat it as such. It would basically mean
> throwing out all type information and treating objects simply
> as blobs of bytes. But that's not what GCC or other compilers do
> either.

It is what GCC does unless it sees a store to the memory. Basically
pointers carry no type information; only (visible!) stores (and
loads to some extent) provide information about dynamic types of
objects (allocated or declared - GCC doesn't make a difference there).

> For instance, in the modified foo below, GCC eliminates
> the test because it assumes that *p and *q don't overlap. It
> does that because they are members of structs of unrelated types
> access to which cannot alias. I.e., not just the type of
> the access matters (here int and char) but so does the type of
> the enclosing object. If it were otherwise and only the type
> of the access mattered then eliminating the test below wouldn't
> be valid (objects can have their stored value accessed by either
> an lvalue of a compatible type or char).
>
> void foo (struct X *p, struct Y *q)
> {
>   int j = p->j;
>   q->c[__builtin_offsetof (struct X, j)] = 0;
>   if (j != p->j)
>     __builtin_abort ();
> }

Here GCC sees both a load and a store from which it derives the
information. And yes, it looks at the full access structure, which
contains a dereference of p and of q. Because of that, and because
the store to q->c[] (which for GCC implies a store to *q!) changes
the dynamic type, it can conclude that the two accesses do not alias.
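
To make the "pointers carry no type information" point concrete, here
is a minimal self-contained sketch based on the first testcase. The
helper name strlen_at_offset_4 and the use of malloc'd storage are my
assumptions for illustration; they are not from the testcase. It assumes
q->c holds a NUL-terminated string with the terminator at index 4 or
later.

```c
#include <stdlib.h>
#include <string.h>

struct X { int i; char c[4]; int j; };
struct Y { char c[16]; };

/* Hypothetical helper (not from the thread): copy *q into raw
   allocated storage, view that storage through a struct X pointer,
   and measure the string that starts 4 bytes in.
   Returns (size_t) -1 if the allocation fails.  */
size_t
strlen_at_offset_4 (const struct Y *q)
{
  /* Storage large enough for either struct.  */
  size_t size = sizeof (struct X) > sizeof (struct Y)
		? sizeof (struct X) : sizeof (struct Y);
  struct X *p = malloc (size);
  if (!p)
    return (size_t) -1;

  /* The only store visible here is this byte-wise copy of a struct Y.  */
  memcpy (p, q, sizeof (struct Y));

  /* The static type of p says nothing about what the bytes at p + 4
     hold; the length comes from the copied contents.  */
  size_t len = strlen ((char *) p + 4);

  free (p);
  return len;
}
```

With q->c holding "abcdefghijklmno" this returns 11, i.e. more than
the 3 one would get by reasoning from sizeof p->c alone.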

> Clarifying (and adjusting if necessary) this area is among
> the goals of the C object model proposal and the ongoing study
> group. We have been talking about some of these cases there
> and trying to come up with ways to let code do what it needs
> to do without compromising existing language rules, which was
> the consensus position within WG14 when the study group was
> formed: i.e., to clarify or reaffirm existing rules and, in
> cases of ambiguity or where the standard is unintentionally
> overly permissive), favor tighter rules over looser ones.

There is also the C++ object model and the Ada object model and ...
GCC already has an object model in its middle-end and that is not
going to change. And obviously it was modeled after the requirements
from the languages the middle-end supports. The latest change
was made necessary by C++ (placement new and storage re-use
specifically).

Richard.