https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896

--- Comment #17 from Martin Uecker <muecker at gwdg dot de> ---
Am Freitag, dem 03.03.2023 um 23:18 +0000 schrieb isanbard at gmail dot com:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
> 
> --- Comment #16 from Bill Wendling <isanbard at gmail dot com> ---
> (In reply to Martin Uecker from comment #15)
> > Am Freitag, dem 03.03.2023 um 20:27 +0000 schrieb isanbard at gmail dot com:
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108896
> > > 
> > > --- Comment #14 from Bill Wendling <isanbard at gmail dot com> ---
> > > (In reply to Martin Uecker from comment #9)
> > > > > > Considering that the GNU extensions is rarely used, one could 
> > > > > > consider
> > > > > > redefining the meaning of
> > > > > > 
> > > > > > int n = 1;
> > > > > > struct {
> > > > > >   int n;
> > > > > >   char buf[n];
> > > > > > };
> > > > > > 
> > > > > > so that the 'n' refers to the member. Or we add a new syntax 
> > > > > > similar to
> > > > > > designators (which intuitively makes sense to me).
> > > > > designator might be better IMO.
> > > > > 
> > > > > a question here is:
> > > > > 
> > > > > for the following nested structure: 
> > > > > 
> > > > > struct object {
> > > > >         ...
> > > > >         char items;
> > > > >         ...
> > > > >         struct inner {
> > > > >                 ...
> > > > >                 int flex[];
> > > > >         };
> > > > > } *ptr;
> > > > > 
> > > > > what kind of syntax is good to represent the upper bound of "flex" in 
> > > > > the inner
> > > > > struct with "items" in the outer structure? any suggestion?
> > > > 
> > > > I would disallow it. At least at first. It also raises some
> > > > questions: For example, one could form a pointer to the inner
> > > > struct, and then it is not clear how 'items' could be accessed
> > > > anymore.
> > > > 
> > > 
> > > That would be limiting its use in the Linux kernel. It seems that there 
> > > are
> > > ways to refer to struct members already using something like "offsetof":
> > > 
> > > struct object {
> > >     ...
> > >     char items;
> > >     ...
> > >     struct inner {
> > >         ...
> > >         int flex[] __attribute__((__element_count__(offsetof(struct 
> > > object,
> > > items))));
> > >     };
> > > } *ptr;
> > 
> > This seems to be something completely different. offsetof
> > computes the offset from the type given in its argument.
> > But it would not access the value of the member of the
> > enclosing struct. But it would not work in your example,
> > because the struct object is incomplete at this point.
> > 
> > So no, you can not use offsetof to reference a member
> > of an enclosing struct.
> > 
> "offsetof(struct foo, count)" is a fancy wrapper for "((struct foo
> *)0)->count", which is resolved during sema, where it does have the full
> structure definition. For instance, this compiles in C++:
> 
> struct a {
>         int count;
>         int y = ((struct a *)0)->count;
> } x;
> 
> void foo(struct a *);

This is not the case in C: https://godbolt.org/z/P7M6cdnoa

But even we make it resolve the name, which we
have to do for all proposals, it is something  different.

If offsetof it would resolve the count of a different
struct of the same *type* (here a non-existent one at 
address zero). Here we need a self reference to the
same *object*.

...


> > But I am not saying we shouldn't have the attribute first.
> > 
> I personally prefer using an attribute than the suggestion to use some other
> syntax, like:
> 
> struct foo {
>     int fam[.count];
> };
> 
> It becomes less intuitive what's going on here. And might conflict with VLA's
> in structures.

The syntax with the dot would make it not conflict.  But I need
this for this use case

struct foo {
  int count;
  int (*buf)[.count];
};

so that ARRAY_SIZE(*foo->buf) works correctly and also accesses
to foo->buf are bounds checkked.  So it would make sense to 
solve to treat flexible array members in the same way.

But I agree that we should simply add the attribute now also
because it makes it possible to use it for existing code bases.

> > > It also has the benefit of
> > > allowing one to reference a variable not in the structure:
> > > 
> > > const int items;
> > > struct object {
> > >     ...
> > >     char items;
> > >     ...
> > >     struct inner {
> > >         ...
> > >         int flex[] __attribute__((__element_count__(items))); /* 
> > > References
> > > global "items" */
> > >     };
> > > } *ptr;
> > 
> > Whether you allow this or not has nothing to do with the syntax.
> > 
> > The question is what semantics you attach to this and this is
> > also a question in your example. 
> > 
> > If you define
> > 
> > struct inner* a = ...
> > 
> > What does it say for  a->flex  ?
> > 
> I need to point out that I used "offsetof" only as an example. It's possible 
> to
> create something more robust which will carry along type information, etc.,
> which the current offsetof macro throws away. I should have made that clear.
> 
> The sanitizers that see "a->flex" will try to find the correct variable. If
> they can't, then they won't generate a check. In the case of it referencing a
> non-field member, it'll use that if it's within scope. If it refers to a field
> member of a parent container that's not within scope, it'll also not generate 
> a
> check. It's unfortunate that these checks are done as a "best effort," but it
> could lead to software changes to improve security checks (like passing a
> parent structure into a function rather than an inner structure.

Yes. We could also have an optional warning which warns about accessing
'flex' in a context where 'items' is not accessible.  My point is that
this feature of potentially referring to stuff which may not be accessible
in all cases makes implementation more complicated.

Martin

>

Reply via email to