On Thu, Dec 12, 2024 at 1:28 PM Martin Uecker <uec...@tugraz.at> wrote:
>
> Am Montag, dem 09.12.2024 um 16:20 +0000 schrieb Qing Zhao:
> >
> > > On Dec 7, 2024, at 03:57, Martin Uecker <uec...@tugraz.at> wrote:
> > >
> > > Am Freitag, dem 06.12.2024 um 16:13 +0000 schrieb Qing Zhao:
> > > >
> > > > > On Dec 6, 2024, at 10:56, Martin Uecker <uec...@tugraz.at> wrote:
> > > > >
> > > > > Am Freitag, dem 06.12.2024 um 14:16 +0000 schrieb Qing Zhao:
> > > > > >
> > >
> > > ...
> > >
> > > > > > >
> > > > > > > I think the relevant scenario is where you assign the struct and
> > > > > > > not a pointer to the struct, i.e. something like the following:
> > > > > > >
> > > > > > > #include "builtin-object-size-common.h"
> > > > > > >
> > > > > > > struct annotated {
> > > > > > > int b;
> > > > > > > int *c __attribute__ ((counted_by (b)));
> > > > > > > };
> > > > > > >
> > > > > > > struct annotated __attribute__((__noinline__)) setup (int 
> > > > > > > attr_count)
> > > > > > > {
> > > > > > > struct annotated p_array_annotated;
> > > > > > > p_array_annotated.c = (int *) malloc (sizeof (int) * attr_count);
> > > > > > > p_array_annotated.b = attr_count;
> > > > > > >
> > > > > > > return p_array_annotated;
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > int main(int argc, char *argv[])
> > > > > > > {
> > > > > > > struct annotated x = setup (10);
> > > > > > > int *p = x.c;
> > > > > > > x = setup (20);
> > > > > > > EXPECT(__builtin_dynamic_object_size (p, 1), 10 * sizeof (int));
> > > > > > > EXPECT(__builtin_dynamic_object_size (x.c, 1), 20 * sizeof (int));
> > > > > > > DONE ();
> > > > > > > }
> > > > > > >
> > > > > >
> > > > > > With the above testing case, my current implementation based on 
> > > > > > .ACCESS_WITH_SIZE succeed without any issue.  -:)
> > > > > > The design of .ACCESS_WITH_SIZE already resolved this issue.
> > > > >
> > > > > Ok, thanks!  But I am a bit confused, because it seems it behaves
> > > > > this way also for FAMs
> > > > >
> > > > > https://godbolt.org/z/64a6z4cna
> > > >
> > > > The behavior of the above testing case is exactly the additional 
> > > > feature we provided for counted_by attribute for FAM:
> > > >
> > > > =====
> > > >     One important feature of the attribute is, a reference to the
> > > >     flexible array member field uses the latest value assigned to the
> > > >     field that represents the number of the elements before that
> > > >     reference.  For example,
> > > >
> > > >            p->count = val1;
> > > >            p->array[20] = 0;  // ref1 to p->array
> > > >            p->count = val2;
> > > >            p->array[30] = 0;  // ref2 to p->array
> > > >
> > > >     in the above, 'ref1' uses 'val1' as the number of the elements in
> > > >     'p->array', and 'ref2' uses 'val2' as the number of elements in
> > > >     'p->array’.
> > > > =====
> > > >
> > > > So, it’s the correct behavior for the counted_by attribute for FAM 
> > > > based on our previous discussion and agreement.
> > >
> > > If it is indeed that the value of p->count last stored before p->array is
> > > *referenced* which counts, then everything is well.
> >
> > Yes, For FAM, every “reference” to p->array will be converted as a call to 
> > (*.ACCESS_WITH_SIZE (p->array, &p->count, …))
>
> Can you remind why we have to pass the address of p->count, i.e. &p->count
> instead of its value?
>
So that if we change the value of p->count it will be reflected in
future checks.

p->count = n;
p->array[3] = x;
// ...
p->count = m;
p->array[3] = y;

We would want the last "p->array[3]" to be checked against the new
value of p->count rather than the original value.

> >
> > The count value for p->array is  *(&p->count), which is guaranteed to be 
> > the last stored value of the address of p->count before the current 
> > reference to p->array.
> >
> > Similarly, for the pointer array,  every “reference” to p->pa will be 
> > converted as a call to .ACCESS_WITH_SIZE(p->pa, &p->count…). The count 
> > value of the pointer array p->pa is *(&p->count), which is also guaranteed 
> > to be the last stored value of the address of p->count before the current 
> > reference to p->pa.
> >
> > > Somehow I thought for FAMs it is the value p->count last stored before
> > > p->array is *accessed* (possibly indirectly via another pointer).  
> > > Probably
> > > it was just me being confused.
> > >
> > > >
> > > > However, as you pointed out, when the “counted_by” attribute is 
> > > > extended to  the pointer field, this feature will be problematic.
> > > > And we need to add the following additional new requirement for the 
> > > > “counted_by” attribute of pointer field:
> > > >
> > > > p->count and  p->array  can only be changed by changing the whole 
> > > > structure at the same time.
> > >
> > > Actually, I am then not even sure we need this requirement. My point was 
> > > only that
> > > setting the whole structure at the time should work correctly, i.e. 
> > > without changing
> > > the bounds for old pointers which were stored in the struct previously.  
> > > With the
> > > semantics  above it seems this case also works automatically.
> >
> > For pointer field with counted_by attribute, if the p->count and p->pa are 
> > not set together when changing the whole structure, then for example:
> >
> > struct annotated {
> >   int b;
> >   int *c __attribute__ ((counted_by (b)));
> > };
> >
> > /* note, this routine only allocate the space for the pointer array field, 
> > but does NOT set the counted_by field.  */
> > struct annotated __attribute__((__noinline__)) setup (int attr_count)
> > {
> >   struct annotated p_array_annotated;
> >   p_array_annotated.c = (int *) malloc (sizeof (int) * attr_count);
> >
> >   return p_array_annotated;
> > }
> >
> > int main(int argc, char *argv[])
> > {
> >   struct annotated x = setup (10);
> >   x.b = 10;  /* the counted_by is set here.  */
> >   int *p = x.c;       /* x.c’s count should be 10, which is correct.  */
> >   x = setup (20);
> >   __builtin_dynamic_object_size (x.c, 1); /* x.c’s count now should be 20, 
> > but it’s not set yet,
> > the object size is unknown at this moment.  Shall we ask the user to always 
> > set the
> > counted_by value when the corresponding pointer is changed?  */
>
> I think the requirement should simply be that the counted_by value
> is set before the pointer is assigned.
>
> We could then also check that the pointer which is assigned does
> not point into something smaller at the time where it is assigned.
>
This may be a bit off topic for this email, but I've been thinking
about performing bounds checking on an array that was assigned to a
local variable. Bounds checking via __counted_by requires the
__counted_by attribute be available, however it's *not* available if
we assign the pointer to a local variable. In Qing's example:

  struct annotated {
    int b;
    int *c __attribute__((counted_by(b)));
  };

  struct annotated __attribute__((__noinline__)) setup(int attr_count) {
    struct annotated *p;

    p = (struct annotated *)malloc(sizeof(struct annotated));
    p->b = attr_count;
    p->c = (int *)malloc(sizeof(int) * attr_count);

    return p;
  }

  int main(int argc, char **argv) {
    struct annotated *x = setup(10);
    int *p;

    p = x->c; // Loses the __counted_by attribute.
    x = setup(20);
    p[15] = argc;  // Can't check.

    return 0;
  }

The problem is the same as if the pointer were passed into a function.
(I'm not sure about an inlined function...maybe in GCC? Clang does
inling very differently.) There could also be complex code between
assigning to the local variable and where a check occurs that could
cause issues (though GCC's .ACCESS_WITH_SIZE could be robust enough to
deal with that). Say for instance that somewhere in the code path the
local is stored:

  int **store;

  store[37] = p;

  // code where 'store' is used a lot.

  p = store[37];

  printf("value %d\n", p[20]);  // Is it possible to add a check here?

All of this to say, it might be a good first step to simply ensure
that we will perform the sanitizer steps only when accessed via a
pointer to the struct. It might even avoid confusion by the programmer
if checks are performed in some cases (access through the struct
pointer, inlined functions, etc.) but not in others (passed into
non-inlined functions, complex code paths where it's hard to determine
the original source of the pointer).

-bw

Reply via email to