> On Dec 12, 2024, at 17:36, Martin Uecker <uec...@tugraz.at> wrote:
> 
> Am Donnerstag, dem 12.12.2024 um 13:59 -0800 schrieb Bill Wendling:
>>>>>> 
>>>>>> 
>>>>>> So, it’s the correct behavior for the counted_by attribute for FAM based 
>>>>>> on our previous discussion and agreement.
>>>>> 
>>>>> If it is indeed that the value of p->count last stored before p->array is
>>>>> *referenced* which counts, then everything is well.
>>>> 
>>>> Yes, For FAM, every “reference” to p->array will be converted as a call to 
>>>> (*.ACCESS_WITH_SIZE (p->array, &p->count, …))
>>> 
>>> Can you remind why we have to pass the address of p->count, i.e. &p->count
>>> instead of its value?
>>> 
>> So that if we change the value of p->count it will be reflected in
>> future checks.
>> 
>> p->count = n;
>> p->array[3] = x;
>> // ...
>> p->count = m;
>> p->array[3] = y;
>> 
>> We would want the last "p->array[3]" to be checked against the new
>> value of p->count rather than the original value.
> 
> But wouldn't at the second access to p->array not a new call to
> .ACCESS_WITH_SIZE be inserted?
Yes, a call to .ACCESS_WITH_SIZE is inserted for every reference to the pointer 
array. 
For the following testing case:

struct annotated {
  int b;
  int c[] __attribute__ ((counted_by (b)));
} *p_array_annotated;

int main(int argc, char *argv[])
{
  p_array_annotated
    = (struct annotated *)malloc (sizeof (struct annotated) + (10 * sizeof 
(int)));
  p_array_annotated->b = 10;

  p_array_annotated->c[9] = 2;
  p_array_annotated->b = 20;
  p_array_annotated->c[15] = 2;
  return 0;
}


The IR for “main” is:

{
  p_array_annotated = (struct annotated *) malloc (44);
  p_array_annotated->b = 10;
  (*.ACCESS_WITH_SIZE ((int *) &p_array_annotated->c, &p_array_annotated->b, 1, 
0, -1, 0B))[9] = 2;
  p_array_annotated->b = 20;
  (*.ACCESS_WITH_SIZE ((int *) &p_array_annotated->c, &p_array_annotated->b, 1, 
0, -1, 0B))[15] = 2;
  return 0;
}

From the above IR, we can see that the &p_array_annotated->b (the address of 
the counted-by object corresponding to the array object p_array_annotated->c) 
is passed to every call to .ACCESS_WITH_SIZE.   Doing this allows the value of 
p_array_annotated->b being updated between two references to the same pointer 
array. ( if I remember correctly, it’s mainly for the purpose to support the 
feature for FAM:

     One important feature of the attribute is, a reference to the
     flexible array member field uses the latest value assigned to the
     field that represents the number of the elements before that
     reference.  For example,

            p->count = val1;
            p->array[20] = 0;  // ref1 to p->array
            p->count = val2;
            p->array[30] = 0;  // ref2 to p->array

     in the above, 'ref1' uses 'val1' as the number of the elements in
     'p->array', and 'ref2' uses 'val2' as the number of elements in
     'p->array’.
)


Qing

> 
>> 
>>>> 
>>>> The count value for p->array is  *(&p->count), which is guaranteed to be 
>>>> the last stored value of the address of p->count before the current 
>>>> reference to p->array.
>>>> 
>>>> Similarly, for the pointer array,  every “reference” to p->pa will be 
>>>> converted as a call to .ACCESS_WITH_SIZE(p->pa, &p->count…). The count 
>>>> value of the pointer array p->pa is *(&p->count), which is also guaranteed 
>>>> to be the last stored value of the address of p->count before the current 
>>>> reference to p->pa.
>>>> 
>>>>> Somehow I thought for FAMs it is the value p->count last stored before
>>>>> p->array is *accessed* (possibly indirectly via another pointer).  
>>>>> Probably
>>>>> it was just me being confused.
>>>>> 
>>>>>> 
>>>>>> However, as you pointed out, when the “counted_by” attribute is extended 
>>>>>> to  the pointer field, this feature will be problematic.
>>>>>> And we need to add the following additional new requirement for the 
>>>>>> “counted_by” attribute of pointer field:
>>>>>> 
>>>>>> p->count and  p->array  can only be changed by changing the whole 
>>>>>> structure at the same time.
>>>>> 
>>>>> Actually, I am then not even sure we need this requirement. My point was 
>>>>> only that
>>>>> setting the whole structure at the time should work correctly, i.e. 
>>>>> without changing
>>>>> the bounds for old pointers which were stored in the struct previously.  
>>>>> With the
>>>>> semantics  above it seems this case also works automatically.
>>>> 
>>>> For pointer field with counted_by attribute, if the p->count and p->pa are 
>>>> not set together when changing the whole structure, then for example:
>>>> 
>>>> struct annotated {
>>>>  int b;
>>>>  int *c __attribute__ ((counted_by (b)));
>>>> };
>>>> 
>>>> /* note, this routine only allocate the space for the pointer array field, 
>>>> but does NOT set the counted_by field.  */
>>>> struct annotated __attribute__((__noinline__)) setup (int attr_count)
>>>> {
>>>>  struct annotated p_array_annotated;
>>>>  p_array_annotated.c = (int *) malloc (sizeof (int) * attr_count);
>>>> 
>>>>  return p_array_annotated;
>>>> }
>>>> 
>>>> int main(int argc, char *argv[])
>>>> {
>>>>  struct annotated x = setup (10);
>>>>  x.b = 10;  /* the counted_by is set here.  */
>>>>  int *p = x.c;       /* x.c’s count should be 10, which is correct.  */
>>>>  x = setup (20);
>>>>  __builtin_dynamic_object_size (x.c, 1); /* x.c’s count now should be 20, 
>>>> but it’s not set yet,
>>>> the object size is unknown at this moment.  Shall we ask the user to 
>>>> always set the
>>>> counted_by value when the corresponding pointer is changed?  */
>>> 
>>> I think the requirement should simply be that the counted_by value
>>> is set before the pointer is assigned.
>>> 
>>> We could then also check that the pointer which is assigned does
>>> not point into something smaller at the time where it is assigned.
>>> 
>> This may be a bit off topic for this email, but I've been thinking
>> about performing bounds checking on an array that was assigned to a
>> local variable. Bounds checking via __counted_by requires the
>> __counted_by attribute be available, however it's *not* available if
>> we assign the pointer to a local variable. 
> 
> It is available:
> 
> https://godbolt.org/z/KbqMPYfK3
>> 
>> In Qing's example:
>> 
>>  struct annotated {
>>    int b;
>>    int *c __attribute__((counted_by(b)));
>>  };
>> 
>>  struct annotated __attribute__((__noinline__)) setup(int attr_count) {
>>    struct annotated *p;
>> 
>>    p = (struct annotated *)malloc(sizeof(struct annotated));
>>    p->b = attr_count;
>>    p->c = (int *)malloc(sizeof(int) * attr_count);
>> 
>>    return p;
>>  }
>> 
>>  int main(int argc, char **argv) {
>>    struct annotated *x = setup(10);
>>    int *p;
>> 
>>    p = x->c; // Loses the __counted_by attribute.
>>    x = setup(20);
>>    p[15] = argc;  // Can't check.
>> 
>>    return 0;
>>  }
>> 
>> The problem is the same as if the pointer were passed into a function.
>> (I'm not sure about an inlined function...maybe in GCC? Clang does
>> inling very differently.) There could also be complex code between
>> assigning to the local variable and where a check occurs that could
>> cause issues (though GCC's .ACCESS_WITH_SIZE could be robust enough to
>> deal with that). Say for instance that somewhere in the code path the
>> local is stored:
>> 
>>  int **store;
>> 
>>  store[37] = p;
>> 
>>  // code where 'store' is used a lot.
>> 
>>  p = store[37];
>> 
>>  printf("value %d\n", p[20]);  // Is it possible to add a check here?
>> 
>> All of this to say, it might be a good first step to simply ensure
>> that we will perform the sanitizer steps only when accessed via a
>> pointer to the struct. It might even avoid confusion by the programmer
>> if checks are performed in some cases (access through the struct
>> pointer, inlined functions, etc.) but not in others (passed into
>> non-inlined functions, complex code paths where it's hard to determine
>> the original source of the pointer).
> 
> For me, the clearest way is to use C99's variably modified types
> which allows passing the bound into a function in another
> TU. I used similar features in my experimental container library:
> 
> 
> https://godbolt.org/z/o6j896Th5
> 
> But this does not help existing code.
> 
> Martin


Reply via email to