> On Dec 12, 2024, at 17:36, Martin Uecker <uec...@tugraz.at> wrote: > > Am Donnerstag, dem 12.12.2024 um 13:59 -0800 schrieb Bill Wendling: >>>>>> >>>>>> >>>>>> So, it’s the correct behavior for the counted_by attribute for FAM based >>>>>> on our previous discussion and agreement. >>>>> >>>>> If it is indeed that the value of p->count last stored before p->array is >>>>> *referenced* which counts, then everything is well. >>>> >>>> Yes, For FAM, every “reference” to p->array will be converted as a call to >>>> (*.ACCESS_WITH_SIZE (p->array, &p->count, …)) >>> >>> Can you remind why we have to pass the address of p->count, i.e. &p->count >>> instead of its value? >>> >> So that if we change the value of p->count it will be reflected in >> future checks. >> >> p->count = n; >> p->array[3] = x; >> // ... >> p->count = m; >> p->array[3] = y; >> >> We would want the last "p->array[3]" to be checked against the new >> value of p->count rather than the original value. > > But wouldn't at the second access to p->array not a new call to > .ACCESS_WITH_SIZE be inserted? Yes, a call to .ACCESS_WITH_SIZE is inserted for every reference to the pointer array. For the following testing case:
struct annotated { int b; int c[] __attribute__ ((counted_by (b))); } *p_array_annotated; int main(int argc, char *argv[]) { p_array_annotated = (struct annotated *)malloc (sizeof (struct annotated) + (10 * sizeof (int))); p_array_annotated->b = 10; p_array_annotated->c[9] = 2; p_array_annotated->b = 20; p_array_annotated->c[15] = 2; return 0; } The IR for “main” is: { p_array_annotated = (struct annotated *) malloc (44); p_array_annotated->b = 10; (*.ACCESS_WITH_SIZE ((int *) &p_array_annotated->c, &p_array_annotated->b, 1, 0, -1, 0B))[9] = 2; p_array_annotated->b = 20; (*.ACCESS_WITH_SIZE ((int *) &p_array_annotated->c, &p_array_annotated->b, 1, 0, -1, 0B))[15] = 2; return 0; } From the above IR, we can see that the &p_array_annotated->b (the address of the counted-by object corresponding to the array object p_array_annotated->c) is passed to every call to .ACCESS_WITH_SIZE. Doing this allows the value of p_array_annotated->b being updated between two references to the same pointer array. ( if I remember correctly, it’s mainly for the purpose to support the feature for FAM: One important feature of the attribute is, a reference to the flexible array member field uses the latest value assigned to the field that represents the number of the elements before that reference. For example, p->count = val1; p->array[20] = 0; // ref1 to p->array p->count = val2; p->array[30] = 0; // ref2 to p->array in the above, 'ref1' uses 'val1' as the number of the elements in 'p->array', and 'ref2' uses 'val2' as the number of elements in 'p->array’. ) Qing > >> >>>> >>>> The count value for p->array is *(&p->count), which is guaranteed to be >>>> the last stored value of the address of p->count before the current >>>> reference to p->array. >>>> >>>> Similarly, for the pointer array, every “reference” to p->pa will be >>>> converted as a call to .ACCESS_WITH_SIZE(p->pa, &p->count…). The count >>>> value of the pointer array p->pa is *(&p->count), which is also guaranteed >>>> to be the last stored value of the address of p->count before the current >>>> reference to p->pa. >>>> >>>>> Somehow I thought for FAMs it is the value p->count last stored before >>>>> p->array is *accessed* (possibly indirectly via another pointer). >>>>> Probably >>>>> it was just me being confused. >>>>> >>>>>> >>>>>> However, as you pointed out, when the “counted_by” attribute is extended >>>>>> to the pointer field, this feature will be problematic. >>>>>> And we need to add the following additional new requirement for the >>>>>> “counted_by” attribute of pointer field: >>>>>> >>>>>> p->count and p->array can only be changed by changing the whole >>>>>> structure at the same time. >>>>> >>>>> Actually, I am then not even sure we need this requirement. My point was >>>>> only that >>>>> setting the whole structure at the time should work correctly, i.e. >>>>> without changing >>>>> the bounds for old pointers which were stored in the struct previously. >>>>> With the >>>>> semantics above it seems this case also works automatically. >>>> >>>> For pointer field with counted_by attribute, if the p->count and p->pa are >>>> not set together when changing the whole structure, then for example: >>>> >>>> struct annotated { >>>> int b; >>>> int *c __attribute__ ((counted_by (b))); >>>> }; >>>> >>>> /* note, this routine only allocate the space for the pointer array field, >>>> but does NOT set the counted_by field. */ >>>> struct annotated __attribute__((__noinline__)) setup (int attr_count) >>>> { >>>> struct annotated p_array_annotated; >>>> p_array_annotated.c = (int *) malloc (sizeof (int) * attr_count); >>>> >>>> return p_array_annotated; >>>> } >>>> >>>> int main(int argc, char *argv[]) >>>> { >>>> struct annotated x = setup (10); >>>> x.b = 10; /* the counted_by is set here. */ >>>> int *p = x.c; /* x.c’s count should be 10, which is correct. */ >>>> x = setup (20); >>>> __builtin_dynamic_object_size (x.c, 1); /* x.c’s count now should be 20, >>>> but it’s not set yet, >>>> the object size is unknown at this moment. Shall we ask the user to >>>> always set the >>>> counted_by value when the corresponding pointer is changed? */ >>> >>> I think the requirement should simply be that the counted_by value >>> is set before the pointer is assigned. >>> >>> We could then also check that the pointer which is assigned does >>> not point into something smaller at the time where it is assigned. >>> >> This may be a bit off topic for this email, but I've been thinking >> about performing bounds checking on an array that was assigned to a >> local variable. Bounds checking via __counted_by requires the >> __counted_by attribute be available, however it's *not* available if >> we assign the pointer to a local variable. > > It is available: > > https://godbolt.org/z/KbqMPYfK3 >> >> In Qing's example: >> >> struct annotated { >> int b; >> int *c __attribute__((counted_by(b))); >> }; >> >> struct annotated __attribute__((__noinline__)) setup(int attr_count) { >> struct annotated *p; >> >> p = (struct annotated *)malloc(sizeof(struct annotated)); >> p->b = attr_count; >> p->c = (int *)malloc(sizeof(int) * attr_count); >> >> return p; >> } >> >> int main(int argc, char **argv) { >> struct annotated *x = setup(10); >> int *p; >> >> p = x->c; // Loses the __counted_by attribute. >> x = setup(20); >> p[15] = argc; // Can't check. >> >> return 0; >> } >> >> The problem is the same as if the pointer were passed into a function. >> (I'm not sure about an inlined function...maybe in GCC? Clang does >> inling very differently.) There could also be complex code between >> assigning to the local variable and where a check occurs that could >> cause issues (though GCC's .ACCESS_WITH_SIZE could be robust enough to >> deal with that). Say for instance that somewhere in the code path the >> local is stored: >> >> int **store; >> >> store[37] = p; >> >> // code where 'store' is used a lot. >> >> p = store[37]; >> >> printf("value %d\n", p[20]); // Is it possible to add a check here? >> >> All of this to say, it might be a good first step to simply ensure >> that we will perform the sanitizer steps only when accessed via a >> pointer to the struct. It might even avoid confusion by the programmer >> if checks are performed in some cases (access through the struct >> pointer, inlined functions, etc.) but not in others (passed into >> non-inlined functions, complex code paths where it's hard to determine >> the original source of the pointer). > > For me, the clearest way is to use C99's variably modified types > which allows passing the bound into a function in another > TU. I used similar features in my experimental container library: > > > https://godbolt.org/z/o6j896Th5 > > But this does not help existing code. > > Martin