Am Donnerstag, dem 12.12.2024 um 13:59 -0800 schrieb Bill Wendling: > On Thu, Dec 12, 2024 at 1:28 PM Martin Uecker <uec...@tugraz.at> wrote: > > > > Am Montag, dem 09.12.2024 um 16:20 +0000 schrieb Qing Zhao: > > > > > > > On Dec 7, 2024, at 03:57, Martin Uecker <uec...@tugraz.at> wrote: > > > > > > > > Am Freitag, dem 06.12.2024 um 16:13 +0000 schrieb Qing Zhao: > > > > > > > > > > > On Dec 6, 2024, at 10:56, Martin Uecker <uec...@tugraz.at> wrote: > > > > > > > > > > > > Am Freitag, dem 06.12.2024 um 14:16 +0000 schrieb Qing Zhao: > > > > > > > > > > > > > > > ... > > > > > > > > > > > > > > > > > > > > I think the relevant scenario is where you assign the struct and > > > > > > > > not a pointer to the struct, i.e. something like the following: > > > > > > > > > > > > > > > > #include "builtin-object-size-common.h" > > > > > > > > > > > > > > > > struct annotated { > > > > > > > > int b; > > > > > > > > int *c __attribute__ ((counted_by (b))); > > > > > > > > }; > > > > > > > > > > > > > > > > struct annotated __attribute__((__noinline__)) setup (int > > > > > > > > attr_count) > > > > > > > > { > > > > > > > > struct annotated p_array_annotated; > > > > > > > > p_array_annotated.c = (int *) malloc (sizeof (int) * > > > > > > > > attr_count); > > > > > > > > p_array_annotated.b = attr_count; > > > > > > > > > > > > > > > > return p_array_annotated; > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > int main(int argc, char *argv[]) > > > > > > > > { > > > > > > > > struct annotated x = setup (10); > > > > > > > > int *p = x.c; > > > > > > > > x = setup (20); > > > > > > > > EXPECT(__builtin_dynamic_object_size (p, 1), 10 * sizeof (int)); > > > > > > > > EXPECT(__builtin_dynamic_object_size (x.c, 1), 20 * sizeof > > > > > > > > (int)); > > > > > > > > DONE (); > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > With the above testing case, my current implementation based on > > > > > > > .ACCESS_WITH_SIZE succeed without any issue. -:) > > > > > > > The design of .ACCESS_WITH_SIZE already resolved this issue. > > > > > > > > > > > > Ok, thanks! But I am a bit confused, because it seems it behaves > > > > > > this way also for FAMs > > > > > > > > > > > > https://godbolt.org/z/64a6z4cna > > > > > > > > > > The behavior of the above testing case is exactly the additional > > > > > feature we provided for counted_by attribute for FAM: > > > > > > > > > > ===== > > > > > One important feature of the attribute is, a reference to the > > > > > flexible array member field uses the latest value assigned to the > > > > > field that represents the number of the elements before that > > > > > reference. For example, > > > > > > > > > > p->count = val1; > > > > > p->array[20] = 0; // ref1 to p->array > > > > > p->count = val2; > > > > > p->array[30] = 0; // ref2 to p->array > > > > > > > > > > in the above, 'ref1' uses 'val1' as the number of the elements in > > > > > 'p->array', and 'ref2' uses 'val2' as the number of elements in > > > > > 'p->array’. > > > > > ===== > > > > > > > > > > So, it’s the correct behavior for the counted_by attribute for FAM > > > > > based on our previous discussion and agreement. > > > > > > > > If it is indeed that the value of p->count last stored before p->array > > > > is > > > > *referenced* which counts, then everything is well. > > > > > > Yes, For FAM, every “reference” to p->array will be converted as a call > > > to (*.ACCESS_WITH_SIZE (p->array, &p->count, …)) > > > > Can you remind why we have to pass the address of p->count, i.e. &p->count > > instead of its value? > > > So that if we change the value of p->count it will be reflected in > future checks. > > p->count = n; > p->array[3] = x; > // ... > p->count = m; > p->array[3] = y; > > We would want the last "p->array[3]" to be checked against the new > value of p->count rather than the original value.
But wouldn't at the second access to p->array not a new call to .ACCESS_WITH_SIZE be inserted? > > > > > > > The count value for p->array is *(&p->count), which is guaranteed to be > > > the last stored value of the address of p->count before the current > > > reference to p->array. > > > > > > Similarly, for the pointer array, every “reference” to p->pa will be > > > converted as a call to .ACCESS_WITH_SIZE(p->pa, &p->count…). The count > > > value of the pointer array p->pa is *(&p->count), which is also > > > guaranteed to be the last stored value of the address of p->count before > > > the current reference to p->pa. > > > > > > > Somehow I thought for FAMs it is the value p->count last stored before > > > > p->array is *accessed* (possibly indirectly via another pointer). > > > > Probably > > > > it was just me being confused. > > > > > > > > > > > > > > However, as you pointed out, when the “counted_by” attribute is > > > > > extended to the pointer field, this feature will be problematic. > > > > > And we need to add the following additional new requirement for the > > > > > “counted_by” attribute of pointer field: > > > > > > > > > > p->count and p->array can only be changed by changing the whole > > > > > structure at the same time. > > > > > > > > Actually, I am then not even sure we need this requirement. My point > > > > was only that > > > > setting the whole structure at the time should work correctly, i.e. > > > > without changing > > > > the bounds for old pointers which were stored in the struct previously. > > > > With the > > > > semantics above it seems this case also works automatically. > > > > > > For pointer field with counted_by attribute, if the p->count and p->pa > > > are not set together when changing the whole structure, then for example: > > > > > > struct annotated { > > > int b; > > > int *c __attribute__ ((counted_by (b))); > > > }; > > > > > > /* note, this routine only allocate the space for the pointer array > > > field, but does NOT set the counted_by field. */ > > > struct annotated __attribute__((__noinline__)) setup (int attr_count) > > > { > > > struct annotated p_array_annotated; > > > p_array_annotated.c = (int *) malloc (sizeof (int) * attr_count); > > > > > > return p_array_annotated; > > > } > > > > > > int main(int argc, char *argv[]) > > > { > > > struct annotated x = setup (10); > > > x.b = 10; /* the counted_by is set here. */ > > > int *p = x.c; /* x.c’s count should be 10, which is correct. */ > > > x = setup (20); > > > __builtin_dynamic_object_size (x.c, 1); /* x.c’s count now should be > > > 20, but it’s not set yet, > > > the object size is unknown at this moment. Shall we ask the user to > > > always set the > > > counted_by value when the corresponding pointer is changed? */ > > > > I think the requirement should simply be that the counted_by value > > is set before the pointer is assigned. > > > > We could then also check that the pointer which is assigned does > > not point into something smaller at the time where it is assigned. > > > This may be a bit off topic for this email, but I've been thinking > about performing bounds checking on an array that was assigned to a > local variable. Bounds checking via __counted_by requires the > __counted_by attribute be available, however it's *not* available if > we assign the pointer to a local variable. It is available: https://godbolt.org/z/KbqMPYfK3 > > In Qing's example: > > struct annotated { > int b; > int *c __attribute__((counted_by(b))); > }; > > struct annotated __attribute__((__noinline__)) setup(int attr_count) { > struct annotated *p; > > p = (struct annotated *)malloc(sizeof(struct annotated)); > p->b = attr_count; > p->c = (int *)malloc(sizeof(int) * attr_count); > > return p; > } > > int main(int argc, char **argv) { > struct annotated *x = setup(10); > int *p; > > p = x->c; // Loses the __counted_by attribute. > x = setup(20); > p[15] = argc; // Can't check. > > return 0; > } > > The problem is the same as if the pointer were passed into a function. > (I'm not sure about an inlined function...maybe in GCC? Clang does > inling very differently.) There could also be complex code between > assigning to the local variable and where a check occurs that could > cause issues (though GCC's .ACCESS_WITH_SIZE could be robust enough to > deal with that). Say for instance that somewhere in the code path the > local is stored: > > int **store; > > store[37] = p; > > // code where 'store' is used a lot. > > p = store[37]; > > printf("value %d\n", p[20]); // Is it possible to add a check here? > > All of this to say, it might be a good first step to simply ensure > that we will perform the sanitizer steps only when accessed via a > pointer to the struct. It might even avoid confusion by the programmer > if checks are performed in some cases (access through the struct > pointer, inlined functions, etc.) but not in others (passed into > non-inlined functions, complex code paths where it's hard to determine > the original source of the pointer). For me, the clearest way is to use C99's variably modified types which allows passing the bound into a function in another TU. I used similar features in my experimental container library: https://godbolt.org/z/o6j896Th5 But this does not help existing code. Martin