> On Jul 22, 2025, at 20:12, Siddhesh Poyarekar <siddh...@gotplt.org> wrote:
>
> [Apologies if I've missed some context in my reading since I'm coming back to
> this after a big break]
>
> On 2025-07-22 12:33, Qing Zhao wrote:
>> Why it's wrong to pass the VALUE of the original pointer as the first
>> argument to the call to .ACCESS_WITH_SIZE
>> For a pointer field with counted_by attribute:
>> struct S {
>> int n;
>> int *p __attribute__((counted_by(n)));
>> } *f;
>> f->p
>> if we pass the VALUE of the original pointer f->p as the first argument,
>> and also return the original pointer:
>> .ACCESS_WITH_SIZE (f->p, &f->n,...)
>> the IL for the above is:
>> tmp1 = f->p;
>> tmp2 = &f->n;
>> tmp3 = .ACCESS_WITH_SIZE (tmp1, tmp2, ...);
>> In the above, in order to generate a call to .ACCESS_WITH_SIZE for the
>> pointer reference f->p, we have to add the new GIMPLE tmp1 = f->p to pass
>> the value of the pointer f->p to the call to .ACCESS_WITH_SIZE.
>> This new gimple reads the VALUE of the pointer f->p. It will introduce
>> undefined behavior if it is inserted in the program BEFORE the pointer f->p
>> is initialized.
>
> I can't see how this could happen, do you have an example test case?
The example used in my previous writeup shows this:
https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689663.html
f->p = malloc (size);
***** With the approach B: the IL for the above is:
tmp1 = f->p;
tmp2 = &f->n;
tmp3 = .ACCESS_WITH_SIZE (tmp1, tmp2, ...);
tmp4 = malloc (size);
tmp3 = tmp4;
The above IL will be expanded to the following when .ACCESS_WITH_SIZE is
expanded to its first argument:
tmp1 = f->p;
tmp2 = &f->n;
tmp3 = tmp1;
tmp4 = malloc (size);
tmp3 = tmp4;
the final optimized IL will be:
tmp3 = f->p;
tmp3 = malloc (size);
From the above final IL, you can clearly see the undefined behavior introduced
to the program by tmp1 = f->p.
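
To make the concern above concrete outside of the compiler, here is a minimal
plain C sketch. It is only a model, not GCC code: access_with_size_val,
access_with_size_ref, load_p and the reads_of_p counter are hypothetical
stand-ins that I made up for illustration, and the attribute itself is left
out so that the sketch builds with any C compiler. It counts how many loads of
f->p happen before the store in f->p = malloc (size) under each form.

```c
#include <stddef.h>
#include <stdlib.h>

struct S {
  int n;
  int *p;   /* would carry counted_by(n) in the real case */
};

/* Hypothetical counter: number of loads of f->p seen so far. */
static int reads_of_p;

/* Models the extra statement "tmp1 = f->p". */
static int *load_p (struct S *f)
{
  reads_of_p++;
  return f->p;
}

/* Hypothetical model of the value-passing form: returns its first argument. */
static int *access_with_size_val (int *p, int *n)
{
  (void) n;
  return p;
}

/* Hypothetical model of the address-passing form: returns its first argument. */
static int **access_with_size_ref (int **pp, int *n)
{
  (void) n;
  return pp;
}

/* f->p = malloc (size), value-passing form: f->p is loaded before the store.
   Returns the number of loads of f->p that were needed.  */
int store_via_value (struct S *f, size_t size)
{
  reads_of_p = 0;
  int *tmp1 = load_p (f);                            /* tmp1 = f->p; */
  int *tmp3 = access_with_size_val (tmp1, &f->n);
  (void) tmp3;                                       /* dead once expanded */
  f->p = malloc (size);
  return reads_of_p;
}

/* Same assignment, address-passing form: only &f->p is computed.  */
int store_via_address (struct S *f, size_t size)
{
  reads_of_p = 0;
  int **tmp3 = access_with_size_ref (&f->p, &f->n);
  *tmp3 = malloc (size);
  return reads_of_p;
}
```

In this model store_via_value performs one load of f->p before the pointer is
assigned and store_via_address performs none. A caller has to zero-initialize
the struct first so that the model itself stays well defined; in the real case
the pointer would be uninitialized at that point.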
Hope this is clear.
Qing
>
>> Based on the above, my conclusions are:
>> 1. It's not safe in general to pass the VALUE of the pointer f->p to the
>> call to .ACCESS_WITH_SIZE.
>
> Given that .ACCESS_WITH_SIZE is always generated to replace a reference for
> f->p, I don't see how it could generate any *new* potentially uninitialized
> access. In fact, I would argue that the indirection in the .ACCESS_WITH_SIZE
> implementation was incorrect and was masked by the fact that &f->p and f->p
> mean the same thing for arrays and this got exposed when the same concept was
> extended to pointers.
>
> Looking at a simple example:
>
> ```
> typedef __SIZE_TYPE__ size_t;
>
> struct A
> {
> int n;
> char c[] __attribute__ ((counted_by (n)));
> };
>
> extern void * unknown_alloc (size_t);
> extern void do_something1 (const char *);
> extern void do_something2 ();
>
> size_t
> foo (size_t sz)
> {
> struct A *a = unknown_alloc (sz);
> a->n = sz;
>
> do_something1 (a->c);
> do_something2 ();
>
> return __builtin_dynamic_object_size (a->c, 0);
> }
> ```
>
> with -fdump-tree-original (which is what the frontend would have produced),
> we get:
>
> ```
> ;; Function foo (null)
> ;; enabled by -tree-original
>
>
> {
> struct A * a = (struct A *) unknown_alloc (sz);
>
> struct A * a = (struct A *) unknown_alloc (sz);
> a->n = (int) sz;
> do_something1 ((const char *) .ACCESS_WITH_SIZE ((char *) &a->c, &a->n, 1,
> 0, -1, 0B));
> do_something2 ();
> return (size_t) __builtin_dynamic_object_size ((const void *)
> .ACCESS_WITH_SIZE ((char *) &a->c, &a->n, 1, 0, -1, 0B), 0);
> }
> ```
>
> We see that:
>
> (1) The .ACCESS_WITH_SIZE is produced for each reference to a->c. This
> should not produce any new undefined behaviour; any new read of f->p will be
> exactly at the point of the read that the programmer intended and even if
> reordered, it won't be reordered above its initialization point.
>
> (2) A pointer to an array is coerced to a char */void * and the
> .ACCESS_WITH_SIZE simply passes it through. This only happens to work for
> arrays because a->c and &a->c mean the same thing, but it breaks for pointers
> because they're different things now.
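
The array-versus-pointer difference in point (2) can be checked with a small
plain C sketch. The two helper names are hypothetical and the counted_by
attribute is omitted so it builds anywhere; the struct layouts follow struct A
and struct S from this thread.

```c
#include <stdlib.h>

struct A {
  int n;
  char c[];   /* flexible array member, as in the example above */
};

struct S {
  int n;
  int *p;     /* pointer member, as in struct S earlier in the thread */
};

/* For an array member, a->c decays to the address of its own storage,
   so it compares equal to &a->c.  Returns 1.  */
int fam_value_equals_address (struct A *a)
{
  return (void *) a->c == (void *) &a->c;
}

/* For a pointer member, f->p is the stored value while &f->p is the
   location of the field.  Returns 0 whenever p points outside the
   field itself.  */
int ptr_value_equals_address (struct S *f)
{
  return (void *) f->p == (void *) &f->p;
}
```

So passing (char *) &a->c through the call happens to yield the same address
as a->c, while for f->p the value and the address are different things and the
call has to say which one it carries.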
>
> In the end, .ACCESS_WITH_SIZE is a nop, so it really shouldn't matter what is
> passed as long as it is effective at associating the FAM (or pointer) with
> its counted_by member, expressing their mutual data dependency.
>
> Thanks,
> Sid