> On Jul 22, 2025, at 20:12, Siddhesh Poyarekar <siddh...@gotplt.org> wrote:
> 
> [Apologies if I've missed some context in my reading since I'm coming back to 
> this after a big break]
> 
> On 2025-07-22 12:33, Qing Zhao wrote:
>> Why it's wrong to pass the VALUE of the original pointer as the first 
>> argument to
>> the call to .ACCESS_WITH_SIZE
>> For a pointer field with counted_by attribute:
>> struct S {
>>   int n;
>>   int *p __attribute__((counted_by(n)));
>> } *f;
>> f->p
>> if we pass the VALUE of the original pointer f->p as the first argument,
>>    and also return the original pointer:
>>  .ACCESS_WITH_SIZE (f->p, &f->n,...)
>> the IL for the above is:
>>   tmp1 = f->p;
>>   tmp2 = &f->n;
>>   tmp3 = .ACCESS_WITH_SIZE (tmp1, tmp2, ...);
>> In the above, in order to generate a call to .ACCESS_WITH_SIZE for the
>> pointer reference f->p, we have to add the new GIMPLE tmp1 = f->p to pass
>> the value of the pointer f->p to the call to .ACCESS_WITH_SIZE.
>> This new gimple reads the VALUE of the pointer f->p. It will introduce
>> undefined behavior if it is inserted in the program BEFORE the pointer f->p
>> is initialized.
> 
> I can't see how this could happen, do you have an example test case?

The example used in my previous writeup shows this:
https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689663.html

f->p = malloc (size);  
With approach B, the IL for the above is:
 tmp1 = f->p;
 tmp2 = &f->n;
 tmp3 = .ACCESS_WITH_SIZE (tmp1, tmp2, ...);
 tmp4 = malloc (size);
 tmp3 = tmp4;

When .ACCESS_WITH_SIZE is expanded to its first argument, the above IL
becomes:

 tmp1 = f->p;
 tmp2 = &f->n;
 tmp3 = tmp1;
 tmp4 = malloc (size);
 tmp3 = tmp4;

The final optimized IL will be:
 tmp3 = f->p;
 tmp3 = malloc (size);

From the above final IL, you can clearly see the undefined behavior introduced
by tmp1 = f->p: it reads f->p before the store from malloc has initialized it.
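
To make the difference concrete outside the compiler, here is a small C sketch.
The helpers aws_by_value and aws_by_address are hypothetical names made up for
illustration; they only model the internal function as an identity on its first
argument and are not GCC's implementation. The counted_by attribute is left off
so the code builds with any C compiler. The sketch assumes the address-based
form takes &f->p and hands it back, so a store can go through it without ever
reading f->p.

```c
#include <assert.h>
#include <stdlib.h>

struct S {
  int n;
  int *p;  /* would carry counted_by (n); omitted here for portability */
};

/* Hypothetical model of approach B: takes the VALUE of the pointer.
   Evaluating the argument already reads f->p.  */
static int *
aws_by_value (int *p, const int *count)
{
  (void) count;
  return p;
}

/* Hypothetical model of the address-based form: takes &f->p and
   returns it unchanged; f->p itself is not read here.  */
static int **
aws_by_address (int **pp, const int *count)
{
  (void) count;
  return pp;
}

/* f->p = malloc (...) written through the address-based model:
   f->p is stored to, never loaded, so it may be uninitialized on entry.  */
static void
store_by_address (struct S *f, int n)
{
  assert (f != NULL && n > 0);
  f->n = n;
  *aws_by_address (&f->p, &f->n) = malloc ((size_t) n * sizeof (int));
}

/* A read through the value-based model: only valid once f->p has
   been initialized, because the argument evaluation loads f->p.  */
static int *
load_by_value (struct S *f)
{
  assert (f != NULL);
  return aws_by_value (f->p, &f->n);
}
```

Calling store_by_address on a freshly malloc'ed struct S is fine, since nothing
loads f->p; routing the same store through aws_by_value would require a load of
f->p first, which is the extra read shown as tmp1 = f->p above.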

Hope this is clear. 

Qing

> 
>> Based on the above, my conclusions are:
>> 1. It's not safe in general to pass the VALUE of the pointer f->p to the
>>    call to .ACCESS_WITH_SIZE.
> 
> Given that .ACCESS_WITH_SIZE is always generated to replace a reference for 
> f->p, I don't see how it could generate any *new* potentially uninitialized 
> access.  In fact, I would argue that the indirection in the .ACCESS_WITH_SIZE 
> implementation was incorrect and was masked by the fact that &f->p and f->p 
> mean the same thing for arrays and this got exposed when the same concept was 
> extended to pointers.
> 
> Looking at a simple example:
> 
> ```
> typedef __SIZE_TYPE__ size_t;
> 
> struct A
> {
>  int n;
>  char c[] __attribute__ ((counted_by (n)));
> };
> 
> extern void * unknown_alloc (size_t);
> extern void do_something1 (const char *);
> extern void do_something2 ();
> 
> size_t
> foo (size_t sz)
> {
>  struct A *a = unknown_alloc (sz);
>  a->n = sz;
> 
>  do_something1 (a->c);
>  do_something2 ();
> 
>  return __builtin_dynamic_object_size (a->c, 0);
> }
> ```
> 
> with -fdump-tree-original (which is what the frontend would have produced), 
> we get:
> 
> ```
> ;; Function foo (null)
> ;; enabled by -tree-original
> 
> 
> {
>  struct A * a = (struct A *) unknown_alloc (sz);
> 
>    struct A * a = (struct A *) unknown_alloc (sz);
>  a->n = (int) sz;
>  do_something1 ((const char *) .ACCESS_WITH_SIZE ((char *) &a->c, &a->n, 1, 
> 0, -1, 0B));
>  do_something2 ();
>  return (size_t) __builtin_dynamic_object_size ((const void *) 
> .ACCESS_WITH_SIZE ((char *) &a->c, &a->n, 1, 0, -1, 0B), 0);
> }
> ```
> 
> We see that:
> 
> (1) the .ACCESS_WITH_SIZE is produced for each reference of a->c.  this 
> should not produce any new undefined behaviour; any new read of f->p will be 
> exactly at the point of the read that the programmer intended and even if 
> reordered, it won't be reordered above its initialization point.
> 
> (2) A pointer to an array is coerced to a char */void * and the 
> .ACCESS_WITH_SIZE simply passes it through.  This only happens to work for 
> arrays because a->c and &a->c mean the same thing, but it breaks for pointers 
> because they're different things now.
> 
> In the end, .ACCESS_WITH_SIZE is a nop, so it really shouldn't matter what is 
> passed as long as it is effective at associating the FAM (or pointer) with 
> its counted_by member, expressing their mutual data dependency.
> 
> Thanks,
> Sid
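
The point in (2) about a->c and &a->c can be checked with plain C. The sketch
below is only an illustration with hypothetical helper names; it compares the
address of a member with its value for a flexible array member and for a
pointer member, again without the counted_by attribute so it stays portable.

```c
#include <assert.h>
#include <stddef.h>

struct A {
  int n;
  char c[];   /* flexible array member */
};

struct S {
  int n;
  int *p;     /* pointer member */
};

/* For an array member, &a->c and the decayed a->c designate the same
   address, so passing either one gives the same pointer value.  */
static int
fam_addr_equals_value (struct A *a)
{
  assert (a != NULL);
  return (void *) &a->c == (void *) a->c;
}

/* For a pointer member, &f->p is where the pointer is stored and f->p
   is what it points to; these are different unless p points at itself.  */
static int
ptr_addr_equals_value (struct S *f)
{
  assert (f != NULL);
  return (void *) &f->p == (void *) f->p;
}
```

The first helper returns 1 for any valid struct A, while the second returns 0
whenever f->p points at some other object, which is why the two forms stop
being interchangeable once the concept is extended from arrays to pointers.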
