Re: [PATCH] tree-optimization/120929: Limit MEM_REF handling to .ACCESS_WITH_SIZE

Qing Zhao Tue, 22 Jul 2025 09:34:21 -0700

Hi,

More thoughts on this over the weekend.  :-)

I summarized my thoughts in the short writeup below; please share any
comments or suggestions.

Thanks a lot!

Qing

===============================

Why it's wrong to pass the VALUE of the original pointer as the first argument
to the call to .ACCESS_WITH_SIZE

For a pointer field with counted_by attribute:

struct S {
  int n;
  int *p __attribute__((counted_by(n)));
} *f;

f->p

If we pass the VALUE of the original pointer f->p as the first argument, and
also return the original pointer:
  .ACCESS_WITH_SIZE (f->p, &f->n, ...)

The IL for the above is:
  tmp1 = f->p;
  tmp2 = &f->n;
  tmp3 = .ACCESS_WITH_SIZE (tmp1, tmp2, ...);

In the above, in order to generate a call to .ACCESS_WITH_SIZE for the pointer
reference f->p, we have to add the new GIMPLE statement tmp1 = f->p to pass the
value of the pointer f->p to the call. This new statement reads the VALUE of
the pointer f->p, so it introduces undefined behavior if it is inserted in the
program BEFORE the pointer f->p is initialized.

So, in order NOT to introduce undefined behavior with this code generation for
.ACCESS_WITH_SIZE, we have to make sure that the pointer f->p has already been
initialized at the point where we generate the call to .ACCESS_WITH_SIZE.
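
The difference between the two conventions can be modeled in plain C. This is
only a minimal sketch: aws_by_value, aws_by_address, make_s and read_elem are
hypothetical stand-ins invented for illustration, not GCC internals, and the
counted_by attribute is left as a comment so the code builds with any C
compiler. The by-value helper forces its caller to load f->p first, while the
by-address helper only needs &f->p, which is valid even before f->p holds a
value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct S {
  int n;
  int *p;  /* in the real case: __attribute__((counted_by(n))) */
};

/* Hypothetical model of the by-VALUE form: the caller has to load f->p
   (tmp1 = f->p) before the call, so f->p must already be initialized.  */
int *
aws_by_value (int *ptr, const int *size_ref)
{
  assert (size_ref != NULL);
  return ptr;
}

/* Hypothetical model of the by-ADDRESS form: only &f->p is computed;
   the pointer stored in the field is not read at the call.  */
int **
aws_by_address (int **ptr_addr, const int *size_ref)
{
  assert (ptr_addr != NULL && size_ref != NULL);
  return ptr_addr;
}

/* Allocate an S with n > 0 zeroed elements.  The first store to f->p goes
   through the by-address helper, so nothing reads the still-uninitialized
   field.  Wrapping this store in the by-value helper would require loading
   f->p before it has a value.  */
struct S *
make_s (int n)
{
  struct S *f;
  int **slot;

  if (n <= 0)
    return NULL;
  f = malloc (sizeof *f);
  if (f == NULL)
    return NULL;
  f->n = n;
  slot = aws_by_address (&f->p, &f->n);
  *slot = calloc ((size_t) n, sizeof (int));
  if (f->p == NULL)
    {
      free (f);
      return NULL;
    }
  return f;
}

/* Read element i.  Once f->p is initialized, both forms yield the same
   pointer.  */
int
read_elem (struct S *f, int i)
{
  int *via_value = aws_by_value (f->p, &f->n);
  int *via_address = *aws_by_address (&f->p, &f->n);

  assert (via_value == via_address);
  assert (i >= 0 && i < f->n);
  return via_address[i];
}
```

In this model the two forms only differ for references that occur before the
first store to f->p, which is exactly the case the FE cannot rule out.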

Question 1: Can we guarantee this in the C FE?

I am not entirely sure, but I don't think we can do it in the C FE.
The reason is that the analysis for -Wuninitialized is done in the middle end,
where data flow information is available. Even there, the analysis is not 100%
accurate, so how could we guarantee this in the C FE?

Question 2: Can we move the generation of the call to .ACCESS_WITH_SIZE into
the middle end?

No.
The reason is that the bounds sanitizer needs information from
.ACCESS_WITH_SIZE, and the bounds sanitizer instrumentation is done in the tree
lowering pass in the C FE. As a result, the call to .ACCESS_WITH_SIZE needs to
be generated in the C FE before the tree lowering pass.

Based on the above, my conclusions are:

1. It's not safe in general to pass the VALUE of the pointer f->p to the call to
   .ACCESS_WITH_SIZE.  
2. We should use the other approach: pass the ADDRESS of the pointer f->p to the
   call to .ACCESS_WITH_SIZE for pointers with counted_by.  

Let me know if I missed anything.

Thanks a lot.

Qing

> On Jul 17, 2025, at 14:58, Qing Zhao <qing.z...@oracle.com> wrote:
> 
>> On Jul 17, 2025, at 11:40, Jakub Jelinek <ja...@redhat.com> wrote:
>> 
>> On Thu, Jul 17, 2025 at 03:26:05PM +0000, Qing Zhao wrote:
>>> How about adding a new flag to distinguish these two cases and putting it in
>>> the 3rd argument:
>>> 
>>>  ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE,
>>>                    TYPE_OF_SIZE + ACCESS_MODE + IS_POINTER,
>>>                    TYPE_SIZE_UNIT for element)
>>>  which returns the REF_TO_OBJ same as the 1st argument;
>>> 
>>>  1st argument REF_TO_OBJ: The reference to the object when IS_POINTER is
>>>          false;
>>>          The address of the reference to the object when IS_POINTER is true;
>>>  2nd argument REF_TO_SIZE: The reference to the size of the object,
>>>  3rd argument TYPE_OF_SIZE + ACCESS_MODE + IS_POINTER: An integer constant
>>>    with a pointer TYPE.
>>>    The pointee TYPE of the pointer TYPE is the TYPE of the object referenced
>>>       by REF_TO_SIZE.
>>>    The integer constant value represents the ACCESS_MODE + IS_POINTER:
>>>       00: none
>>>       01: read_only
>>>       10: write_only
>>>       11: read_write
>>>      100:  IS_POINTER
>> 
>> Sure, I was talking about it before, the value of the 3rd argument can be a
>> set of various bit flags.
>> I still don't understand what do you want to use that
>> read_only/write_only/read_write flags for,
> 
> It will be used for implementing the “access” attribute with
> .ACCESS_WITH_SIZE (future work):
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-access-function-attribute
> 
> access (access-mode, ref-index) 
> 
> The ACCESS_MODE flag in the 3rd argument is for carrying the “access-mode”
> above, i.e. the first argument of the “access” attribute.
> 
> 
>> if you mark loads from the
>> pointer, those will be always reads and whether something is load, store or
>> load/store of data pointed by that pointer is something normally visible in
>> the IL (plus you probably don't know it in the FE).
> 
> Since I haven’t studied this in detail yet, I am not sure whether it is
> necessary to encode it into .ACCESS_WITH_SIZE.
> 
> Maybe add just the IS_POINTER flag now, and add ACCESS_MODE later if we
> really need it?
> 
>> 
>> So say for
>> struct S { int s; int *p __attribute__((counted_by (s))); };
>> 
>> int
>> foo (struct S *x, int y)
>> {
>> return x->p[y];
>> }
>> I would have expected you emit something like
>> _1 = x->p;
>> _6 = &x->s;
>> _5 = .ACCESS_WITH_SIZE (_1, _6, IS_POINTER, 4);
>> _2 = (long unsigned int) y;
>> _3 = _2 * 4;
>> _4 = _5 + _3;
>> D.2965 = *_4;
> 
> The above IL doesn’t require an additional IS_POINTER flag for
> .ACCESS_WITH_SIZE, since the first argument is still the pointer to the
> object, same as in the FAM case.
> 
> With the IS_POINTER flag added, we can pass the ADDRESS of the pointer as the
> first argument to .ACCESS_WITH_SIZE, and also return the ADDRESS of the
> pointer, for a pointer with counted_by, i.e.:
> 
> _1 = &x->p;
> _6 = &x->s;
> _5 = .ACCESS_WITH_SIZE (_1, _6, IS_POINTER, 4);
> _2 = (long unsigned int) y;
> _3 = _2 * 4;
> _4 = *_5 + _3;
> D.2965 = *_4;
> 
>> and for
>> x->p = whatever;
>> no .ACCESS_WITH_SIZE.
> 
> With the IS_POINTER flag, and with the ADDRESS of the pointer passed to and
> returned from .ACCESS_WITH_SIZE, there is no correctness issue when we
> generate .ACCESS_WITH_SIZE for the above case.
> The only issue is that, for such a case, the call to .ACCESS_WITH_SIZE is
> useless.
> 
> Is this reasonable?
> 
> Qing
> 
> 
> 
>
