Hi,

I gave this more thought over the weekend. :-)
I summarized my thoughts in the short writeup below. Comments and
suggestions are welcome.

Thanks a lot!

Qing

===============================

Why it's wrong to pass the VALUE of the original pointer as the first
argument to the call to .ACCESS_WITH_SIZE

For a pointer field with the counted_by attribute:

struct S {
  int n;
  int *p __attribute__((counted_by(n)));
} *f;

f->p

if we pass the VALUE of the original pointer f->p as the first argument,
and also return the original pointer:

.ACCESS_WITH_SIZE (f->p, &f->n,...)

the IL for the above is:

tmp1 = f->p;
tmp2 = &f->n;
tmp3 = .ACCESS_WITH_SIZE (tmp1, tmp2, ...);

In the above, in order to generate a call to .ACCESS_WITH_SIZE for the
pointer reference f->p, we have to add the new GIMPLE statement
tmp1 = f->p to pass the value of the pointer f->p to the call to
.ACCESS_WITH_SIZE.

This new statement reads the VALUE of the pointer f->p. It introduces
undefined behavior if it is inserted in the program BEFORE the pointer
f->p is initialized.

So, in order NOT to introduce undefined behavior with such code
generation for .ACCESS_WITH_SIZE, we have to make sure that, at the point
where we generate the call to .ACCESS_WITH_SIZE, the pointer f->p has
already been initialized in the program.

Question 1: Can we guarantee this in the C FE?

I am not completely sure, but I think we cannot do it in the C FE. The
reason is that the analysis for -Wuninitialized is done in the middle end,
where data flow information is available. Even in the middle end, with
data flow information, the analysis is not 100% accurate, so how could we
guarantee this in the C FE?

Question 2: Can we move the code generation for the call to
.ACCESS_WITH_SIZE to the middle end?

No. The reason is that the bounds sanitizer needs information from
.ACCESS_WITH_SIZE, and the bounds sanitizer instrumentation is done in the
tree lowering pass in the C FE. As a result, the call to
.ACCESS_WITH_SIZE needs to be generated in the C FE before the tree
lowering pass.

Based on the above, my conclusions are:

1. It's not safe in general to pass the VALUE of the pointer f->p to the
   call to .ACCESS_WITH_SIZE.
2. We should use the other approach: pass the ADDRESS of the pointer f->p
   to the call to .ACCESS_WITH_SIZE for pointers with counted_by.

Let me know if I missed anything.

Thanks a lot.

Qing

> On Jul 17, 2025, at 14:58, Qing Zhao <qing.z...@oracle.com> wrote:
>
>> On Jul 17, 2025, at 11:40, Jakub Jelinek <ja...@redhat.com> wrote:
>>
>> On Thu, Jul 17, 2025 at 03:26:05PM +0000, Qing Zhao wrote:
>>> How about add a new flag to distinguish these two cases, and put it to the
>>> 3th argument:
>>>
>>> ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE,
>>> TYPE_OF_SIZE + ACCESS_MODE + IS_POINTER, TYPE_SIZE_UNIT
>>> for element)
>>> which returns the REF_TO_OBJ same as the 1st argument;
>>>
>>> 1st argument REF_TO_OBJ: The reference to the object when IS_POINTER is
>>> false;
>>> The address of the reference to the object when IS_POINTER is true;
>>> 2nd argument REF_TO_SIZE: The reference to the size of the object,
>>> 3rd argument TYPE_OF_SIZE + ACCESS_MODE + IS_POINTER An integer constant
>>> with a pointer
>>> TYPE.
>>> The pointee TYPE of the pointer TYPE is the TYPE of the object referenced
>>> by REF_TO_SIZE.
>>> The integer constant value represents the ACCESS_MODE + IS_POINTER:
>>> 00: none
>>> 01: read_only
>>> 10: write_only
>>> 11: read_write
>>> 100: IS_POINTER
>>
>> Sure, I was talking about it before, the value of the 3rd argument can be a
>> set of various bit flags.
>> I still don't understand what do you want to use that
>> read_only/write_only/read_write flags for,
>
> It will be used for implementing the attribute “access” with
> .ACCESS_WITH_SIZE: (a future work)
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-access-function-attribute
>
> access (access-mode, ref-index)
>
> The ACCESS_MODE flag in the 3rd argument is for carrying the above
> “access_mode” of the first argument
> of the “access” attribute.
>
>
>> if you mark loads from the
>> pointer, those will be always reads and whether something is load, store or
>> load/store of data pointed by that pointer is something normally visible in
>> the IL (plus you probably don't know it in the FE).
>
> Since I haven’t study this in details yet, not sure whether this is necessary
> or not to encode it into
> .ACCESS_WITH_SIZE.
>
> Maybe just this flag now? And add it later if we really need it?
>
>>
>> So say for
>> struct S { int s; int *p __attribute__((counted_by (s))); };
>>
>> int
>> foo (struct S *x, int y)
>> {
>>   return x->p[y];
>> }
>> I would have expected you emit something like
>> _1 = x->p;
>> _6 = &x->s;
>> _5 = .ACCESS_WITH_SIZE (_1, _6, IS_POINTER, 4);
>> _2 = (long unsigned int) y;
>> _3 = _2 * 4;
>> _4 = _5 + _3;
>> D.2965 = *_4;
>
> The above IL doesn’t require an additional IS_POINTER flag for the
> .ACCESS_WITH_SIZE since
> The first argument is still the pointer for the object, same as the FAM case.
>
> With IS_POINTER flag added, we can pass the ADDRESS of the pointer as the
> first argument to
> .ACCESS_WITH_SIZE, and also return the ADDRESS of the pointer for the pointer
> with counted_by. i.e:
>
> _1 = &x->p;
> _6 = &x->s;
> _5 = .ACCESS_WITH_SIZE (_1, _6, IS_POINTER, 4);
> _2 = (long unsigned int) y;
> _3 = _2 * 4;
> _4 = *_5 + _3;
> D.2965 = *_4;
>
>> and for
>> x->p = whatever;
>> no .ACCESS_WITH_SIZE.
>
> With the IS_POINTER flag, and pass and return the ADDRESS of the pointer to
> .ACCESS_WITH_SIZE,
> It will be no correctness issue when we generate .ACCESS_WITH_SIZE for the
> above case.
> The only issue is for such case, the call to .ACCESS_WITH_SIZE is useless.
>
> Is this reasonable?
>
> Qing
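
P.S. As a rough illustration only, the two calling conventions discussed
in this thread can be modeled with plain C stand-ins. This is a sketch
under stated assumptions, not GCC code: access_by_value, access_by_address,
set_buffer, load_by_value and load_by_address are hypothetical names made
up for this example, and the real .ACCESS_WITH_SIZE is an internal function
that takes more arguments. The assert calls merely stand in for a bounds
check. The sketch shows that the address form lets a store to f->p go
through the wrapper without ever reading f->p, while the value form has to
load f->p at the call site.

```c
#include <assert.h>

struct S {
  int n;
  int *p;   /* in the real case: __attribute__((counted_by(n))) */
};

/* Hypothetical stand-in for the "pass the VALUE" form.  The caller has
   to load f->p just to build the first argument.  */
static int *
access_by_value (int *obj, int *size_ref)
{
  (void) size_ref;
  return obj;
}

/* Hypothetical stand-in for the "pass the ADDRESS" form.  Only the
   address of f->p is computed; f->p itself is not read here.  */
static int **
access_by_address (int **obj_ref, int *size_ref)
{
  (void) size_ref;
  return obj_ref;
}

/* Store case (x->p = whatever): with the address form the wrapper call
   is harmless even though f->p is still uninitialized, because the
   pointer value is never loaded.  */
static void
set_buffer (struct S *f, int *buf, int n)
{
  f->n = n;
  *access_by_address (&f->p, &f->n) = buf;
}

/* Load case, address form: f->p is read only at the dereference of the
   returned address.  */
static int
load_by_address (struct S *f, int i)
{
  assert (i >= 0 && i < f->n);   /* stand-in for a bounds check */
  return (*access_by_address (&f->p, &f->n))[i];
}

/* Load case, value form: f->p is read when the argument is evaluated,
   so this is only valid once f->p has been initialized.  */
static int
load_by_value (struct S *f, int i)
{
  assert (i >= 0 && i < f->n);   /* stand-in for a bounds check */
  return access_by_value (f->p, &f->n)[i];
}
```

In this model a set_buffer written with access_by_value would have to
evaluate f->p before the assignment, which is exactly the read of an
uninitialized pointer that the writeup above is concerned about.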