Jakub, Richard and Joseph,

Could you please help with the question below:
Is the C FE able to distinguish whether a reference “f->p” is a read or a
write? Please see the following example:

> On Jul 23, 2025, at 13:19, Siddhesh Poyarekar <siddh...@gotplt.org> wrote:
>
> On 2025-07-23 13:11, Qing Zhao wrote:
>>> On Jul 23, 2025, at 12:55, Siddhesh Poyarekar <siddh...@gotplt.org> wrote:
>>>
>>> On 2025-07-23 11:08, Qing Zhao wrote:
>>>> We always generate a call to .ACCESS_WITH_SIZE for every f->p whatever
>>>> it’s a reference
>>>> or a definition in C FE parser. (This is the case for FAM)
>>>
>>> Hmm, that's not correct then and I have been misreading the code all this
>>> while; .ACCESS_WITH_SIZE does not make sense for a definition because the
>>> size in question may or may not even be correct w.r.t. the definition at
>>> the point that it is emitted. I reckon this too does not come up in the
>>> FAM case because the FAM is not allocated separately from its containing
>>> object, whereas in the case of a pointer, it is allocated separately. So...
>>>
>>>> For pointer with counted by, Yes, if we can determine whether a f->p is
>>>> an object reference or
>>>> an object definition, and ONLY emitting .ACCESS_WITH_SIZE for the f->p
>>>> when it’s a object reference,
>>>> then the approach that passes the VALUE of f->p to .ACCESS_WITH_SIZE is
>>>> safe.
>>>> However, I have two questions for this:
>>>> Question 1: Can we make sure this in C FE? (Determine whether a f->p is
>>>> an object reference of an object definition)
>>>
>>> ... it probably makes sense to focus on resolving this question to make
>>> sure that .ACCESS_WITH_SIZE is generated for a reference only when the
>>> reference is a read and not when it's a write because that's the only place
>>> where the call actually has a correct meaning.
>> Yes, this is the most important question we need to answer first to move on
>> to the next step:
>> Whether it’s able to distinguish a reference “p->f” is a read or a write to
>> in C FE?
>> If Yes, how to do this?
>> For the following example:

struct S {
  int n;
  int *p __attribute__((counted_by(n)));
} *f;
int *g;

void setup (int **ptr, int count)
{
  *ptr = __builtin_malloc (sizeof (int) * count);
  g = *ptr;
}

int main ()
{
  f = __builtin_malloc (sizeof (struct S));
  setup (&f->p, 10);
}

>> In the above code, Is the “f->p” in the above example a read or a write when
>> we are in C FE?
>> Then, should we emit .ACCESS_WITH_SIZE for it or not?
>
> Strictly from a correctness perspective, this shouldn't be wrapped around a
> .ACCESS_FOR_SIZE.

Inside the routine “setup”, there is both a write to and a read from the
pointer *ptr. So, for the “&f->p” that is passed to “setup” at the call site,
the pointer *(&f->p), i.e., f->p, is both written to and read from.

However, we can only determine this with inter-procedural analysis and
data-flow information. The C FE has no such capability, so it cannot
determine whether the f->p is a read or a write.

Is this right?

> However, I don't know how, maybe someone else could help with that.
>
> I wonder if it would make sense to make the .ACCESS_WITH_SIZE use much
> narrower, e.g. by emitting it only for __builtin_dynamic_object_size.

In addition to __bdos, the array bounds sanitizer uses .ACCESS_WITH_SIZE too,
and we might have other consumers later.

The potential risks of this approach are:

1. Some opportunities might be missed.
2. We have to incrementally insert calls to .ACCESS_WITH_SIZE later for new
   consumers.

> That is, see if the input PTR to the function is a struct component ref with
> a counted_by and only then emit .ACCESS_WITH_SIZE right before the
> __builtin_dynamic_object_size call. One would have to walk through the input
> PTR to see if it reaches a component_ref with a counted_by.

Can we do this in the C FE? We would need to walk through the def-use chain,
right?

Qing

>
> Thanks,
> Sid
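
P.S. To make the read/write question concrete, here is a minimal sketch of the three ways f->p can appear in the source. It is only an illustration, not GCC code. COUNTED_BY is a placeholder macro that expands to nothing so the sketch builds with any C compiler, and read_p, write_p and addr_of_p are hypothetical names. Here setup is a variant of the routine in the example above that returns the pointer instead of storing it in the global g.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: a placeholder that expands to nothing.  With a compiler that
   accepts counted_by on pointer fields it could instead expand to
   __attribute__ ((counted_by (x))).  */
#define COUNTED_BY(x)

struct S
{
  int n;
  int *p COUNTED_BY (n);
};

/* Case 1: f->p is used as a value, so the pointer is read.  */
int *
read_p (struct S *f)
{
  return f->p;
}

/* Case 2: f->p is the left-hand side of an assignment, so the pointer is
   written.  */
void
write_p (struct S *f, int count)
{
  f->p = malloc (sizeof (int) * count);
  f->n = count;
}

/* Case 3: only the address of f->p is taken.  Whether the pointer is later
   read, written, or both depends on what the receiver of the address does,
   which is not visible at this expression.  */
int **
addr_of_p (struct S *f)
{
  return &f->p;
}

/* Variant of "setup": it both writes and reads the pointer through PTR.  */
int *
setup (int **ptr, int count)
{
  assert (ptr != NULL);
  *ptr = malloc (sizeof (int) * count);   /* write through ptr */
  return *ptr;                            /* read through ptr */
}
```

Cases 1 and 2 can be told apart from the form of the expression alone. In case 3, a call such as setup (addr_of_p (f), 10) both writes and reads f->p, but only inside the callee.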