On Tuesday, 1 April 2025 at 18:58 +0000, Qing Zhao wrote:
>
> > On Apr 1, 2025, at 11:28, Martin Uecker <uec...@tugraz.at> wrote:
> >
> > On Tuesday, 1 April 2025 at 15:01 +0000, Qing Zhao wrote:
> > >
> > > > On Apr 1, 2025, at 10:04, Martin Uecker <uec...@tugraz.at> wrote:
> > > >
> > > > On Monday, 31 March 2025 at 13:59 -0700, Bill Wendling wrote:
> > > > > > I'd like to offer up this to solve the issues we're facing. This is a
> > > > > > combination of everything that's been discussed here (or at least that
> > > > > > I've been able to read in the centi-thread :-).
> > > >
> > > > Thanks! I think this proposal is much better as it avoids undue burden
> > > > on parsers, but it does not address all my concerns.
> > > >
> > > > From my side, the issue about compromising the scoping rules of C
> > > > is also about unintended non-local effects of code changes. In
> > > > my opinion, a change in a library header elsewhere should not cause
> > > > code in a local scope (which itself might also come from a macro) to
> > > > emit a warning or require a programmer to add a workaround. So I am
> > > > not convinced that adding warnings or a workaround such as
> > > > __builtin_global_ref is a good solution.
> > > >
> > > > I could see the following as a possible way forward: We only
> > > > allow the following two syntaxes:
> > > >
> > > > 1. Single argument referring to a member.
> > > >
> > > > __counted_by(len)
> > > >
> > > > with an argument that must be a single identifier and where
> > > > the identifier then must refer to a struct member.
> > > >
> > > > (I still think this is not ideal and potentially
> > > > confusing, but in contrast to new scoping rules it is
> > > > at least relatively easy to explain as a special rule.)
> > > >
>
> So, in allowed syntax 1, the identifier inside the counted_by attribute will be
> looked up inside the structure.
>
> This is our current implementation of counted_by for FAM and my previously
> submitted patch for counted_by for pointers inside structures.
>
> Keeping this syntax is good.
>
> > > > 2. Forward declarations.
> > > >
> > > > __counted_by(size_t len; len + PADDING)
> > >
> > > In the above, the PADDING is some constant?
> >
> > In principle - when considering only the name lookup rules -
> > it could be a constant, a global variable, or an automatic
> > variable, i.e. any ordinary identifier that is visible at
> > this point.
>
> I am a little confused here:
> Is this syntax 2 a new syntax, with name lookup rules different from those of
> syntax 1?
Yes. It uses the regular C name lookup rules, unlike syntax 1.

>
> How should the identifiers inside the counted_by attribute with this syntax be
> looked up?
> Inside the structure first? Then, if not found, looking up the outer scope for
> identifiers in the PADDING part?

The identifier in the forward declaration ("len") will be looked up in
the structure and will be made available when parsing the expression.
Any other identifiers (such as "PADDING") will not be looked up in the
structure. So it is always clear where each identifier is going
to be looked up.

> Then, has a new scoping been introduced now?
> Or some other special lookup rules for the counted_by attribute?
>
> > >
> > > More complicated expressions involving globals will not be supported?
> >
> > I think one could allow such expressions, but I think the
> > expressions should be restricted to expressions which have
> > no side effects.
>
> See my question above: does this new syntax 2 introduce a new “structure
> scope” so that identifiers are looked up inside the structure first, as in
> syntax 1? Or does this new syntax have the same lookup rule as current C,
> and will NOT look inside the structure first?

It will NOT look into the structure, except for the forward declared
identifier.

Martin

> > > > where then the second part can also be a more complicated
> > > > expression, but with the explicit requirement that all
> > > > identifiers in this expression are then looked up according to
> > > > regular C language rules. So except for the forward declared
> > > > member(s) they are *never* looked up in the member namespace of
> > > > the struct, i.e. no new name lookup rules are introduced.
> > >
> > > One question here:
> > >
> > > What’s the major issue if we’d like to add one new scoping rule, for
> > > example, “Structure scope” (the same as C++’s instance scope), to C?
> > >
> > > (In addition to the "VLA in structure" issue I mentioned in my previous
> > > writeup, is there any other issue to prevent this new scoping rule being
> > > added into C?)
> >
> > Note that the "VLA in structure" is a bit of a red herring. The exact same
> > issues apply to lookup of any other ordinary identifiers in this context.
> >
> > enum { MY_BUF_SIZE = 100 };
> > struct foo {
> >   char buf[MY_BUF_SIZE];
> > };
>
> Yes, this is because there is NO “structure scope” available in C. As long as
> the “structure scope” is added into C, identifiers could be looked up inside
> the “structure scope” first before looking up outer scopes.
>
> > C++ has instance scope for member functions. The rules for C++ are also
> > complex and not very consistent (see the examples I posted earlier,
> > demonstrating UB and compiler divergence).
>
> Yes, I studied those C++ examples when I wrote the proposal. And my
> observation was: in C++, the instance scope always has higher priority than
> local and global scopes, i.e., when there is a conflict between instance
> scope and local/global scope for an identifier, the identifier within the
> instance scope will shadow the one with the same name in the outer scope.
>
> But in C, there is no concept of “structure scope” at all. Identifiers will
> NOT be looked up inside a structure at all.
>
> > For C such a concept would
> > be new and much less useful, so the trade-off seems unfavorable (in
> > contrast to C++ where it is needed).
>
> This concept is needed when referring to a member variable inside the
> structure is needed, such as for the counted_by attribute, or later when we
> extend the C language to include the bound info in the TYPE.
>
> But I agree with you that introducing a new instance scope into C might be
> too risky.
>
> > I also see other issues: Fully
> > supporting instance scope would require changes to how C is parsed,
> > placing a burden on all C compilers and tooling.
> > Depending on how you
> > specify it, it would also cause a change in semantics
> > for existing code, something C tries very hard to avoid.
>
> Yes, agreed.
> Introducing a new instance scope in C might be too risky, and therefore not
> worth doing.
>
> > If you add
> > warnings as mitigation, it has the problem that it causes non-local
> > effects where introducing a name in an enclosing scope somewhere else
> > now necessitates a change to unrelated code, exactly what scoping rules
> > are meant to prevent.
>
> Yes, that’s right.
>
> > In any case, it seems a major change with many ramifications, including
> > possibly unintended ones. This should certainly not be done without
> > having a clear specification and support from WG14 (and probably not
> > done at all.)
>
> Yes, I agree.
>
> Qing
>
> > Martin
> >
> > >
> > > Qing
> > >
> > > > I think this could address my concerns about breaking
> > > > scoping in C. Still, I personally would prefer designator syntax
> > > > for both C and C++ as a nicer solution, and one that already
> > > > has some support from WG14.
> > > >
> > > > Martin
> > > >
> > > > > > ---
> > > > > >
> > > > > > 1. The use of '__self' isn't feasible, so we won't use it. Instead,
> > > > > > we'll rely upon the current behavior: resolving any identifiers to the
> > > > > > "instance scope". This new scope is used __only__ in attributes, and
> > > > > > resolves identifiers to those in the least enclosing, non-anonymous
> > > > > > struct. For example:
> > > > > >
> > > > > > struct foo {
> > > > > >   char count;
> > > > > >   struct bar {
> > > > > >     struct {
> > > > > >       int len;
> > > > > >     };
> > > > > >     struct {
> > > > > >       struct {
> > > > > >         int *valid_use __counted_by(len); // Valid.
> > > > > >       };
> > > > > >     };
> > > > > >     int *invalid_use __counted_by(count); // Invalid.
> > > > > >   } b;
> > > > > > };
> > > > > >
> > > > > > Rationale: This is how '__guarded_by' currently resolves identifiers,
> > > > > > so there's precedent. And if we can't force its usage in all
> > > > > > situations, it's less a feature and more a "nicety" which will lead to
> > > > > > a massive discrepancy between compiler implementations. Despite the
> > > > > > fact that this introduces a new scoping mechanism to C, its use is not
> > > > > > as extensive as C++'s instance scoping and will apply only to
> > > > > > attributes. In the case where we have two different resolution
> > > > > > techniques happening within the same structure (e.g. VLAs), we can
> > > > > > issue warnings as outlined in Yeoul's RFC[1].
> > > > > >
> > > > > > 2. A method of forward declaring variables will be added for variables
> > > > > > that occur in the struct after the attribute. For example:
> > > > > >
> > > > > > A: Necessary usage:
> > > > > >
> > > > > > struct foo {
> > > > > >   int *buf __counted_by(char count; count);
> > > > > >   char count;
> > > > > > };
> > > > > >
> > > > > > B: Unnecessary, but still valid, usage:
> > > > > >
> > > > > > struct foo {
> > > > > >   char count;
> > > > > >   int *buf __counted_by(char count; count);
> > > > > > };
> > > > > >
> > > > > > * The forward declaration is required in (A) but not in (B).
> > > > > > * The type of 'count' as declared in '__counted_by' *must* match
> > > > > >   the real type.
> > > > > >
> > > > > > Rationale: This alleviates the issues of "double parsing" for
> > > > > > compilers that aren't able to handle it. (We can also remove the
> > > > > > '-fexperimental-late-parse-attributes' flag in Clang.)
> > > > > >
> > > > > > 3. A new builtin '__builtin_global_ref()' (or similarly named) is
> > > > > > added to refer to variables outside of the most-enclosing structure.
> > > > > > Example:
> > > > > >
> > > > > > int count_that_will_never_change_we_promise;
> > > > > >
> > > > > > struct foo {
> > > > > >   int *bar __counted_by(__builtin_global_ref(count_that_will_never_change_we_promise));
> > > > > >   unsigned flags;
> > > > > > };
> > > > > >
> > > > > > As Yeoul pointed out, there isn't a way to refer to variables that
> > > > > > have been shadowed, so the 'global' in '__builtin_global_ref' is a bit
> > > > > > of a misnomer as it could refer to a local variable.
> > > > > >
> > > > > > Rationale: For those who need the flexibility to use variables outside
> > > > > > of the struct, this is an acceptable escape route. It does make bounds
> > > > > > checking less strict, though, as we can't track any modifications to
> > > > > > the global, so caution must be used.
> > > > > >
> > > > > > Bonus suggestion (by yours truly):
> > > > > >
> > > > > > I'd like the option to allow functions to calculate expressions (it
> > > > > > can be used for a single identifier too, but that's too heavy-handed).
> > > > > > It won't be required for an expression, but is a good way to avoid any
> > > > > > issues regarding '__builtin_global_ref', like variables shadowing the
> > > > > > global variable. Example:
> > > > > >
> > > > > > int global;
> > > > > >
> > > > > > struct foo;
> > > > > > static int counted_by_calc(struct foo *);
> > > > > >
> > > > > > struct foo {
> > > > > >   char count;
> > > > > >   int fnord;
> > > > > >   int *buf __counted_by(counted_by_calc);
> > > > > > };
> > > > > >
> > > > > > static int counted_by_calc(struct foo *ptr) __attribute__((pure)) {
> > > > > >   return ptr->count * (global << 42) - ptr->fnord;
> > > > > > }
> > > > > >
> > > > > > A pointer to the current least enclosing, non-anonymous struct is
> > > > > > passed into 'counted_by_calc' by the compiler.
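As a rough illustration of the function-based variant quoted above, the following plain C sketch does by hand what the proposal would have the compiler do: it calls a side-effect-free helper with a pointer to the enclosing struct to obtain the element count, then checks an index against it. This is only a sketch under assumptions. The names `scale` and `foo_at` are hypothetical, the count formula is simplified from the quoted example, and no `__counted_by` attribute is applied because the function form is only a proposal.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
    char count;
    int fnord;
    int *buf; /* the proposal would write: int *buf __counted_by(counted_by_calc); */
};

/* Hypothetical global that takes part in the count computation. */
static int scale = 2;

/* Helper with no side effects: it only reads its argument and 'scale'.
 * The 'pure' attribute is a GCC/Clang extension, so it is guarded. */
#if defined(__GNUC__)
__attribute__((pure))
#endif
static int counted_by_calc(const struct foo *ptr)
{
    return ptr->count * scale - ptr->fnord;
}

/* Hypothetical accessor: returns &buf[i] when i is below the computed
 * count, and NULL otherwise. */
static int *foo_at(const struct foo *p, size_t i)
{
    int n = counted_by_calc(p);
    if (n <= 0 || i >= (size_t)n)
        return NULL;
    return &p->buf[i];
}
```

A caller would fill in a `struct foo`, point `buf` at storage of at least `counted_by_calc(&f)` elements, and go through `foo_at` instead of indexing `buf` directly.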
> > > > > >
> > > > > > Rationale: This gets rid of all ambiguities when calculating an
> > > > > > expression. It's marked 'pure' so there should be no side-effects.
> > > > > >
> > > > > > ---
> > > > > >
> > > > > > I believe these suggestions cover everything we've discussed. Please
> > > > > > comment with anything I missed and your opinions on each.
> > > > > >
> > > > > > [1]
> > > > > > https://discourse.llvm.org/t/rfc-forward-referencing-a-struct-member-within-bounds-annotations/85510
> > > > > >
> > > > > > Share and enjoy!
> > > > > > -bw
> >
> > --
> > Univ.-Prof. Dr. rer. nat. Martin Uecker
> > Graz University of Technology
> > Institute of Biomedical Imaging
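The lookup rule this thread keeps returning to can be checked with today's C: member names live in a separate name space, so an identifier used inside a struct declaration is resolved as an ordinary identifier even when a member of the same name exists. The sketch below is an illustration only; `struct packet`, `len`, and `packet_buf_size` are made-up names, and the code is meant to be compiled as C, not C++.

```c
#include <assert.h>
#include <stddef.h>

enum { len = 4 }; /* ordinary identifier, visible at file scope */

struct packet {
    int len;       /* member: lives in the struct's member name space */
    char buf[len]; /* 'len' here is the enum constant 4, not the member */
};

/* Compile-time check that the array bound came from the enum constant. */
_Static_assert(sizeof(((struct packet *)0)->buf) == 4,
               "array bound is taken from the ordinary identifier");

/* Returns the size of the 'buf' member in bytes. */
static size_t packet_buf_size(void)
{
    return sizeof(((struct packet *)0)->buf);
}
```

Under the forward-declaration syntax discussed above, only a name declared inside the attribute itself (such as "len" in `__counted_by(size_t len; len + PADDING)`) would resolve to the member; every other identifier would keep the ordinary lookup shown here.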