On Tue, Apr 1, 2025 at 8:29 AM Martin Uecker <uec...@tugraz.at> wrote:
> On Tuesday, 01.04.2025 at 15:01 +0000, Qing Zhao wrote:
> > > On Apr 1, 2025, at 10:04, Martin Uecker <uec...@tugraz.at> wrote:
> > > On Monday, 31.03.2025 at 13:59 -0700, Bill Wendling wrote:
> > > > > I'd like to offer up this to solve the issues we're facing. This is a
> > > > > combination of everything that's been discussed here (or at least that
> > > > > I've been able to read in the centi-thread :-).
> > >
> > > Thanks! I think this proposal is much better as it avoids undue burden
> > > on parsers, but it does not address all my concerns.
> > >
> > >
> > > From my side, the issue about compromising the scoping rules of C
> > > is also about unintended non-local effects of code changes. In
> > > my opinion, a change in a library header elsewhere should not cause
> > > code in a local scope (which itself might also come from a macro) to
> > > emit a warning or require a programmer to add a workaround. So I am
> > > not convinced that adding warnings or a workaround such as
> > > __builtin_global_ref is a good solution.
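A minimal sketch of the lookup clash behind this concern, assuming nothing
beyond standard C11. The names 'struct packet', 'tag', 'data', and the
enumerator 'len' are hypothetical and made up for illustration; pretend the
enumerator arrives through a library header. It only shows how ordinary
lookup behaves inside a struct today, not how any compiler implements
'__counted_by'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: imagine this enumerator was added by a library header. */
enum { len = 4 };

struct packet {
    int  len;       /* member; lives in the struct's member namespace   */
    char tag[len];  /* ordinary lookup: 'len' is the enumerator above   */
    int *data;      /* a bounds attribute naming 'len' would sit here   */
};

/* Ordinary C lookup never consults the member namespace for 'tag[len]',
 * so the array has 4 elements regardless of the member called 'len'. */
static_assert(sizeof(((struct packet *)0)->tag) == 4,
              "array bound comes from the enumerator, not the member");

/* Returns the size of 'tag' as ordinary lookup resolved it. */
size_t packet_tag_size(void)
{
    return sizeof(((struct packet *)0)->tag);
}
```

If an attribute on 'data' resolved 'len' to the member while 'tag[len]'
keeps resolving to the enumerator, the same spelling would mean two
different things inside one struct. Note this is C only; C++ class scope
treats the member name differently.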

To clarify, I'm not in favor of adding a generalized new scoping rule to
C, only one for identifiers in attributes. From what I've seen in the
discussions, I think that's what most people have been suggesting. This
should limit any issues with changes in current code.

> > > I could see the following as a possible way forward: We only
> > > allow the following two syntaxes:
> > >
> > > 1. Single argument referring to a member.
> > >
> > > __counted_by(len)
> > >
> > > with an argument that must be a single identifier and where
> > > the identifier then must refer to a struct member.
> > >
> > > (I still think this is not ideal and potentially
> > > confusing, but in contrast to new scoping rules it is
> > > at least relatively easy to explain as a special rule.)

I'm wavering on whether it's going to be too confusing. The initial
design was to use the bare struct member name, which suggests to me that
it's a more natural way of referring to the 'count' field than not, at
least with a single identifier. As Apple has stated, they've had no
confusion among their developers with this type of identifier usage.

> > > 2. Forward declarations.
> > >
> > > __counted_by(size_t len; len + PADDING)
> >
> > In the above, the PADDING is some constant?
>
> In principle - when considering only the name lookup rules -
> it could be a constant, a global variable, or an automatic
> variable, i.e. any ordinary identifier that is visible at
> this point.
>
> >
> > More complicated expressions involving globals will not be supported?
>
> I think one could allow such expressions, but I think the
> expressions should be restricted to expressions which have
> no side effects.
>
> >
> > > where then the second part can also be a more complicated
> > > expression, but with the explicit requirement that all
> > > identifiers in this expression are then looked up according to
> > > regular C language rules.
> > > So except for the forward declared
> > > member(s) they are *never* looked up in the member namespace of
> > > the struct, i.e. no new name lookup rules are introduced.

I'm on record as *hating* the idea of using expressions (apart from an
identifier) in attributes. Having the compiler silently add a "ptr->"
for every identifier in an expression seems like a hack to me. This is
the reason I added my proposal for a function pointer to handle
expressions. Even though I'm in the minority on this, I'd still like
the option to use function pointers.

> > One question here:
> >
> > What’s the major issue if we’d like to add one new scoping rule, for
> > example, “Structure scope” (the same as C++’s instance scope), to C?
> >
> > (In addition to the "VLA in structure" issue I mentioned in my previous
> > writeup, is there any other issue to prevent this new scoping rule
> > being added into C?)
>
> Note that the "VLA in structure" is a bit of a red herring. The exact same
> issues apply to lookup of any other ordinary identifiers in this context.
>
> enum { MY_BUF_SIZE = 100 };
> struct foo {
>     char buf[MY_BUF_SIZE];
> };
>
>
> C++ has instance scope for member functions. The rules for C++ are also
> complex and not very consistent (see the examples I posted earlier,
> demonstrating UB and compiler divergence). For C such a concept would
> be new and much less useful, so the trade-off seems unfavorable (in
> contrast to C++ where it is needed). I also see other issues: Fully
> supporting instance scope would require changes to how C is parsed,
> placing a burden on all C compilers and tooling. Depending on how you
> specify it, it would also cause a change in semantics
> for existing code, something C tries very hard to avoid.
> If you add warnings as mitigation, it has the problem that it causes
> non-local effects where introducing a name in an enclosing scope
> somewhere else now necessitates a change to unrelated code, exactly
> what scoping rules are meant to prevent.

As I mentioned above (and alluded to in the original post), I think
adding full C++ instance scoping to C is a Bad Idea(tm). What we do here
would be a small subset of what's done in C++ and restricted to
identifiers in attributes.

> In any case, it seems a major change with many ramifications, including
> possibly unintended ones. This should certainly not be done without
> having a clear specification and support from WG14 (and probably not
> done at all.)
>
> Martin
>
> >
> > Qing
> >
> > >
> > > I think this could address my concerns about breaking
> > > scoping in C. Still, I personally would prefer designator syntax
> > > for both C and C++ as a nicer solution, and one that already
> > > has some support from WG14.
> > >
> > > Martin
> > >
> > > > > ---
> > > > >
> > > > > 1. The use of '__self' isn't feasible, so we won't use it. Instead,
> > > > > we'll rely upon the current behavior: resolving any identifiers to the
> > > > > "instance scope". This new scope is used __only__ in attributes, and
> > > > > resolves identifiers to those in the least enclosing, non-anonymous
> > > > > struct. For example:
> > > > >
> > > > > struct foo {
> > > > >     char count;
> > > > >     struct bar {
> > > > >         struct {
> > > > >             int len;
> > > > >         };
> > > > >         struct {
> > > > >             struct {
> > > > >                 int *valid_use __counted_by(len); // Valid.
> > > > >             };
> > > > >         };
> > > > >         int *invalid_use __counted_by(count); // Invalid.
> > > > >     } b;
> > > > > };
> > > > >
> > > > > Rationale: This is how '__guarded_by' currently resolves identifiers,
> > > > > so there's precedent.
> > > > > And if we can't force its usage in all
> > > > > situations, it's less a feature and more a "nicety" which will lead to
> > > > > a massive discrepancy between compiler implementations. Despite the
> > > > > fact that this introduces a new scoping mechanism to C, its use is not
> > > > > as extensive as C++'s instance scoping and will apply only to
> > > > > attributes. In the case where we have two different resolution
> > > > > techniques happening within the same structure (e.g. VLAs), we can
> > > > > issue warnings as outlined in Yeoul's RFC[1].
> > > > >
> > > > > 2. A method of forward declaring variables will be added for variables
> > > > > that occur in the struct after the attribute. For example:
> > > > >
> > > > > A: Necessary usage:
> > > > >
> > > > > struct foo {
> > > > >     int *buf __counted_by(char count; count);
> > > > >     char count;
> > > > > };
> > > > >
> > > > > B: Unnecessary, but still valid, usage:
> > > > >
> > > > > struct foo {
> > > > >     char count;
> > > > >     int *buf __counted_by(char count; count);
> > > > > };
> > > > >
> > > > > * The forward declaration is required in (A) but not in (B).
> > > > > * The type of 'count' as declared in '__counted_by' *must* match the
> > > > >   real type.
> > > > >
> > > > > Rationale: This alleviates the issues of "double parsing" for
> > > > > compilers that aren't able to handle it. (We can also remove the
> > > > > '-fexperimental-late-parse-attributes' flag in Clang.)
> > > > >
> > > > > 3. A new builtin '__builtin_global_ref()' (or similarly named) is
> > > > > added to refer to variables outside of the most-enclosing structure.
> > > > > Example:
> > > > >
> > > > > int count_that_will_never_change_we_promise;
> > > > >
> > > > > struct foo {
> > > > >     int *bar
> > > > >         __counted_by(__builtin_global_ref(count_that_will_never_change_we_promise));
> > > > >     unsigned flags;
> > > > > };
> > > > >
> > > > > As Yeoul pointed out, there isn't a way to refer to variables that
> > > > > have been shadowed, so the 'global' in '__builtin_global_ref' is a bit
> > > > > of a misnomer as it could refer to a local variable.
> > > > >
> > > > > Rationale: For those who need the flexibility to use variables outside
> > > > > of the struct, this is an acceptable escape route. It does make bounds
> > > > > checking less strict, though, as we can't track any modifications to
> > > > > the global, so caution must be used.
> > > > >
> > > > > Bonus suggestion (by yours truly):
> > > > >
> > > > > I'd like the option to allow functions to calculate expressions (it
> > > > > can be used for a single identifier too, but that's too heavy-handed).
> > > > > It won't be required for an expression, but is a good way to avoid any
> > > > > issues regarding '__builtin_global_ref', like variables shadowing the
> > > > > global variable. Example:
> > > > >
> > > > > int global;
> > > > >
> > > > > struct foo;
> > > > > static int counted_by_calc(struct foo *);
> > > > >
> > > > > struct foo {
> > > > >     char count;
> > > > >     int fnord;
> > > > >     int *buf __counted_by(counted_by_calc);
> > > > > };
> > > > >
> > > > > static int counted_by_calc(struct foo *ptr) __attribute__((pure)) {
> > > > >     return ptr->count * (global << 42) - ptr->fnord;
> > > > > }
> > > > >
> > > > > A pointer to the current least enclosing, non-anonymous struct is
> > > > > passed into 'counted_by_calc' by the compiler.
> > > > >
> > > > > Rationale: This gets rid of all ambiguities when calculating an
> > > > > expression. It's marked 'pure' so there should be no side-effects.
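
A rough sketch of how the function-based form might behave, emulated by
hand in standard C since no compiler implements it. Assumptions, all
hypothetical: 'foo_index_ok' stands in for whatever check a compiler would
emit on an access to 'buf'; 'global = 2' and the simplified expression are
arbitrary (the shift from the example above is dropped because shifting an
'int' by 42 is undefined when 'int' is 32 bits); and the attribute sits on
a separate declaration because GCC rejects attributes placed after the
declarator of a function definition.

```c
#include <assert.h>
#include <stddef.h>

int global = 2;  /* arbitrary value chosen for the sketch */

struct foo {
    char count;
    int  fnord;
    int *buf;    /* the proposal would add: __counted_by(counted_by_calc) */
};

/* 'pure' goes on a declaration; pure functions may still read globals. */
#if defined(__GNUC__)
__attribute__((pure))
#endif
int counted_by_calc(const struct foo *ptr);

/* Computes the element count of 'buf' from the enclosing struct. */
int counted_by_calc(const struct foo *ptr)
{
    return ptr->count * global - ptr->fnord;
}

/* Hand-written stand-in for the bounds check on p->buf[i]:
 * returns 1 if index i is within the computed count, 0 otherwise. */
int foo_index_ok(const struct foo *p, size_t i)
{
    assert(p != NULL);
    int n = counted_by_calc(p);  /* the struct pointer is passed in */
    return n > 0 && i < (size_t)n;
}
```

With 'count = 3' and 'fnord = 1' the computed count is 5, so indices 0
through 4 pass the check and 5 does not. Every identifier in the expression
is written out explicitly ('ptr->count', 'global'), which is the ambiguity
the function form is meant to remove.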
> > > > >
> > > > > ---
> > > > >
> > > > > I believe these suggestions cover everything we've discussed. Please
> > > > > comment with anything I missed and your opinions on each.
> > > > >
> > > > > [1]
> > > > > https://discourse.llvm.org/t/rfc-forward-referencing-a-struct-member-within-bounds-annotations/85510
> > > > >
> > > > > Share and enjoy!
> > > > > -bw
> > > > >
>
> --
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging