On Monday, 31 March 2025 at 13:59 -0700, Bill Wendling wrote:
> > I'd like to offer up this to solve the issues we're facing. This is a
> > combination of everything that's been discussed here (or at least that
> > I've been able to read in the centi-thread :-).
Thanks! I think this proposal is much better, as it avoids undue burden
on parsers, but it does not address all my concerns.
From my side, the issue about compromising the scoping rules of C
is also about unintended non-local effects of code changes. In
my opinion, a change in a library header elsewhere should not cause
code in a local scope (which itself might also come from a macro) to
emit a warning or require a programmer to add a workaround. So I am
not convinced that adding warnings or a workaround such as
__builtin_global_ref is a good solution.
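To make the scoping concern concrete, the following is a minimal sketch of how standard C resolves a name used inside a struct today. The enumerator 'n' stands in for something a library header might declare; the names 's' and 'buf_size' are made up for illustration. Member names live in their own name space, so the array size refers to the file-scope 'n', not to the member:

```c
#include <stddef.h>

/* Hypothetical: imagine this enumerator is declared by a library header. */
enum { n = 4 };

struct s {
    int n;        /* member name space: does not hide the enumerator */
    char buf[n];  /* ordinary lookup finds the enumerator, so 4 elements */
};

/* Size of the array member, computed in an unevaluated operand. */
size_t buf_size(void)
{
    return sizeof(((struct s *)0)->buf);
}
```

If identifiers inside an attribute were instead resolved to members first, the same spelling would mean something different there, and whether a header happens to declare 'n' could affect how this struct is diagnosed.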
I could see the following as a possible way forward: We only
allow the following two syntaxes:
1. Single argument referring to a member.

   __counted_by(len)

   with an argument that must be a single identifier, which
   must then refer to a struct member.
   (I still think this is not ideal and potentially
   confusing, but in contrast to new scoping rules it is
   at least relatively easy to explain as a special rule.)
2. Forward declarations.

   __counted_by(size_t len; len + PADDING)

   where the second part can also be a more complicated
   expression, but with the explicit requirement that all
   identifiers in this expression are then looked up according to
   regular C language rules. So except for the forward declared
   member(s) they are *never* looked up in the member namespace of
   the struct, i.e. no new name lookup rules are introduced.
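The two forms above might look as follows. This is only a sketch: COUNTED_BY_SKETCH is a hypothetical placeholder macro that expands to nothing, so the code compiles with any C compiler but performs no checking, and the struct and function names are made up. The helper functions spell out the element count each annotation is meant to express:

```c
#include <stddef.h>

/* Hypothetical stand-in for the attribute: expands to nothing.
   The real syntax is still under discussion. */
#define COUNTED_BY_SKETCH(...)

enum { PADDING = 2 };  /* ordinary identifier, found by regular C lookup */

/* Form 1: a single identifier that must name a member. */
struct form1 {
    size_t len;
    int *buf COUNTED_BY_SKETCH(len);
};

/* Form 2: forward-declare the member, then give an expression.
   Only 'len' comes from the struct; PADDING is looked up normally. */
struct form2 {
    int *buf COUNTED_BY_SKETCH(size_t len; len + PADDING);
    size_t len;
};

/* Element count that the form 1 annotation describes. */
size_t form1_count(const struct form1 *p)
{
    return p->len;
}

/* Element count that the form 2 annotation describes. */
size_t form2_count(const struct form2 *p)
{
    return p->len + PADDING;
}
```

In form 2 the declaration before the semicolon is the only way a member name enters the expression, so no identifier changes meaning when an unrelated member is added to the struct.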
I think this could address my concerns about breaking
scoping in C. Still, I personally would prefer designator syntax
for both C and C++ as a nicer solution, and one that already
has some support from WG14.
Martin
> >
> > ---
> >
> > 1. The use of '__self' isn't feasible, so we won't use it. Instead,
> > we'll rely upon the current behavior—resolving any identifiers to the
> > "instance scope". This new scope is used __only__ in attributes, and
> > resolves identifiers to those in the least enclosing, non-anonymous
> > struct. For example:
> >
> > struct foo {
> >     char count;
> >     struct bar {
> >         struct {
> >             int len;
> >         };
> >         struct {
> >             struct {
> >                 int *valid_use __counted_by(len); // Valid.
> >             };
> >         };
> >         int *invalid_use __counted_by(count); // Invalid.
> >     } b;
> > };
> >
> > Rationale: This is how '__guarded_by' currently resolves identifiers,
> > so there's precedent. And if we can't force its usage in all
> > situations, it's less a feature and more a "nicety" which will lead to
> > a massive discrepancy between compiler implementations. Despite the
> > fact that this introduces a new scoping mechanism to C, its use is not
> > as extensive as C++'s instance scoping and will apply only to
> > attributes. In the case where we have two different resolution
> > techniques happening within the same structure (e.g. VLAs), we can
> > issue warnings as outlined in Yeoul's RFC[1].
> >
> > 2. A method of forward declaring variables will be added for variables
> > that occur in the struct after the attribute. For example:
> >
> > A: Necessary usage:
> >
> > struct foo {
> >     int *buf __counted_by(char count; count);
> >     char count;
> > };
> >
> > B: Unnecessary, but still valid, usage:
> >
> > struct foo {
> >     char count;
> >     int *buf __counted_by(char count; count);
> > };
> >
> > * The forward declaration is required in (A) but not in (B).
> > * The type of 'count' as declared in '__counted_by' *must* match the real
> > type.
> >
> > Rationale: This alleviates the issues of "double parsing" for
> > compilers that aren't able to handle it. (We can also remove the
> > '-fexperimental-late-parse-attributes' flag in Clang.)
> >
> > 3. A new builtin '__builtin_global_ref()' (or similarly named) is
> > added to refer to variables outside of the most-enclosing structure.
> > Example:
> >
> > int count_that_will_never_change_we_promise;
> >
> > struct foo {
> >     int *bar __counted_by(
> >         __builtin_global_ref(count_that_will_never_change_we_promise));
> >     unsigned flags;
> > };
> >
> > As Yeoul pointed out, there isn't a way to refer to variables that
> > have been shadowed, so the 'global' in '__builtin_global_ref' is a bit
> > of a misnomer as it could refer to a local variable.
> >
> > Rationale: For those who need the flexibility to use variables outside
> > of the struct, this is an acceptable escape route. It does make bounds
> > checking less strict, though, as we can't track any modifications to
> > the global, so caution must be used.
> >
> > Bonus suggestion (by yours truly):
> >
> > I'd like the option to allow functions to calculate expressions (it
> > can be used for a single identifier too, but that's too heavy-handed).
> > It won't be required for an expression, but is a good way to avoid any
> > issues regarding '__builtin_global_ref', like variables shadowing the
> > global variable. Example:
> >
> > int global;
> >
> > struct foo;
> > static int counted_by_calc(struct foo *) __attribute__((pure));
> >
> > struct foo {
> >     char count;
> >     int fnord;
> >     int *buf __counted_by(counted_by_calc);
> > };
> >
> > static int counted_by_calc(struct foo *ptr) {
> >     return ptr->count * (global << 42) - ptr->fnord;
> > }
> >
> > A pointer to the current least enclosing, non-anonymous struct is
> > passed into 'counted_by_calc' by the compiler.
> >
> > Rationale: This gets rid of all ambiguities when calculating an
> > expression. It's marked 'pure' so there should be no side-effects.
> >
> > ---
> >
> > I believe these suggestions cover everything we've discussed. Please
> > comment with anything I missed and your opinions on each.
> >
> > [1]
> > https://discourse.llvm.org/t/rfc-forward-referencing-a-struct-member-within-bounds-annotations/85510
> >
> > Share and enjoy!
> > -bw