On Monday, 31.03.2025 at 13:59 -0700, Bill Wendling wrote:
> > I'd like to offer up this to solve the issues we're facing. This is a
> > combination of everything that's been discussed here (or at least that
> > I've been able to read in the centi-thread :-).

Thanks! I think this proposal is much better, as it avoids undue burden
on parsers, but it does not address all my concerns.


From my side, the issue about compromising the scoping rules of C
is also about unintended non-local effects of code changes. In
my opinion, a change in a library header elsewhere should not cause 
code in a local scope (which itself might also come from a macro) to
emit a warning or require a programmer to add a workaround. So I am
not convinced that adding warnings or a workaround such as
__builtin_global_ref  is a good solution.


I could see the following as a possible way forward: We only 
allow the following two syntaxes:

1. Single argument referring to a member.

__counted_by(len)

with an argument that must be a single identifier and where
the identifier then must refer to a struct member. 

(I still think this is not ideal and potentially
confusing, but in contrast to new scoping rules it is
at least relatively easy to explain as a special rule.)
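
A minimal sketch of what form 1 looks like with the attribute spelling
that recent Clang and GCC releases accept on a flexible array member.
The COUNTED_BY macro, struct int_vec, and int_vec_new() are hypothetical
names used only for illustration; on compilers without the attribute
the macro expands to nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: use the attribute only where the compiler has it. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(counted_by)
# define COUNTED_BY(member) __attribute__((counted_by(member)))
#else
# define COUNTED_BY(member)
#endif

/* The argument is a single identifier naming a member of this struct. */
struct int_vec {
    size_t len;
    int data[] COUNTED_BY(len);
};

/* Allocate a vector of n zeroed elements; returns NULL on failure. */
static struct int_vec *int_vec_new(size_t n)
{
    struct int_vec *v = malloc(sizeof *v + n * sizeof v->data[0]);
    if (v != NULL) {
        v->len = n;                 /* set the count before touching data */
        for (size_t i = 0; i < n; i++)
            v->data[i] = 0;
    }
    return v;
}
```

Here the count member is declared before its use, so no forward
declaration is involved.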


2. Forward declarations. 

__counted_by(size_t len; len + PADDING)

where then the second part can also be a more complicated 
expression, but with the explicit requirement that all
identifiers in this expression are then looked up according to
regular C language rules. So except for the forward declared
member(s) they are *never* looked up in the member namespace of
the struct, i.e. no new name lookup rules are introduced.
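
To illustrate what "regular C language rules" means here, the sketch
below uses only standard C, with no bounds attribute at all. Member
names live in their own name space, so an expression inside the struct
body never sees them. The file-scope len, the value of PADDING, struct
packet, and packet_scratch_size() are all assumptions made up for this
example.

```c
#include <assert.h>
#include <stddef.h>

enum { PADDING = 4 };
enum { len = 8 };   /* file-scope name that happens to match a member */

struct packet {
    size_t len;                    /* member: separate name space */
    char scratch[len + PADDING];   /* ordinary lookup finds the file-scope len */
};

/* Size of the scratch array as the compiler resolved it: 8 + 4 bytes. */
static size_t packet_scratch_size(void)
{
    return sizeof(((struct packet *)0)->scratch);
}
```

Under form 2, only an identifier that is explicitly forward declared
inside the attribute would refer to a member; everything else would
keep resolving exactly as len and PADDING do in the array bound above.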


I think this could address my concerns about breaking
scoping in C. Still, I personally would prefer designator syntax
for both C and C++ as a nicer solution, and one that already
has some support from WG14.

Martin


> > 
> > ---
> > 
> > 1. The use of '__self' isn't feasible, so we won't use it. Instead,
> > we'll rely upon the current behavior—resolving any identifiers to the
> > "instance scope". This new scope is used __only__ in attributes, and
> > resolves identifiers to those in the least enclosing, non-anonymous
> > struct. For example:
> > 
> > struct foo {
> >   char count;
> >   struct bar {
> >     struct {
> >       int len;
> >     };
> >     struct {
> >       struct {
> >         int *valid_use __counted_by(len); // Valid.
> >       };
> >     };
> >     int *invalid_use __counted_by(count); // Invalid.
> >   } b;
> > };
> > 
> > Rationale: This is how '__guarded_by' currently resolves identifiers,
> > so there's precedent. And if we can't force its usage in all
> > situations, it's less a feature and more a "nicety" which will lead to
> > a massive discrepancy between compiler implementations. Despite the
> > fact that this introduces a new scoping mechanism to C, its use is not
> > as extensive as C++'s instance scoping and will apply only to
> > attributes. In the case where we have two different resolution
> > techniques happening within the same structure (e.g. VLAs), we can
> > issue warnings as outlined in Yeoul's RFC[1].
> > 
> > 2. A method of forward declaring variables will be added for variables
> > that occur in the struct after the attribute. For example:
> > 
> > A: Necessary usage:
> > 
> > struct foo {
> >   int *buf __counted_by(char count; count);
> >   char count;
> > };
> > 
> > B: Unnecessary, but still valid, usage:
> > 
> > struct foo {
> >   char count;
> >   int *buf __counted_by(char count; count);
> > };
> > 
> > * The forward declaration is required in (A) but not in (B).
> > * The type of 'count' as declared in '__counted_by' *must* match the real type.
> > 
> > Rationale: This alleviates the issues of "double parsing" for
> > compilers that aren't able to handle it. (We can also remove the
> > '-fexperimental-late-parse-attributes' flag in Clang.)
> > 
> > 3. A new builtin '__builtin_global_ref()' (or similarly named) is
> > added to refer to variables outside of the most-enclosing structure.
> > Example:
> > 
> > int count_that_will_never_change_we_promise;
> > 
> > struct foo {
> >   int *bar
> >     __counted_by(__builtin_global_ref(count_that_will_never_change_we_promise));
> >   unsigned flags;
> > };
> > 
> > As Yeoul pointed out, there isn't a way to refer to variables that
> > have been shadowed, so the 'global' in '__builtin_global_ref' is a bit
> > of a misnomer as it could refer to a local variable.
> > 
> > Rationale: For those who need the flexibility to use variables outside
> > of the struct, this is an acceptable escape route. It does make bounds
> > checking less strict, though, as we can't track any modifications to
> > the global, so caution must be used.
> > 
> > Bonus suggestion (by yours truly):
> > 
> > I'd like the option to allow functions to calculate expressions (it
> > can be used for a single identifier too, but that's too heavy-handed).
> > It won't be required for an expression, but is a good way to avoid any
> > issues regarding '__builtin_global_ref', like variables shadowing the
> > global variable. Example:
> > 
> > int global;
> > 
> > struct foo;
> > static int counted_by_calc(struct foo *);
> > 
> > struct foo {
> >   char count;
> >   int fnord;
> >   int *buf __counted_by(counted_by_calc);
> > };
> > 
> > static __attribute__((pure)) int counted_by_calc(struct foo *ptr) {
> >   return ptr->count * (global << 42) - ptr->fnord;
> > }
> > 
> > A pointer to the current least enclosing, non-anonymous struct is
> > passed into 'counted_by_calc' by the compiler.
> > 
> > Rationale: This gets rid of all ambiguities when calculating an
> > expression. It's marked 'pure' so there should be no side-effects.
> > 
> > ---
> > 
> > I believe these suggestions cover everything we've discussed. Please
> > comment with anything I missed and your opinions on each.
> > 
> > [1] https://discourse.llvm.org/t/rfc-forward-referencing-a-struct-member-within-bounds-annotations/85510
> > 
> > Share and enjoy!
> > -bw

