I'd like to offer the following to solve the issues we're facing. It
combines everything that's been discussed here (or at least everything
I've been able to read in the centi-thread :-).

---

1. The use of '__self' isn't feasible, so we won't use it. Instead,
we'll rely upon the current behavior: resolving any identifiers in the
"instance scope". This new scope is used __only__ in attributes, and
resolves identifiers to those in the least enclosing, non-anonymous
struct. For example:

struct foo {
  char count;
  struct bar {
    struct {
      int len;
    };
    struct {
      struct {
        int *valid_use __counted_by(len); // Valid.
      };
    };
    int *invalid_use __counted_by(count); // Invalid.
  } b;
};

Rationale: This is how '__guarded_by' currently resolves identifiers,
so there's precedent. And if we can't force the use of '__self' in all
situations, it's less a feature and more a "nicety" which will lead to
a massive discrepancy between compiler implementations. Despite the
fact that this introduces a new scoping mechanism to C, its use is not
as extensive as C++'s instance scoping and will apply only to
attributes. In the case where we have two different resolution
techniques happening within the same structure (e.g. VLAs), we can
issue warnings as outlined in Yeoul's RFC[1].
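
To make the lookup in (1) a bit more concrete, here's a minimal sketch
in plain C11. It only shows why 'len' counts as a member of 'struct
bar': members of anonymous structs are accessed as if they were direct
members of the enclosing named struct. 'COUNTED_BY' is a hypothetical
no-op stand-in for the real attribute (so this builds anywhere), and
'bar_last' is a helper I made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: expands to nothing so the sketch builds with any
 * C11 compiler. It only marks where the real attribute would go. */
#define COUNTED_BY(member)

struct foo {
  char count;
  struct bar {
    struct {
      int len;
    };
    struct {
      struct {
        int *valid_use COUNTED_BY(len); /* 'len' is found in 'struct bar'. */
      };
    };
  } b;
};

/* Anonymous struct members are accessed as direct members of 'struct bar',
 * so both names resolve without naming the anonymous structs. */
static int bar_last(const struct bar *b) {
  assert(b->len > 0);
  return b->valid_use[b->len - 1];
}
```

Note that 'count' is not reachable this way from inside 'struct bar';
it belongs to 'struct foo', which is why the second use in the example
above is invalid.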

2. A method of forward declaring variables will be added for variables
that occur in the struct after the attribute. For example:

A: Necessary usage:

struct foo {
  int *buf __counted_by(char count; count);
  char count;
};

B: Unnecessary, but still valid, usage:

struct foo {
  char count;
  int *buf __counted_by(char count; count);
};

* The forward declaration is required in (A) but not in (B).
* The type of 'count' as declared in '__counted_by' *must* match the real type.

Rationale: This alleviates the issues of "double parsing" for
compilers that aren't able to handle it. (We can also remove the
'-fexperimental-late-parse-attributes' flag in Clang.)
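
The "type must match" rule in (2) can be approximated today with a
hand-written check. This is only a sketch of the constraint, not how a
compiler would implement it: 'FOO_COUNT_IS' is a hypothetical macro,
and the attribute itself is left as a comment because the forward
declaration syntax doesn't exist yet.

```c
#include <assert.h>

struct foo {
  int *buf;   /* would carry: __counted_by(char count; count) */
  char count; /* the real declaration, after the attribute */
};

/* Evaluates to 1 if the real member has the type named in the forward
 * declaration, 0 otherwise. The null pointer is never dereferenced;
 * _Generic only inspects the type of the expression. */
#define FOO_COUNT_IS(type) \
  _Generic(((struct foo *)0)->count, type: 1, default: 0)

static_assert(FOO_COUNT_IS(char), "forward declaration matches");
static_assert(!FOO_COUNT_IS(int), "a mismatched type would be rejected");
```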

3. A new builtin '__builtin_global_ref()' (or similarly named) is
added to refer to variables outside of the most-enclosing structure.
Example:

int count_that_will_never_change_we_promise;

struct foo {
  int *bar
    __counted_by(__builtin_global_ref(count_that_will_never_change_we_promise));
  unsigned flags;
};

As Yeoul pointed out, there isn't a way to refer to variables that
have been shadowed, so the 'global' in '__builtin_global_ref' is a bit
of a misnomer as it could refer to a local variable.

Rationale: For those who need the flexibility to use variables outside
of the struct, this is an acceptable escape route. It does make bounds
checking less strict, though, as we can't track any modifications to
the global, so caution must be used.
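
Here's a small sketch of why caution is needed. It assumes a check
that reads the global at the time of the access; 'element_count' and
'index_ok' are names I made up, and the attribute is left as a comment
since the builtin doesn't exist yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical global that the attribute would reference. */
static int element_count = 4;

struct foo {
  int *bar; /* would carry: __counted_by(__builtin_global_ref(element_count)) */
  unsigned flags;
};

/* The kind of check a bounds-checking compiler could derive: it reads the
 * global when 'bar' is accessed, not when 'bar' was assigned. */
static int index_ok(const struct foo *f, size_t idx) {
  return f->bar != NULL && idx < (size_t)element_count;
}
```

If 'bar' points at four elements and something later sets
'element_count' to 8, 'index_ok' starts accepting index 5 even though
the allocation never grew.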

Bonus suggestion (by yours truly):

I'd like the option to allow functions to calculate expressions (it
can be used for a single identifier too, but that's too heavy-handed).
It won't be required for an expression, but is a good way to avoid any
issues regarding '__builtin_global_ref', like variables shadowing the
global variable. Example:

int global;

struct foo;
static int counted_by_calc(struct foo *);

struct foo {
  char count;
  int fnord;
  int *buf __counted_by(counted_by_calc);
};

__attribute__((pure)) static int counted_by_calc(struct foo *ptr) {
  return ptr->count * (global << 42) - ptr->fnord;
}

A pointer to the current least enclosing, non-anonymous struct is
passed into 'counted_by_calc' by the compiler.

Rationale: This gets rid of all ambiguities when calculating an
expression. It's marked 'pure' so there should be no side-effects.
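
Until something like this exists, the same idea can be emulated by
hand. The sketch below assumes GCC or Clang (for the 'pure' attribute)
and uses a simpler expression than the one above; 'foo_index_ok' is a
hypothetical helper standing in for whatever check the compiler would
insert.

```c
#include <assert.h>
#include <stddef.h>

static int global = 2;

struct foo;
/* GCC and Clang accept the attribute on a declaration. */
static int counted_by_calc(const struct foo *) __attribute__((pure));

struct foo {
  char count;
  int fnord;
  int *buf; /* would carry: __counted_by(counted_by_calc) */
};

/* Reads the struct and a global but modifies nothing, as 'pure' requires. */
static int counted_by_calc(const struct foo *ptr) {
  return ptr->count * global - ptr->fnord;
}

/* Hypothetical helper: what a compiler-inserted check might amount to. */
static int foo_index_ok(const struct foo *f, int idx) {
  return idx >= 0 && idx < counted_by_calc(f);
}
```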

---

I believe these suggestions cover everything we've discussed. Please
comment with anything I missed and your opinions on each.

[1] https://discourse.llvm.org/t/rfc-forward-referencing-a-struct-member-within-bounds-annotations/85510

Share and enjoy!
-bw
