Re: [PATCH] [RFC] Delayed parsing for bounds safety attributes

Bill Wendling Fri, 25 Jul 2025 20:44:05 -0700

On Thu, Jul 24, 2025 at 2:53 PM Martin Uecker <ma.uec...@gmail.com> wrote:
> Am Donnerstag, dem 24.07.2025 um 14:25 -0700 schrieb Bill Wendling:
> > On Thu, Jul 24, 2025 at 8:03 AM Martin Uecker <ma.uec...@gmail.com> wrote:
> > > Am Donnerstag, dem 24.07.2025 um 14:08 +0000 schrieb Aaron Ballman:
> > > > On Wed, Jul 23, 2025 at 8:38 PM Martin Uecker <ma.uec...@gmail.com> wrote:
> > > TBH, I am not terribly convinced by this argument.
> > >
> > > If I understood it correctly, the late parsing design seems to make
> > > no distinction between which identifier is used, the local or
> > > the global one and just prefers the local one if it exists, possibly
> > > giving a warning if there is also a global one.
> >
> > Yes...kinda. The order of name lookup would essentially be: field
> > within a struct, any non-global scope, global scope. (As Kees pointed
> > out, there would be times when we need to support function calls and
> > counts in sub-structs, but those are handled by this convention.) The
> > only part of this ordering that *isn't* part of normal C identifier
> > resolution is the "field within a struct" part.
> >
> The other big thing where it diverges from normal C identifier lookup
> (and also C++ for most cases), which you miss in the description above,
> is that it would pick up an identifier that comes later in the scope.
>
> The following code prints 10 and not 20.  I think this is the much
> bigger and more severe divergence.
>
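> #include <stdio.h>
>
> int main(void)
> {
>     int n = 10;
>     {
>         printf("%d\n", n);  /* outer n: the inner n is not yet in scope */
>         int n = 20;         /* shadows the outer n only from here on */
>     }
> }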


I brought this exact argument up early on in the Bounds Safety RFC
phase and was assured that none of the developers were confused about
name resolution. I don't know what else to tell you except that
resolving to the struct first is the exact behavior we've been trying
to get correct between GCC and Clang. All arguments for making the
name resolution more explicit have been more-or-less shrugged off. And
even if we adopted the dot-notation (which I'm not against doing), we
would *still* need some form of delayed parsing.
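
To make the forward-reference problem concrete, here is a minimal
sketch, not the RFC's implementation. COUNTED_BY is a hypothetical
placeholder macro that expands to nothing, so the code builds with any
C compiler; struct packet, packet_get and the file-scope 'len' are
likewise made up for illustration. The point is only that the
annotation names 'len' before the field is declared, and that a
same-named object exists at file scope.

```c
#include <stddef.h>

/* Hypothetical stand-in for the attribute under discussion. It expands to
   nothing, so the expression inside it is never parsed by the compiler. */
#define COUNTED_BY(expr)

/* A file-scope object that happens to share the field's name. */
size_t len = 100;

struct packet {
    /* 'len' is used here before the field below has been declared. With the
       lookup order described in this thread, it would mean the field, not
       the file-scope object above. */
    unsigned char *data COUNTED_BY(len);
    size_t len;
};

/* Hand-written version of the check the annotation is meant to enable. */
int packet_get(const struct packet *p, size_t i, unsigned char *out)
{
    if (i >= p->len)    /* bound comes from the field, never the global */
        return -1;
    *out = p->data[i];
    return 0;
}
```

A parser that resolved 'len' at the point of the annotation would only
see the file-scope object, which is why the expression has to be held
back until the struct is complete.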

> > The question about
> > whether or not this would cause "confusion" to C programmers isn't
> > completely settled; however, Apple says that they have a lot of users
> > and have yet to run into anyone who was confused by it. While just
> > anecdotal evidence, it's a good indicator that people would use the
> > feature "correctly."
>
> I hope I get to see some more information about the context of the
> data Apple has.
>
> But the story of my life at the moment is about disagreeing with
> C++ programmers who tell me how C is written but who often actually
> have only very limited experience writing C.  So I am generally
> sceptical about statements that do not match my experience.
> (I use size expressions in C in prototypes a *lot*, so all this
> talk about how this is error prone simply does not match my
> personal experience.)

If you look back on all of the discussions, you'll see that I agree with
this sentiment. I vastly prefer things to be explicit rather than
implicit.

> > > I think it is generally a challenge to support.  One could certainly
> > > store away the tokens and parse them later (this is certainly doable),
> > > but it adds a lot of issues because you need to add a lot of constraints
> > > for things which should then not be allowed.  And it is still not an
> > > acceptable solution for size arguments in C.
> > >
> > > .N would work here if you combine with a rule such as ".N" is always
> > > converted to "size_t".   Or you require an explicit cast if it is
> > > different from "size_t".
> >
> > Does this mean that the example above would be treated essentially like:
> >
> >   void func(char *buffer __counted_by((size_t).N * sizeof((size_t).N)), int N);
> >
> > ?
>
> Nobody has devised a full specification for .N yet that would support
> arbitrary expressions.  There are several possibilities.  Treating it as
> size_t would be one option, but probably not a very good one.
>
> One option which might be attractive is to treat it as int which would
> cover all types that have integer promotion, and require the user to add
> a cast if it comes later and is something else.
>
> One question here is also whether you really want an unsigned type when
> you compute a bound, because you might get wraparound.  This makes
> me wonder what these expressions would look like in practice.  Maybe
> you would need casts anyway.
>
> The original idea was to first support it only for a single
> identifier [.N] and maybe just special expressions such as [.N + 3]
> in which case the type may not be relevant because you want to treat
> this specially anyway.

Yes and no. We started with a single identifier, but the idea of using
expressions more complex than '.N + 3' was always the goal. And, from
my understanding, Clang does support that. All of the work we've been
doing has been to support the expression stuff, and that's really my
main focus for this RFC; specifically expressions within the attribute
used in structs. Whether this RFC could be used for parameters has yet
to be seen (I suspect that it could, but would be more invasive).
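
On the signedness question quoted above, a small sketch of how the
type chosen for a count expression changes what a bad input looks
like. The function names are made up for illustration and nothing here
is tied to any particular attribute syntax; it just evaluates 'n - 1'
in an unsigned and in a signed type.

```c
#include <stddef.h>

/* 'n - 1' computed in size_t: for n == 0 the result wraps around to the
   largest size_t value instead of going negative. */
size_t bound_unsigned(size_t n)
{
    return n - 1;
}

/* The same expression in a signed type: for n == 0 the result is -1,
   which a later check can recognise as an invalid bound. */
long long bound_signed(long long n)
{
    return n - 1;
}

/* A signed bound is only usable if it is non-negative. */
int bound_is_valid(long long b)
{
    return b >= 0;
}
```

With the unsigned version a zero count silently becomes a huge bound;
with the signed version it can be rejected by a check like
bound_is_valid.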

-bw
