Hi Kees,

> On Mar 7, 2025, at 1:38 PM, Kees Cook <k...@kernel.org> wrote:
> 
> On Thu, Mar 06, 2025 at 04:27:49PM -0800, Yeoul Na wrote:
>> Thanks for writing up the RFC and keeping us in the loop. Are
>> you planning to add “__self.” to GCC's C++ compiler as well in
> 
> Isn't this strictly a C feature? [1]

No, while -fbounds-safety itself is currently implemented for C only, the 
bounds annotations will be used in C++, in order to secure boundaries between C 
and C++. This means C++ should be able to parse the bounds annotations and 
understand it. We already use it internally for that purpose and Devin recently 
presented the vision as part his presentation in LLVM Memory Safety Working 
Group. Therefore, compatibility with C++ is our hard requirement and having to 
always write “__self.” would be a problem. To be clear I don’t oppose to 
introducing “__self" and using it as a suppression mechanism for warnings on 
name conflicts, but I opposed to requiring “__self” always.

> 
>> the future? The problem we have with “__self” being a default
>> way of annotating bounds is that C++ compatibility because bounds
>> annotations are supposed to work in headers shared between C and C++
>> and C++ should be able to parse it to secure the boundary between the
> 
> But this is just a header macro definition issue, not a language issue.
> There's plenty of C-only stuff in headers, but it's trivial to avoid
> parsing it in C++:
> 
> #ifndef __cplusplus
> # define __counted_by(x)      __attribute__((__counted_by__(x)))
> #else
> # define __counted_by(x)
> #endif
> 
> Linux is actually already doing a form of this so we can use counted_by
> in UAPI header files.
> 
>> two languages. Another problem is the usability. The user will have to
>> write more code “__self.” all the time in the most common use cases,
>> which would be a huge regression for the usability of the language.
> 
> I think it's an overstatement to consider it a "huge regression" when
> this is a new feature. Also, again, this is trivially solved with
> macros. I'm fully expecting to make the "__self" transition in Linux by
> just adding a new macro for the expression-capable counted_by and
> adjusting the existing macro:
> 
> #define __counted_by(x)      __attribute__((__counted_by__(__self.(x))))
> #define __counted_by_expr(x) __attribute__((__counted_by__(x)))
> 
> This will immediately work for all users of the existing feature in
> Linux.

IMHO, once we require “__self”, hiding it under a macro like this will create 
an extra confusion to the users because that will still teach the user the 
version of the programming model that did not self. Maybe more confusing if 
some macros are defined to hide it and some macros are not. This sounds like we 
introduce “__self" only to hide it anyway. I think we need diagnostics and 
provide suppression mechanisms (e.g., __self., __builtin_global_ref()), but 
using them only when it is necessary. 

> 
> Now, I suppose you might be trying to take into account the private
> use of differing non-public implementations. :) In that case, yeah, I
> could see there would be some work to do. But it should still be pretty
> easy to write a regex to do all the replacements on such a code base,
> and then build the results to find any globals that got included:
> 
> perl -pi.old -e 's{\b__counted_by\((.*)\)([^\)]*)$}{
>               my $x = $1;
>               my $end = $2;
>               $x =~ s/([a-zA-Z_][a-zA-Z0-9_\.]*)/__self.$1/g;
>                "counted_by($x)$end";
>       }ex;'
> 

Another thing we should consider is that Clang already had a “guarded_by” 
attribute that can refer to a member of the same structure with an unqualified 
name both in C and C++, so we should be careful not to break the existing users 
at least not in a sneaky way.


>>> On Mar 6, 2025, at 2:03 PM, Yeoul Na <yeoul...@apple.com> wrote:
>>> 
>>> + John & Félix & Patryk & Henrik
>>> 
>>>> On Mar 6, 2025, at 1:44 PM, Qing Zhao <qing.z...@oracle.com> wrote:
>>>> [...]
>>>> This code is bad. The size expression "n+10" of the VLA "a" follows the 
>>>> default 
>>>> scoping rule of C, as a result, "n" refers to the local variable "n" that 
>>>> is defined 
>>>> outside of the structure "foo"; However, the argument "n" of the 
>>>> counted_by 
>>>> attribute of the flexible array member b[] follows the new scoping rule, 
>>>> it refers 
>>>> to the member variable "n" inside this structure.   
>>>> 
>>>> It's clear that the current design of the counted_by argument introduced a 
>>>> new 
>>>> scoping rule into C, resulting an inconsistent scoping resolution 
>>>> situation in 
>>>> C language.
>>>> 
>>>> This is a design mistake, and should be fixed.
>> 
>> We will have a different proposal based on reporting diagnostics on the
>> name conflicts. We need to diagnose the name conflicts like above anyway
>> because in code like that almost always the struct contains a buffer
>> and its size as the fields. Given that the program’s intention would
>> be more likely to pick up the member `n`, instead of some random global
>> happened to be with the same name in the same translation unit. Therefore,
>> we should diagnose such cases to avoid mistakes and avoid the program
>> silently working with an unintended way with the user mistake.
> 
> I'm all for better diagnostics, but since C doesn't have a way specify
> scope for a named variable, I don't see how such a diagnostic would
> be actionable.

Right, when we provide diagnostics, there should be suppression mechanisms 
(like __self, __builtin_global_ref), but that doesn’t have to be the default. 
Another action could be to change the names. If a global variable has a name 
like `n` conflicting with a struct field name `n`, then it’s not a good 
variable naming practice anyway.

> 
> int nr;
> struct foo {
>       int nr;
>       u8 array[] __counted_by(nr);
> };
> 
> With this syntax (which would be using the "instance scope" that doesn't
> exist in C), there is no way in C to refer to the global "nr". Therefore,
> these expressions must be have an object reference, which is what __self
> provides. Now "__self.nr" and "nr" are distinct.
> 
> If, however, you're saying that member names used in counted_by cannot
> alias with any globals, then I think that is overly restrictive.
> 
>> Also, this program will have a different meaning in C++, so that’s another
>> reason to always diagnose with such ambiguity.
> 
> But this feature is C only, so there is no "C++ meaning".
> 
>> Also, the bounds annotation
>> user might have just forgotten to add “__self.”
> 
> I don't have a solution for this risk, so from that perspective, it does
> appear that member names used in counted_by must not alias global names.
> But I think that is a new scope, with very weird semantics. It means
> that if a global appears in any header that happens to match the member
> name used in counted_by, the compilation fails. This seems risky too.
> 
> But I think that such a construction results in a completely new scope
> (a namespace where all globals are merged into a given struct's member
> namespace) and creates a very fragile state. With __self, we are
> exclusively using global scope with a special "global" named __self.
> (It cannot be local scope because then the meaning of the counted_by
> expression would be expected to change if there were any local variables
> aliasing global variables, which is even more wild.)
> 
>> because it’s so
>> intuitive to use the member name inside the attributes (I know what’s
>> “intuitive" depends on people’s background, but that’s what we
>> observed from massive adoption experience within Apple). This leaves
>> the feature error-prone, because the most intuitive syntax for bounds
>> annotations will be compiled into a different meaning (using the global
>> as the size instead of the peer member). So we should really diagnose
>> it even if we add “__self" to avoid the mistake.
> 
> So, I think your earlier observation is correct about the "common" case
> for counted_by: it is just a single member name and nothing else: no
> expression, no globals, etc etc. I think the answer here is exactly what
> I suggested above, making it a macro issue, and leaving the feature
> itself to require __self. Then authors can easily deal with the common
> case not needing __self, but complex expressions can be constructed
> without them being ambiguous:
> 
> #define __counted_by(x)      __attribute__((__counted_by__(__self.(x))))
> #define __counted_by_expr(x) __attribute__((__counted_by__(x)))
> 
> struct easy {
>  int length;
>  u8 bytes[] __counted_by(length);
> };
> 
> struct complex {
>  u64 long packet_info;
>  u32 flags;
>  u16 whole_size;
>  u8 bytes[] __counted_by_expr(__self.whole_size - sizeof(struct complex));
> };
> 
> u32 nr_cpus;
> struct sys_info {
>  unsigned long details;
>  struct cpu_info cpu[] __counted_by_expr(nr_cpus);
> };
> 
>> Now, if we always diagnose it, then the lookup order doesn’t really
>> matter anymore. That means we will have an option to keep the current
>> lookup rule of C, and pick up the member name only when the global name
>> is not available (just one possible option). I see “__self.” being
>> used as a suppression mechanism if the programmer cannot change the name
>> of the conflicting global or member. But that doesn’t mean “__self”
>> should be a default way of writing the code. Suppression mechanisms are
>> typically only used to suppress the warnings and disambiguate. And this
>> would mean we also need a way to disambiguate it to mean global. C++
>> already has `::` but C doesn’t currently have a scope qualifier but
>> in order to use this new bounds safety feature, we may need to invent
>> something. Adding a new syntax is a risk so until we standardize it I
>> would suggest something like `__builtin_global_ref()`
> 
> I'm not opposed to adding something like __builtin_global_ref(), but I
> don't think it really solves the confusion related to the existing
> scope resolution, as Qing showed with a VLA that uses local scope.
> 
> void boo (int k)
> {
>  const int n = 10; // a local variable n
>  struct foo {
>    int n;          // a member variable n
>    int a[n + 10];  // for VLA, this n refers to the local variable n.
>    char b[] __counted_by(n); // aaagh
>  };
>  ...
> }

Exactly. This can be quite confusing, so we should diagnose it so the 
programmer can pick better variable names, especially in this case, this is a 
local variable, so the users can change the names. For globals, yes, again, we 
need some suppression mechanisms.

> 
> However, my "hide __self in the macro" solution doesn't make the above
> any less weird to read, so perhaps this isn't as useful an argument as
> I'd like. :)
> 
>> [...]
>> We have a way to not change the meaning of the existing code without
>> introducing a new syntax, but diagnosing already error-prone code that
>> should apply to both VLAs and bounds annotations. We are planning to
>> write up a proposal to the C standard soon.
> 
> What does "soon" mean? Can you send it out next week?

We’ll try to send it out next week. I’m trying to incorporate the new comments 
from this RFC threads. 

> 
> It sounds like you're proposing adding "instance scope", which merges an
> object's member names with the global scope variable names. And to access
> global scope in the face of an aliased name, __builtin_global_ref(NAME)
> would be needed. (Such a builtin would be useful in other areas too,
> e.g. local scope to access an aliased name...)
> 
> It does seem like this solves all the same problems, but I still don't
> like how fragile it is in the face of adding a global that might alias a
> used member name. When this happens in local scope, it's limited to the
> given function, and change any behaviors. For instance scope it means
> that suddenly no globals can be added that alias with used member names.

We already cannot introduce a new global name if that conflicts with another 
global name, and since that’s new code we can always pick another name (and it 
would likely be a better name, i.e., `n` or `length` is not a good global 
name). But yes I agree, if we start diagnosing (warnings) on conflicts with 
struct field names too then it could become much more noisy than diagnosing on 
globals only, and we might need some experiments here. Luckily, Clang already 
started adding diagnostics on such name conflicts, so I hope this will give us 
some good experimental data.

> 
> And on the GCC side, all "instance scope" usage would just have parsing
> delayed until the end of the struct parisng.
> 
> Thanks for the discussion!
> 
> -Kees
> 
> [1] https://clang.llvm.org/docs/BoundsSafety.html
> 
> -- 
> Kees Cook

Thanks for the discussion!
Yeoul

Reply via email to