Re: [PATCH] [RFC] Delayed parsing for bounds safety attributes

Yeoul Na Mon, 28 Jul 2025 16:52:22 -0700

Could someone working on Linux answer my earlier question? Working on a 
compromise solution is one thing, but I’m trying to understand the situation 
better.


> Out of curiosity, do you think focusing on simple identifier cases (which, as 
> I understand, are the majority in the Linux kernel as well) would allow us to 
> make meaningful progress for now? My assumption is that even such simple use 
> cases (e.g., __counted_by(field) on a pointer field) are yet to be widely 
> adopted across the Linux codebases, but I’d love to hear your perspective.
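
For concreteness, here is a minimal sketch of the kind of simple identifier case I mean. The `COUNTED_BY` macro and the `int_buf` names are hypothetical, made up for illustration. The sketch annotates a flexible array member rather than a pointer field, on the assumption that more released compilers accept the `counted_by` attribute there, and it expands to nothing when the attribute is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fallback so the sketch still compiles where __has_attribute is missing. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* COUNTED_BY is a hypothetical wrapper macro, not a name from any header. */
#if __has_attribute(counted_by)
#define COUNTED_BY(member) __attribute__((counted_by(member)))
#else
#define COUNTED_BY(member)
#endif

/* The count is a plain identifier naming a sibling member. */
struct int_buf {
    size_t count;
    int data[] COUNTED_BY(count);
};

/* Allocate a zero-filled buffer; the count is set before data is touched. */
static struct int_buf *int_buf_new(size_t count) {
    struct int_buf *b = malloc(sizeof(*b) + count * sizeof(b->data[0]));
    if (b == NULL)
        return NULL;
    b->count = count;
    for (size_t i = 0; i < count; i++)
        b->data[i] = 0;
    return b;
}

/* Manual check of the same bound that the annotation describes. */
static int int_buf_at(const struct int_buf *b, size_t i) {
    assert(i < b->count);
    return b->data[i];
}
```

The count here is a single identifier declared in the same struct, so no expression syntax or new lookup rule is involved.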


Thanks,
Yeoul

> On Jul 28, 2025, at 4:36 PM, Yeoul Na <yeoul...@apple.com> wrote:
> 
> 
> 
>> On Jul 28, 2025, at 2:40 PM, Bill Wendling <mo...@google.com> wrote:
>> 
>> On Mon, Jul 28, 2025 at 1:48 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>>> On Jul 28, 2025, at 16:09, Martin Uecker <ma.uec...@gmail.com> wrote:
>>>> Am Montag, dem 28.07.2025 um 11:18 -0700 schrieb Yeoul Na:
>>>>>> On Jul 28, 2025, at 10:27 AM, Qing Zhao <qing.z...@oracle.com> wrote:
>>>>>>> On Jul 26, 2025, at 12:43, Yeoul Na <yeoul...@apple.com> wrote:
>>>>>>>> On Jul 24, 2025, at 3:52 PM, Kees Cook <k...@kernel.org> wrote:
>>>>>>>> 
>>>>>>>> On Thu, Jul 24, 2025 at 04:26:12PM +0000, Aaron Ballman wrote:
>>>>>>>>> Ah, apologies, I wasn't clear. My thinking is: we're (Clang folks)
>>>>>>>>> going to want it to work in C++ mode because of shared headers. If it
>>>>>>>>> works in C++ mode, then we have to figure out what it means with all
>>>>>>>>> the various C++ features that are possible, not just the use cases
>>>>>>>> 
>>>>>>>> I am most familiar with C, so I may be missing something here, but if
>>>>>>>> -fbounds-safety is intended to be C only, then why not just make it
>>>>>>>> unrecognized in C++?
>>>>>>> 
>>>>>>> The bounds safety annotations must also be parsable in C++. While C++ 
>>>>>>> can get bounds checking by using std::span instead of raw pointers, 
>>>>>>> switching to std::span breaks ABI. Therefore,
>>>>>>> in many situations, C++ code must continue to use raw pointers—for 
>>>>>>> example, when interoperating with C code by sharing headers with C. In 
>>>>>>> such cases, bounds annotations can help close
>>>>>>> safety gaps in raw pointers.
>>>>>> 
>>>>>> The -fbounds-safety feature was initially proposed as a C extension, so 
>>>>>> it’s natural to make it compatible with the C language, not C++.
>>>>>> If C++ also needs such a feature, then an extension to C++ is needed too.
>>>>>> If a consistent syntax for this feature can satisfy both C and C++, 
>>>>>> that will be ideal.
>>>>>> However, if providing such a consistent syntax requires major changes to 
>>>>>> the C language (a new name lookup scope, and late parsing), it might be 
>>>>>> a good idea to provide different syntax for C and C++.
>>>>> 
>>>>> 
>>>>> So the main problem here is when the “same code” will be parsed in both 
>>>>> C and C++, which is quite common in practice.
>>>>> 
>>>>> Therefore, we need a way to reasonably write code that works in both C 
>>>>> and C++.
>>>>> 
>>>>> From my perspective, that means:
>>>>> 
>>>>> 1. The same spelling doesn’t “silently” behave differently in C and C++.
>>>>> 2. At least the most common use cases (i.e., __counted_by(peer)) should 
>>>>> be able to be written the same way in C and C++, without ceremony.
>>>>> 
>>>>> Here is our compromise proposal that meets these requirements, until we 
>>>>> get blessing from the standard for a more elegant solution:
>>>>> 
>>>>> 1. `__counted_by(member)` keeps working as is: late parsing + name lookup 
>>>>> finds the member name first.
>>>>> 2. `__counted_by_expr(expr)` uses a new syntax (e.g., __self), and is not 
>>>>> allowed to use a name that matches the member name without the new syntax 
>>>>> even if that would’ve resolved to a global variable. Use something like 
>>>>> `__global_ref(id)` to disambiguate. This rule will prevent the confusion 
>>>>> where `__counted_by_expr(id)` and `__counted_by(id)` may designate 
>>>>> different entities.
>>>>> 
>>>>> Here are the examples:
>>>>> 
>>>>> Ex 1)
>>>>> constexpr int n = 10;
>>>>> 
>>>>> struct s {
>>>>>  int *__counted_by(n) ptr; // resolves to member `n`, which matches the current behavior
>>>>>  int n;
>>>>> };
>>>>> 
>>>>> Ex 2)
>>>>> constexpr int n = 10;
>>>>> struct s {
>>>>>  int *__counted_by_expr(n) ptr; // error: referring to a member name without `__self.`
>>>>>  int n;
>>>>> };
>>>>> 
>>>>> Ex 3)
>>>>> constexpr int n = 10;
>>>>> struct s {
>>>>>  int *__counted_by_expr(__self.n) ptr; // resolves to member `n`
>>>>>  int n;
>>>>> };
>>>>> 
>>>>> 
>>>>> Ex 4)
>>>>> constexpr int n = 10;
>>>>> struct s {
>>>>>  int *__counted_by_expr(__self.n + 1) ptr; // resolves to member `n`
>>>>>  int n;
>>>>> };
>>>>> 
>>>>> 
>>>>> Ex 5)
>>>>> constexpr int n = 10;
>>>>> struct s {
>>>>>  int *__counted_by_expr(__global_ref(n) + 1) ptr; // resolves to global `n`
>>>>>  int n;
>>>>> };
>>>>> 
>>>>> 
>>>>> Ex 6)
>>>>> constexpr int n = 10;
>>>>> struct s {
>>>>>  int *__counted_by_expr(n + 1) ptr; // resolves to global `n`; okay, no matching member name
>>>>> };
>>>>> 
>>>>> Or, in case people prefer forward declarations inside 
>>>>> `__counted_by_expr()`, a similar rule can apply to achieve the same 
>>>>> goal.
>>>>> 
>>>> 
>>>> Thank you Yeoul!
>>>> 
>>>> I think it is a reasonable compromise.
>> 
>> This was suggested months ago, so sure, seems reasonable. 
> 
> I want to make sure we’re on the same page. I’m not sure which proposal you 
> meant was suggested months ago. This is different from the previous proposals 
> from GCC. The difference is what happens when the unqualified name matches a 
> member:
> 
> constexpr int n = 10;
> struct s {
>   int n;
>   int *__counted_by_expr(n + 1) buf; // error; instead of silently picking up global `n`
> };
> 
> constexpr int n = 10;
> struct s {
>   int *__counted_by_expr(n + 1) buf; // error; instead of silently picking up global `n`
>   int n;
> };
> 
> This restriction was important to meet the requirement that the same code is 
> not interpreted differently in C and C++, or across attributes in the same 
> category (`__counted_by` vs. `__counted_by_expr`).
> 
> Yeoul
> 
>> How do we
>> handle function calls, like the "byte swap" example Kees pointed out?
>> 
>>> Yes, I agree. -:)
>>> 
>>> It adds two new keywords in both C and C++ (__self and __global_ref) to 
>>> explicitly mark the scopes for the variables inside the attribute.
>>> This will definitely resolve the lookup scope ambiguity issue in both C 
>>> and C++.
>>> 
>>> However, it will not resolve the issue when the counted_by field is 
>>> declared after the pointer field.
>>> So forward declarations are still needed to resolve this issue, I think.
>> 
>> I suggest delayed parsing instead.
>> 
>> -bw
