> Am 26.10.2023 um 12:14 schrieb Martin Uecker <uec...@tugraz.at>:
>
> Am Donnerstag, dem 26.10.2023 um 11:20 +0200 schrieb Martin Uecker:
>>> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker <uec...@tugraz.at> wrote:
>>>>
>>>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>>>>>
>>>>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker <uec...@tugraz.at>:
>>>>>>
>>>>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>>>>>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>>>>>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>>>>>>>
>>>>>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker <uec...@tugraz.at>:
>>>>>>>>>>
>>>>>>>>>> Am Dienstag, dem 24.10.2023 um 20:30 +0000 schrieb Qing Zhao:
>>>>>>>>>>> Hi, Sid,
>>>>>>>>>>>
>>>>>>>>>>> Really appreciate for your example and detailed explanation. Very
>>>>>>>>>>> helpful.
>>>>>>>>>>> I think that this example is an excellent example to show (almost)
>>>>>>>>>>> all the issues we need to consider.
>>>>>>>>>>>
>>>>>>>>>>> I slightly modified this example to make it to be compilable and
>>>>>>>>>>> run-able, as following:
>>>>>>>>>>> (but I still cannot make the incorrect reordering or DSE happening,
>>>>>>>>>>> anyway, the potential reordering possibility is there…)
>>>>>>>>>>>
>>>>>>>>>>> 1 #include <malloc.h>
>>>>>>>>>>> 2 struct A
>>>>>>>>>>> 3 {
>>>>>>>>>>> 4 size_t size;
>>>>>>>>>>> 5 char buf[] __attribute__((counted_by(size)));
>>>>>>>>>>> 6 };
>>>>>>>>>>> 7
>>>>>>>>>>> 8 static size_t
>>>>>>>>>>> 9 get_size_from (void *ptr)
>>>>>>>>>>> 10 {
>>>>>>>>>>> 11 return __builtin_dynamic_object_size (ptr, 1);
>>>>>>>>>>> 12 }
>>>>>>>>>>> 13
>>>>>>>>>>> 14 void
>>>>>>>>>>> 15 foo (size_t sz)
>>>>>>>>>>> 16 {
>>>>>>>>>>> 17 struct A *obj = __builtin_malloc (sizeof(struct A) + sz *
>>>>>>>>>>> sizeof(char));
>>>>>>>>>>> 18 obj->size = sz;
>>>>>>>>>>> 19 obj->buf[0] = 2;
>>>>>>>>>>> 20 __builtin_printf (“%d\n", get_size_from (obj->buf));
>>>>>>>>>>> 21 return;
>>>>>>>>>>> 22 }
>>>>>>>>>>> 23
>>>>>>>>>>> 24 int main ()
>>>>>>>>>>> 25 {
>>>>>>>>>>> 26 foo (20);
>>>>>>>>>>> 27 return 0;
>>>>>>>>>>> 28 }
>>>>>>>>>>>
>>>>>>>
>>>>>>> <snip>
>>>>>>>
>>>>>>>>> When it’s set I suppose. Turn
>>>>>>>>>
>>>>>>>>> X.l = n;
>>>>>>>>>
>>>>>>>>> Into
>>>>>>>>>
>>>>>>>>> X.l = __builtin_with_size (x.buf, n);
>>>>>>>>
>>>>>>>> It would turn
>>>>>>>>
>>>>>>>> some_variable = (&) x.buf
>>>>>>>>
>>>>>>>> into
>>>>>>>>
>>>>>>>> some_variable = __builtin_with_size ( (&) x.buf. x.len)
>>>>>>>>
>>>>>>>>
>>>>>>>> So the later access to x.buf and not the initialization
>>>>>>>> of a member of the struct (which is too early).
>>>>>>>>
>>>>>>>
>>>>>>> Hmm, so with Qing's example above, are you suggesting the transformation
>>>>>>> be to foo like so:
>>>>>>>
>>>>>>> 14 void
>>>>>>> 15 foo (size_t sz)
>>>>>>> 16 {
>>>>>>> 16.5 void * _1;
>>>>>>> 17 struct A *obj = __builtin_malloc (sizeof(struct A) + sz *
>>>>>>> sizeof(char));
>>>>>>> 18 obj->size = sz;
>>>>>>> 19 obj->buf[0] = 2;
>>>>>>> 19.5 _1 = __builtin_with_size (obj->buf, obj->size);
>>>>>>> 20 __builtin_printf (“%d\n", get_size_from (_1));
>>>>>>> 21 return;
>>>>>>> 22 }
>>>>>>>
>>>>>>> If yes then this could indeed work. I think I got thrown off by the
>>>>>>> reference to __bdos.
>>>>>>
>>>>>> Yes. I think it is important not to evaluate the size at the
>>>>>> access to buf and not the allocation, because the point is to
>>>>>> recover it from the size member even when the compiler can't
>>>>>> see the original allocation.
>>>>>
>>>>> But if the access is through a pointer without the attribute visible
>>>>> even the Frontend cannot recover?
>>>>
>>>> Yes, if the access is using a struct-with-FAM without the attribute
>>>> the FE would not be insert the builtin. BDOS could potentially
>>>> still see the original allocation but if it doesn't, then there is
>>>> no information.
>>>>
>>>>> We’d need to force type correctness and give up on indirecting
>>>>> through an int * when it can refer to two diffenent container types.
>>>>> The best we can do I think is mark allocation sites and hope for
>>>>> some basic code hygiene (not clobbering size or array pointer
>>>>> through pointers without the appropriately attributed type)
>>>>
>>>> I am do not fully understand what you are referring to.
>>>
>>> struct A { int n; int data[n]; };
>>> struct B { long n; int data[n]; };
>>>
>>> int *p = flag ? a->data : b->data;
>>>
>>> access *p;
>>>
>>> Since we need to allow interoperability of pointers (a->data is
>>> convertible to a non-fat pointer of type int *) this leaves us with
>>> ambiguity we need to conservatively handle to avoid false positives.
>>
>> For BDOS, I would expect this to work exactly like:
>>
>> char aa[n1];
>> char bb[n2];
>> char *p = flag ? aa : bb;
>>
>> (or similar code with malloc). In fact it does:
>>
>> https://godbolt.org/z/bK68YKqhe
>> (cheating a bit and also the sub-object version of
>> BDOS does not seem to work)
>>
>>>
>>> We _might_ want to diagnose decay of a->data to int *, but IIRC
>>> there's no way (or proposal) to allow declaring a corresponding
>>> fat pointer, so it's not a good designed feature.
>>
>> As a language feature, I fully agree. I see the
>> counted_by attribute has a makeshift solution.
>>
>> But we can already do:
>>
>> auto p = flag ? &aa : &bb;
>>
>> and this already works perfectly:
>>
>> https://godbolt.org/z/rvb6xWWPj
>>
>> We can also name the variably-modifed type:
>>
>> char (*p)[flag ? n1 : n2] = flag ? &aa : &bb;
>> https://godbolt.org/z/13cTT1vGP
>>
>> The problem with this version is that consistency
>> is not checked. (I have patch for adding run-time
>> checks).
>>
>> And then the next step would be to allow
>>
>> char (*p)[:] = flag ? &aa : &bb;
>>
>> or similar. Dennis Ritchie proposed this himself
>> a long time ago.
>>
>> So far this seems straightfoward.
>>
>> If we then want to allow such wide pointers as
>> function arguments or in structs, we would need
>> to define an ABI. But the ABI could just be
>>
>> struct { char (*p)[.s]; size_t s; };
>>
>> Maybe we could try to make the following
>> ABI compatible:
>>
>> int foo(int p[s], size_t s);
>> int foo(int p[:]);
>>
>>
>>> Having __builtin_with_size at allocation would possibly make
>>> the BOS use-def walk discover both objects.
>>
>> Yes. But I do not think this there is any fundamental
>> difference to discovering allocation functions.
>>
>>> I think you can't
>>> insert __builtin_with_size at the access to *p, but in practice
>>> that would be very much needed.
>>
>> Usually the access to *p would follow directly the
>> access x.buf, so BDOS should find it.
>>
>> But yes, to get full bounds safety, the pointer type
>> has to change to a variably-modified type (which would work
>> today) or a fat pointer type. The later can be built on
>> vm-types easily because all the FE semantics already
>> exists.
>
> We could insert the __builtin_with_size everywhere
> we have to convert a wide pointer or let an array
> decay to traditional pointer for reason of compatibility
> with legacy code.
That sounds like a nice idea. Note I’d like to see the consumer side
implemented so we can play with different points of insertion (and I’ll try to
show corner cases where it goes wrong). It all seems a bit late for GCC 14
though.
Richard
> Martin
>
>>
>> Martin
>>
>>>
>>> Richard.
>>>
>>>> But yes,
>>>> for full bounds safety we would need the language feature.
>>>> In C people should start to variably-modified types
>>>> more. I think we can build perfect bounds safety on top of
>>>> them in a very good way with only FE changes.
>>>>
>>>> All these attributes are just a best effort. But for a while,
>>>> this will be necessary.
>>>>
>>>> Martin
>>>>
>>>>>
>>>>>> Evaluating at this point requires that the size is correctly set
>>>>>> before the access to the FAM and the user has to make sure
>>>>>> this is the case. But to me this requirement would make sense.
>>>>>>
>>>>>> Semantically, it could aöso make sense to evaluate the size at a
>>>>>> later time. But then the reordering becomes problematic again.
>>>>>>
>>>>>> Also I think this would make this feature generally more useful.
>>>>>> For example, it could work also for others pointers in the struct
>>>>>> and not just for FAMs. In this case, the struct may already be
>>>>>> freed when BDOS is called, so it might also not possible to
>>>>>> access the size member at a later time.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
>