Am Donnerstag, dem 26.10.2023 um 16:41 +0000 schrieb Qing Zhao: > > > On Oct 26, 2023, at 5:20 AM, Martin Uecker <uec...@tugraz.at> wrote: > > > > Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener: > > > On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker <uec...@tugraz.at> wrote: > > > > > > > > Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener: > > > > > > > > > > > Am 25.10.2023 um 12:47 schrieb Martin Uecker <uec...@tugraz.at>: > > > > > > > > > > > > Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh > > > > > > Poyarekar: > > > > > > > > On 2023-10-25 04:16, Martin Uecker wrote: > > > > > > > > Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard > > > > > > > > Biener: > > > > > > > > > > > > > > > > > > > Am 24.10.2023 um 22:38 schrieb Martin Uecker > > > > > > > > > > <uec...@tugraz.at>: > > > > > > > > > > > > > > > > > > > > Am Dienstag, dem 24.10.2023 um 20:30 +0000 schrieb Qing > > > > > > > > > > Zhao: > > > > > > > > > > > Hi, Sid, > > > > > > > > > > > > > > > > > > > > > > Really appreciate for your example and detailed > > > > > > > > > > > explanation. Very helpful. > > > > > > > > > > > I think that this example is an excellent example to show > > > > > > > > > > > (almost) all the issues we need to consider. > > > > > > > > > > > > > > > > > > > > > > I slightly modified this example to make it to be > > > > > > > > > > > compilable and run-able, as following: > > > > > > > > > > > (but I still cannot make the incorrect reordering or DSE > > > > > > > > > > > happening, anyway, the potential reordering possibility > > > > > > > > > > > is there…) > > > > > > > > > > > > > > > > > > > > > > 1 #include <malloc.h> > > > > > > > > > > > 2 struct A > > > > > > > > > > > 3 { > > > > > > > > > > > 4 size_t size; > > > > > > > > > > > 5 char buf[] __attribute__((counted_by(size))); > > > > > > > > > > > 6 }; > > > > > > > > > > > 7 > > > > > > > > > > > 8 static size_t > > > > > > > > > > > 9 get_size_from (void *ptr) > > > > > > > > > > > 10 { > > > > > > > > > > > 11 return __builtin_dynamic_object_size (ptr, 1); > > > > > > > > > > > 12 } > > > > > > > > > > > 13 > > > > > > > > > > > 14 void > > > > > > > > > > > 15 foo (size_t sz) > > > > > > > > > > > 16 { > > > > > > > > > > > 17 struct A *obj = __builtin_malloc (sizeof(struct A) + > > > > > > > > > > > sz * sizeof(char)); > > > > > > > > > > > 18 obj->size = sz; > > > > > > > > > > > 19 obj->buf[0] = 2; > > > > > > > > > > > 20 __builtin_printf (“%d\n", get_size_from (obj->buf)); > > > > > > > > > > > 21 return; > > > > > > > > > > > 22 } > > > > > > > > > > > 23 > > > > > > > > > > > 24 int main () > > > > > > > > > > > 25 { > > > > > > > > > > > 26 foo (20); > > > > > > > > > > > 27 return 0; > > > > > > > > > > > 28 } > > > > > > > > > > > > > > > > > > > > > > > > > <snip> > > > > > > > > > > > > > > > > When it’s set I suppose. Turn > > > > > > > > > > > > > > > > > > X.l = n; > > > > > > > > > > > > > > > > > > Into > > > > > > > > > > > > > > > > > > X.l = __builtin_with_size (x.buf, n); > > > > > > > > > > > > > > > > It would turn > > > > > > > > > > > > > > > > some_variable = (&) x.buf > > > > > > > > > > > > > > > > into > > > > > > > > > > > > > > > > some_variable = __builtin_with_size ( (&) x.buf. x.len) > > > > > > > > > > > > > > > > > > > > > > > > So the later access to x.buf and not the initialization > > > > > > > > of a member of the struct (which is too early). > > > > > > > > > > > > > > > > > > > > > > Hmm, so with Qing's example above, are you suggesting the > > > > > > > transformation > > > > > > > be to foo like so: > > > > > > > > > > > > > > 14 void > > > > > > > 15 foo (size_t sz) > > > > > > > 16 { > > > > > > > 16.5 void * _1; > > > > > > > 17 struct A *obj = __builtin_malloc (sizeof(struct A) + sz * > > > > > > > sizeof(char)); > > > > > > > 18 obj->size = sz; > > > > > > > 19 obj->buf[0] = 2; > > > > > > > 19.5 _1 = __builtin_with_size (obj->buf, obj->size); > > > > > > > 20 __builtin_printf (“%d\n", get_size_from (_1)); > > > > > > > 21 return; > > > > > > > 22 } > > > > > > > > > > > > > > If yes then this could indeed work. I think I got thrown off by > > > > > > > the > > > > > > > reference to __bdos. > > > > > > > > > > > > Yes. I think it is important not to evaluate the size at the > > > > > > access to buf and not the allocation, because the point is to > > > > > > recover it from the size member even when the compiler can't > > > > > > see the original allocation. > > > > > > > > > > But if the access is through a pointer without the attribute visible > > > > > even the Frontend cannot recover? > > > > > > > > Yes, if the access is using a struct-with-FAM without the attribute > > > > the FE would not be insert the builtin. BDOS could potentially > > > > still see the original allocation but if it doesn't, then there is > > > > no information. > > > > > > > > > We’d need to force type correctness and give up on indirecting > > > > > through an int * when it can refer to two diffenent container types. > > > > > The best we can do I think is mark allocation sites and hope for > > > > > some basic code hygiene (not clobbering size or array pointer > > > > > through pointers without the appropriately attributed type) > > > > > > > > I am do not fully understand what you are referring to. > > > > > > struct A { int n; int data[n]; }; > > > struct B { long n; int data[n]; }; > > > > > > int *p = flag ? a->data : b->data; > > > > > > access *p; > > > > > > Since we need to allow interoperability of pointers (a->data is > > > convertible to a non-fat pointer of type int *) this leaves us with > > > ambiguity we need to conservatively handle to avoid false positives. > > > > For BDOS, I would expect this to work exactly like: > > > > char aa[n1]; > > char bb[n2]; > > char *p = flag ? aa : bb; > > > > (or similar code with malloc). In fact it does: > > > > https://godbolt.org/z/bK68YKqhe > > (cheating a bit and also the sub-object version of > > BDOS does not seem to work) > > > > > > > > We _might_ want to diagnose decay of a->data to int *, but IIRC > > > there's no way (or proposal) to allow declaring a corresponding > > > fat pointer, so it's not a good designed feature. > > > > As a language feature, I fully agree. I see the > > counted_by attribute has a makeshift solution. > > The “counted_by” attribute is necessary at this moment since > it will be much easier to be adopted by the existing source code, > for example, the Linux Kernel.
Yes, this is understood. > > Though I agree that embedding the bound information into TYPE > system should be the ultimate goal. > > > > > But we can already do: > > > > auto p = flag ? &aa : &bb; > > > > and this already works perfectly: > > > > https://godbolt.org/z/rvb6xWWPj > > > > We can also name the variably-modifed type: > > > > char (*p)[flag ? n1 : n2] = flag ? &aa : &bb; > > https://godbolt.org/z/13cTT1vGP > > > > The problem with this version is that consistency > > is not checked. (I have patch for adding run-time > > checks). > > > > And then the next step would be to allow > > > > char (*p)[:] = flag ? &aa : &bb; > > > > or similar. Dennis Ritchie proposed this himself > > a long time ago. > > > > So far this seems straightfoward. > > > > If we then want to allow such wide pointers as > > function arguments or in structs, we would need > > to define an ABI. But the ABI could just be > > > > struct { char (*p)[.s]; size_t s; }; > > > > Maybe we could try to make the following > > ABI compatible: > > > > int foo(int p[s], size_t s); > > int foo(int p[:]); > > > > > > > Having __builtin_with_size at allocation would possibly make > > > the BOS use-def walk discover both objects. > > > > Yes. But I do not think this there is any fundamental > > difference to discovering allocation functions. > > > > > I think you can't > > > insert __builtin_with_size at the access to *p, but in practice > > > that would be very much needed. > > > > Usually the access to *p would follow directly the > > access x.buf, so BDOS should find it. > > > > But yes, to get full bounds safety, the pointer type > > has to change to a variably-modified type (which would work > > today) or a fat pointer type. > > By variable-modified type, you mean the VLA? I mean a pointer to a VLA type. > > There is one major difference between VLA and (FAM or Pointer array): > > For VLA, the compiler is responsible for allocating the memory for it, > the size assignment and the memory allocation are both done by the > compiler at the same time and tied together. A VLA can also exist on the heap: char (*buf)[n] = malloc(sizeof(*buf)); > > But for FAM and pointer arrays, right now, users allocate the memory for them > In the source code, so, when we add the “counted_by” attribute, we need to > specify the additional requirement for the order of size assignment and memory > allocation into the source code, and specify this requirement in the user > documentation. > > Later, if we try to make the bound information of FAM/pointer array into TYPE > system, similar as the current VLA, should we also need to move the memory > allocation > of the FAM/pointer arrays into compiler (similar as VLA too)? I think memory allocation can be done either as an automatic variable or by malloc. The following works today in GNU C: int N = ..; struct foo { char buf[N]; } x; struct foo *p = malloc(sizeof(struct foo)); The only limitation today is that the size 'n' can not refer to the field member. struct foo { int n; char buf[.n]; }; I am not yet sure how we would set the size for an automatic object, but I have some ideas. Maybe simply using an initializer: struct foo x = { .n = 10 }; Martin > > The later can be built on > > vm-types easily because all the FE semantics already > > exists. > > Except the memory allocation part… > > Do I miss anything here? > > Qing > > > > Martin > > > > > > > > Richard. > > > > > > > But yes, > > > > for full bounds safety we would need the language feature. > > > > In C people should start to variably-modified types > > > > more. I think we can build perfect bounds safety on top of > > > > them in a very good way with only FE changes. > > > > > > > > All these attributes are just a best effort. But for a while, > > > > this will be necessary. > > > > > > > > Martin > > > > > > > > > > > > > > > Evaluating at this point requires that the size is correctly set > > > > > > before the access to the FAM and the user has to make sure > > > > > > this is the case. But to me this requirement would make sense. > > > > > > > > > > > > Semantically, it could aöso make sense to evaluate the size at a > > > > > > later time. But then the reordering becomes problematic again. > > > > > > > > > > > > Also I think this would make this feature generally more useful. > > > > > > For example, it could work also for others pointers in the struct > > > > > > and not just for FAMs. In this case, the struct may already be > > > > > > freed when BDOS is called, so it might also not possible to > > > > > > access the size member at a later time. > > > > > > > > > > > > Martin > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >