> On Oct 27, 2023, at 3:21 AM, Martin Uecker <uec...@tugraz.at> wrote:
>
> Am Donnerstag, dem 26.10.2023 um 19:57 +0000 schrieb Qing Zhao:
>> I guess that what Kees wanted, ""fill the array without knowing the actual
>> final size" code pattern”, as following:
>>
>>>> struct foo *f;
>>>> char *p;
>>>> int i;
>>>>
>>>> f = alloc(maximum_possible);
>>>> f->count = 0;
>>>> p = f->buf;
>>>>
>>>> for (i; data_is_available() && i < maximum_possible; i++) {
>>>>     f->count ++;
>>>>     p[i] = next_data_item();
>>>> }
>>
>> actually is a dynamic array, or more accurately, Bounded-size dynamic array:
>> ( but not a dynamic allocated array as we discussed so far)
>>
>> https://en.wikipedia.org/wiki/Dynamic_array
>>
>> This dynamic array, also is called growable array, or resizable array, whose
>> size can be changed during the lifetime.
>>
>> For VLA or FAM, I believe that they are both dynamic allocated array, i.e,
>> even though the size is not know at the compilation time, but the size
>> will be fixed after the array is allocated.
>>
>> I am not sure whether C has support to such Dynamic array? Or whether it’s
>> easy to provide dynamic array support in C?
>
> It is possible to support dynamic arrays in C even with
> good checking, but not safely using the pattern above
> where you derive a pointer which you later use independently.
>
> While we could track the connection to the original struct,
> the necessary synchronization between the counter and the
> access to the buffer is difficult. I do not see how this
> could be supported with reasonable effort and cost.
>
>
> But with this restriction in mind, we can do a lot in C.
> For example, see my experimental (!) container library
> which has vector type.
> https://github.com/uecker/noplate/blob/main/test.c
> You can get an array view for the vector (which then
> also can decay to a pointer), so it interoperates nicely
> with C but you can get good bounds checking.
>
>
> But once you derive a pointer and pass it on, it gets
> difficult. But if you want safety, you just have to
> to simply avoid this in code.

So, for the following modified code (without the additional pointer “p”):

struct foo {
    size_t count;
    char buf[] __attribute__((counted_by(count)));
};

struct foo *f;
int i;

f = alloc(maximum_possible);
f->count = 0;

for (i = 0; data_is_available() && i < maximum_possible; i++) {
    f->count++;
    f->buf[i] = next_data_item();
}

Should support for this dynamic array then be possible?

>
> What we could potentially do is add restrictions so
> that the access to buf always has to go via x->buf
> or you get at least a warning.

Are the following two restrictions on the user enough?

1. Access to buf must always go via x->buf: no assigning it to another
   independent pointer and then accessing buf through that pointer.
2. The user must keep the counter and the accesses to the buffer
   synchronized at all times.

Qing

>
> Martin
>
>
>
>
>>
>> Qing
>>
>>
>>> On Oct 26, 2023, at 12:45 PM, Martin Uecker <uec...@tugraz.at> wrote:
>>>
>>> Am Donnerstag, dem 26.10.2023 um 09:13 -0700 schrieb Kees Cook:
>>>> On Thu, Oct 26, 2023 at 10:15:10AM +0200, Martin Uecker wrote:
>>>>> but not this:
>>>>>
>>>>> x->count = 11;
>>>>> char *p = &x->buf;
>>>>> x->count = 1;
>>>>> p[10] = 1; // !
>>>>
>>>> This seems fine to me -- it's how I'd expect it to work: "10" is beyond
>>>> "1".
>>>
>>> Note that the store would be allowed.
>>>
>>>>
>>>>> (because the pointer is passed around the
>>>>> store to the counter)
>>>>>
>>>>> and also here the second store is then irrelevant
>>>>> for the access:
>>>>>
>>>>> x->count = 10;
>>>>> char* p = &x->buf;
>>>>> ...
>>>>> x->count = 1; // somewhere else
>>>>> ----
>>>>> p[9] = 1; // ok, because count matter when buf was accesssed.
>>>>
>>>> This is less great, but I can understand why it happens. "p" loses the
>>>> association with "x". It'd be nice if "p" had to way to retain that it
>>>> was just an alias for x->buf, so future p access would check count.
>>>
>>> The problem is not to discover that p is an alias to x->buf,
>>> but that it seems difficult to make sure that stores to
>>> x->count are not reordered relative to the final access to
>>> p[i] you want to check, so that you then get the right value.
>>>
>>>>
>>>> But this appears to be an existing limitation in other areas where an
>>>> assignment will cause the loss of object association. (I've run into
>>>> this before.) It's just more surprising in the above example because in
>>>> the past the loss of association would cause __bdos() to revert back to
>>>> "SIZE_MAX" results ("I don't know the size") rather than an "outdated"
>>>> size, which may get us into unexpected places...
>>>>
>>>>> IMHO this makes sense also from the user side and
>>>>> are the desirable semantics we discussed before.
>>>>>
>>>>> But can you take a look at this?
>>>>>
>>>>>
>>>>> This should simulate it fairly well:
>>>>> https://godbolt.org/z/xq89aM7Gr
>>>>>
>>>>> (the call to the noinline function would go away,
>>>>> but not necessarily its impact on optimization)
>>>>
>>>> Yeah, this example should be a very rare situation: a leaf function is
>>>> changing the characteristics of the struct but returning a buffer within
>>>> it to the caller. The more likely glitch would be from:
>>>>
>>>> int main()
>>>> {
>>>>     struct foo *f = foo_alloc(7);
>>>>     char *p = FAM_ACCESS(f, size, buf);
>>>>
>>>>     printf("%ld\n", __builtin_dynamic_object_size(p, 0));
>>>>     test1(f); // or just "f->count = 10;" no function call needed
>>>>     printf("%ld\n", __builtin_dynamic_object_size(p, 0));
>>>>
>>>>     return 0;
>>>> }
>>>>
>>>> which reports:
>>>> 7
>>>> 7
>>>>
>>>> instead of:
>>>> 7
>>>> 10
>>>>
>>>> This kind of "get an alias" situation is pretty common in the kernel
>>>> as a way to have a convenient "handle" to the array.
>>>> In the case of a
>>>> "fill the array without knowing the actual final size" code pattern,
>>>> things would immediately break:
>>>>
>>>> struct foo *f;
>>>> char *p;
>>>> int i;
>>>>
>>>> f = alloc(maximum_possible);
>>>> f->count = 0;
>>>> p = f->buf;
>>>>
>>>> for (i; data_is_available() && i < maximum_possible; i++) {
>>>>     f->count ++;
>>>>     p[i] = next_data_item();
>>>> }
>>>>
>>>> Now perhaps the problem here is that "count" cannot be used for a count
>>>> of "logically valid members in the array" but must always be a count of
>>>> "allocated member space in the array", which I guess is tolerable, but
>>>> isn't ideal -- I'd like to catch logic bugs in addition to allocation
>>>> bugs, but the latter is certainly much more important to catch.
>>>
>>> Maybe we could have a warning when f->buf is not directly
>>> accessed.
>>>
>>> Martin
>>>
>>>>
>>>
>>
>
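
For illustration only, here is a minimal sketch of the two restrictions discussed above: every access goes through f->buf, and the counter is raised before the store so the store always falls inside the current bound. It assumes a compiler that accepts the counted_by attribute; the COUNTED_BY macro is my own fallback and expands to nothing elsewhere. The helpers foo_alloc and foo_push and the explicit capacity parameter are hypothetical names made up for this sketch; they are not taken from the patch or from any existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: use counted_by only where the compiler advertises it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct foo {
    size_t count;                  /* number of valid elements in buf */
    char buf[] COUNTED_BY(count);
};

/* Hypothetical helper: reserve room for `capacity` items, none valid yet. */
static struct foo *foo_alloc(size_t capacity)
{
    struct foo *f = malloc(sizeof(*f) + capacity);
    if (f != NULL)
        f->count = 0;
    return f;
}

/* Hypothetical helper: append one item without deriving a separate pointer.
 * Returns 0 on success and -1 when the reserved capacity is exhausted. */
static int foo_push(struct foo *f, size_t capacity, char item)
{
    assert(f != NULL);
    if (f->count >= capacity)
        return -1;
    f->count++;                    /* restriction 2: update the count first */
    f->buf[f->count - 1] = item;   /* restriction 1: access only via f->buf */
    return 0;
}
```

A fill loop would then call foo_push(f, maximum_possible, next_data_item()) while data is available, instead of writing through an alias such as p. Note that the capacity has to be tracked separately here, since count only describes the elements that are currently valid.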