[CC -= Laurent, since it bounces]

On Fri, Jun 20, 2025 at 11:26:55PM +0200, Alejandro Colomar wrote:
> Hi!
>
> After the useful discussion with Eric and Paul, I've rewritten a draft
> of a proposal I had for realloc(3) for C2y.  Here it is (see below).
>
> I'll present it here before presenting it to the C Committee (although
> several members are CCd).
>
> This time, I opted for an all-in-one change that puts us in the end
> goal, since some people were concerned that step-by-step might be less
> feasible.  Also, the wording is more consistent doing this at once, and
> people know what to expect from the beginning.
>
>
> Have a lovely day!
> Alex
>
> ---
> Name
>     alx-0029r1 - Restore the traditional realloc(3) specification
>
> Principles
>     - Uphold the character of the language
>     - Keep the language small and simple
>     - Facilitate portability
>     - Avoid ambiguities
>     - Pay attention to performance
>     - Codify existing practice to address evident deficiencies.
>     - Avoid quiet changes
>     - Enable secure programming
>
> Category
>     Remove UB.
>
> Author
>     Alejandro Colomar <a...@kernel.org>
>
>     Cc: <bug-gnulib@gnu.org>
>     Cc: <m...@lists.openwall.com>
>     Cc: <libc-al...@sourceware.org>
>     Cc: наб <nabijaczlew...@nabijaczleweli.xyz>
>     Cc: Douglas McIlroy <douglas.mcil...@dartmouth.edu>
>     Cc: Paul Eggert <egg...@cs.ucla.edu>
>     Cc: Robert Seacord <rcseac...@gmail.com>
>     Cc: Elliott Hughes <e...@google.com>
>     Cc: Bruno Haible <br...@clisp.org>
>     Cc: JeanHeyd Meneide <phdoftheho...@gmail.com>
>     Cc: Rich Felker <dal...@libc.org>
>     Cc: Adhemerval Zanella Netto <adhemerval.zane...@linaro.org>
>     Cc: Joseph Myers <josmy...@redhat.com>
>     Cc: Florian Weimer <fwei...@redhat.com>
>     Cc: Laurent Bercot <ska-dietl...@skarnet.org>
>     Cc: Andreas Schwab <sch...@suse.de>
>     Cc: Thorsten Glaser <t...@mirbsd.de>
>     Cc: Eric Blake <ebl...@redhat.com>
>     Cc: Vincent Lefevre <vinc...@vinc17.net>
>     Cc: Mark Harris <mark....@gmail.com>
>     Cc: Collin Funk <collin.fu...@gmail.com>
>     Cc: Wilco Dijkstra <wilco.dijks...@arm.com>
>     Cc: DJ Delorie <d...@redhat.com>
>     Cc: Cristian Rodríguez <crist...@rodriguez.im>
>     Cc: Siddhesh Poyarekar <siddh...@gotplt.org>
>     Cc: Sam James <s...@gentoo.org>
>     Cc: Mark Wielaard <m...@klomp.org>
>     Cc: "Maciej W. Rozycki" <ma...@redhat.com>
>     Cc: Martin Uecker <ma.uec...@gmail.com>
>     Cc: Christopher Bazley <chris.bazley.w...@gmail.com>
>     Cc: <es...@obsession.se>
>
> History
>     <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0029.git/>
>
>     r0 (2025-06-17):
>     - Initial draft.
>
>     r1 (2025-06-20):
>     - Full rewrite after the recent glibc discussion.
>
> See also
>     <https://nabijaczleweli.xyz/content/blogn_t/017-malloc0.html>
>     <https://sourceware.org/pipermail/libc-alpha/1999-April/000956.html>
>     <https://inbox.sourceware.org/libc-alpha/20241019014002.3684656-1-siddh...@sourceware.org/T/#u>
>     <https://inbox.sourceware.org/libc-alpha/qukfe5yxycbl5v7ooskvqdnm3au3orohbx4babfltegi47iyly@or6dgf7akeqv/T/#u>
>     <https://github.com/bminor/glibc/commit/7c2b945e1fd64e0a5a4dbd6ae6592a7314dcd4b5>
>     <https://www.austingroupbugs.net/view.php?id=400>
>     <https://www.austingroupbugs.net/view.php?id=526>
>     <https://www.austingroupbugs.net/view.php?id=688>
>     <https://sourceware.org/bugzilla/show_bug.cgi?id=12547>
>     <https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_400.htm>
>     <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n868.htm>
>     <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2438.htm>
>     <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf>
>     <https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/functions/realloc.html>
>     <https://pubs.opengroup.org/onlinepubs/9699919799.2013edition/functions/realloc.html>
>
> Description
>     Let's start by quoting the author of realloc(3).
>
>     On 2024-10-18 05:30, Douglas McIlroy wrote:
>     > The discussion has taken a turn that's astonishing to one who
>     > doesn't know the inside details of real compilers.
>     >
>     > Regardless of the behavior of malloc(0), one expects this
>     > theorem to hold:
>     >
>     >     Given that p = malloc(n) is not NULL,
>     >     that 0<=m<=n,
>     >     and that malloc(m) could in some circumstance
>     >     return a non-null pointer,
>     >     then realloc(p,m) will return a non-null pointer.
>     >
>     > REALLOC_ZERO_BYTES_FREES flies in the face of this rational
>     > expectation about dynamic storage allocation.  A diabolical
>     > invention.
>     >
>     > Doug
>
>     The specification of realloc(3) has been problematic since the
>     very first standards, even before ISO C.
>     The wording has changed significantly, trying to forcibly permit
>     implementations to return a null pointer when the requested size
>     is zero.  This originated from the intent of banning zero-sized
>     objects from the language in C89, but that never worked well in
>     retrospect, as we can see from the fallout.
>
>     None of the specifications have been good, and C23 finally gave
>     up and made it undefined behavior.
>
>     However, this doesn't need to be like that.  The traditional
>     implementation of realloc(3), present in Unix V7, inherited by
>     the BSDs, and currently available in a range of systems,
>     including musl libc, doesn't have any issues.
>
>     Code written for platforms returning a null can be migrated to
>     platforms returning non-null, without significant issues.
>
>     There are two kinds of code that call realloc(p,0).  One
>     hard-codes the 0, and is used as a replacement of free(p).  This
>     code ignores the return value, since it's unimportant.  This
>     code currently produces a leak of 0 bytes plus associated
>     metadata on platforms such as musl libc, where it returns a
>     non-null pointer.  However, assuming that there are programs
>     written with the knowledge that they won't ever be run on such
>     platforms, we should take care of that, and make sure they don't
>     leak.  A way of accomplishing this would be to recommend that
>     implementations issue a diagnostic when realloc(3) is called
>     with a hardcoded zero.  This is only an informal recommendation
>     made by this proposal, as this is a matter of QoI, and the
>     standard shouldn't say anything about it.  This would prevent
>     this class of minor leaks.
>
>     Moreover, in glibc, realloc(p,0) may return non-null, in the
>     case where p is NULL, so code must already take that into
>     account, and thus code that simply takes realloc(p,0) as a
>     synonym of free(p) is already leaky, as free(NULL) is a no-op,
>     but realloc(NULL,0) allocates 0 bytes.
>
>     The other kind of code is in algorithms that realloc(3) an
>     arbitrary size, which might eventually be zero.  This gets more
>     complex.
>
>     Here's the code that should be written for AIX or glibc:
>
>         errno = 0;
>         new = realloc(old, size);
>         if (new == NULL) {
>             if (errno == ENOMEM)
>                 free(old);
>             goto fail;
>         }
>         ...
>         free(new);
>
>     Failing to check for ENOMEM on these platforms before freeing
>     the old pointer would result in a double-free.  If the program
>     decides to continue using the old pointer instead of freeing it,
>     it would result in a use-after-free.
>
>     On the platforms where realloc(p,0) returns non-null, such as
>     the BSDs or musl libc, it is simpler to handle it:
>
>         new = realloc(old, size);
>         if (new == NULL) {  // errno is ENOMEM
>             free(old);
>             goto fail;
>         }
>         ...
>         free(new);
>
>     Whenever the result is a null pointer, these platforms are
>     reporting an ENOMEM error, and thus it is superfluous to check
>     errno there.
>
>     Most code is written in this way, even if run on platforms
>     returning a null pointer.  This is because most programmers are
>     just unaware of this problem.
>
>     If the realloc(3) specification were changed to require that
>     realloc(p,0) returns non-null on success, and that realloc(p,0)
>     only fails when out-of-memory, and to require that it sets
>     errno to ENOMEM, then code written for AIX or glibc would
>     continue working just fine, since the errno check would be
>     redundant with the null check.  Simply, the conditional
>     (errno == ENOMEM) would always be true when (new == NULL).
>
>     This makes handling of realloc(3) as straightforward as one
>     would expect, with only two states: success or error.
>
>     The resulting wording in the standard is also much simpler, as
>     it doesn't need to define so many special cases.
>
>     For consistency, all the other allocation functions are updated
>     to both return an .
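
To make the two-state claim above concrete, here is a minimal sketch of a resize helper that behaves the same on both families of platforms. The name resize_buf() and its return convention are hypothetical, chosen only for illustration; they are not part of the proposal. It assumes, as the quoted AIX/glibc snippet does, that a real allocation failure sets errno to ENOMEM, and it treats a null return without ENOMEM as "the old object was freed".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * resize_buf() is a hypothetical helper, not part of the proposal.
 * Returns 0 on success and -1 on a real allocation failure.
 * On failure, *pp still points to the old, still-valid object.
 */
static int
resize_buf(void **pp, size_t size)
{
	void  *tmp;

	assert(pp != NULL);

	errno = 0;
	tmp = realloc(*pp, size);
	if (tmp == NULL) {
		if (errno == ENOMEM)
			return -1;  /* Old object untouched; caller still owns it. */

		/*
		 * Null without ENOMEM: assumed to be the AIX/glibc case where
		 * realloc(p, 0) freed the old object.  Record that, so the
		 * caller cannot double-free or use it after free.
		 */
		*pp = NULL;
		return 0;
	}

	*pp = tmp;
	return 0;
}
```

Under the proposed specification the inner errno test would always be true when tmp is null, so the second branch would become dead code and the helper would collapse to the simpler BSD/musl form.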
>
> Prior art
>     gnulib
>         gnulib provides the realloc-posix module, which aims to wrap the
>         system realloc(3) and reallocarray(3) functions so that they
>         behave in a POSIX-complying manner.
>
>         It previously behaved like glibc.  After I reported that it was
>         non-conforming to POSIX, we discussed the best way forward,
>         which we agreed was the same direction that this paper is
>         proposing now for C2y.  The implementation was changed in
>
>             gnulib.git d884e6fc4a60 (2024-11-04; "realloc-posix: realloc (..., 0) now returns nonnull")
>
>         There have been no regression reports since then, as we
>         expected.
>
>     Unix V7
>         The proposed behavior is the one endorsed by Doug McIlroy, the
>         author of the original implementation of realloc(3) in Unix V7,
>         and also present in the BSDs.
>
> Design decisions
>     This proposal needs three changes, which can be applied all at
>     once, or in separate steps.
>
>     The first step would make realloc(p,s) be consistent with
>     free(p) and malloc(s), including when p is a null pointer, when
>     s is zero, and also when both corner cases happen at the same
>     time.  This change would already turn the implementations where
>     malloc(0) returns non-null into the end goal we have.
>
>     The first step would require changes to (at least) the following
>     implementations: glibc, Bionic, Windows.
>
>     The second step would be to require that malloc(0) returns a
>     non-null pointer.
>
>     The second step would require changes to (at least) the
>     following implementations: AIX.
>
>     The third step would be to require that on error, errno is set
>     to ENOMEM.
>
>     This proposal has merged all steps into a single proposal.
>
>     This proposal also needs to add ENOMEM to the standard, since it
>     hasn't been standardized yet.
>
> Future directions
>     This proposal, by specifying realloc(3) as-if by calling
>     free(3) and malloc(3), makes several mentions of realloc(3)
>     next to either free(3) or malloc(3) in the standard redundant.
>     We could remove them in this proposal, or clean up that in a
>     separate (mostly editorial) proposal.  Let's keep it for a
>     future proposal for now.
>
> Caveats
>     Code written today should be careful, in case it can run on
>     older systems that are not fixed to comply with this stricter
>     specification.  Thus, code written today should call realloc(3)
>     similar to this:
>
>         realloc(p, n?n:1);
>
>     When all existing implementations are fixed to comply with this
>     stricter specification, that workaround can be removed.
>
> Proposed wording
>     Based on N3550.
>
>     7.5  Errors <errno.h>
>     ## Add ENOMEM in p2.
>
>     7.25.4.1  Memory management functions :: General
>     @@ p1
>      ...
>      If the size of the space requested is zero,
>     -the behavior is implementation-defined:
>     -either
>     -a null pointer is returned to indicate the error,
>     -or
>      the behavior is as if the size were some nonzero value,
>      except that the returned pointer shall not be used
>      to access an object.
>
>     7.25.4.2  The aligned_alloc function
>     @@ Returns, p3
>      The <b>aligned_alloc</b> function returns
>     -either
>     -a null pointer
>     -or
>     -a pointer to the allocated space.
>     +a pointer to the allocated space
>     +on success.
>     +If
>     +the space cannot be allocated,
>     +a null pointer is returned,
>     +and the value of the macro <b>ENOMEM</b>
>     +is stored in <b>errno</b>.
>
>     7.25.4.3  The calloc function
>     @@ Returns, p3
>      The <b>calloc</b> function returns
>     -either
>      a pointer to the allocated space
>     +on success.
>     -or a null pointer
>     -if
>     +If
>      the space cannot be allocated
>      or if the product <tt>nmemb * size</tt>
>     -would wraparound <b>size_t</b>.
>     +would wraparound <b>size_t</b>,
>     +a null pointer is returned,
>     +and the value of the macro <b>ENOMEM</b>
>     +is stored in <b>errno</b>.
>
>     7.25.4.7  The malloc function
>     @@ Returns, p3
>      The <b>malloc</b> function returns
>     -either
>     -a null pointer
>     -or
>     -a pointer to the allocated space.
>     +a pointer to the allocated space
>     +on success.
>     +If
>     +the space cannot be allocated,
>     +a null pointer is returned,
>     +and the value of the macro <b>ENOMEM</b>
>     +is stored in <b>errno</b>.
>
>     7.25.4.8  The realloc function
>     @@ Description, p2
>      The <b>realloc</b> function
>      deallocates the old object pointed to by <tt>ptr</tt>
>     +as if by a call to <b>free</b>,
>      and returns a pointer to a new object
>     -that has the size specified by <tt>size</tt>.
>     +that has the size specified by <tt>size</tt>
>     +as if by a call to <b>malloc</b>.
>      The contents of the new object
>      shall be the same as that of the old object prior to deallocation,
>      up to the lesser of the new and old sizes.
>      Any bytes in the new object
>      beyond the size of the old object
>      have unspecified values.
>
>     @@ p3
>      If <tt>ptr</tt> is a null pointer,
>      the <b>realloc</b> function behaves
>      like the <b>malloc</b> function for the specified size.
>      Otherwise,
>      if <tt>ptr</tt> does not match a pointer
>      earlier returned by a memory management function,
>      or
>      if the space has been deallocated
>      by a call to the <b>free</b> or <b>realloc</b> function,
>     -or
>     -if the size is zero,
>     ## We're defining the behavior.
>      the behavior is undefined.
>      If
>     -memory for the new object is not allocated,
>     +the space cannot be allocated,
>     ## Editorial; for consistency with the wording of the other functions.
>      the old object is not deallocated
>      and its value is unchanged.
>
>     @@ Returns, p4
>      The <b>realloc</b> function returns
>      a pointer to the new object
>      (which can have the same value
>     -as a pointer to the old object),
>     +as a pointer to the old object)
>     +on success.
>     -or
>     +If
>     +space cannot be allocated,
>      a null pointer
>     +is returned
>     +and the value of the macro <b>ENOMEM</b>
>     +is stored in <b>errno</b>.
>
> --
> <https://www.alejandro-colomar.es/>
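
For readers who want to apply the workaround from the Caveats section today, it can be wrapped in a small function so that it is easy to find and remove later. The name realloc_nz() is hypothetical and used only for illustration. This is a sketch, assuming the implementation's malloc(1) can succeed; it never passes a size of zero to realloc(3), so a null return can only mean allocation failure on every platform discussed above.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * realloc_nz() is a hypothetical wrapper around the Caveats workaround:
 * it rounds a zero size up to one byte so that realloc(3) is never
 * called with size 0.  A null return then always means failure, and
 * the old object is still valid and owned by the caller.
 */
static void *
realloc_nz(void *p, size_t n)
{
	return realloc(p, n ? n : 1);
}
```

A caller would use it exactly like realloc(3), for example q = realloc_nz(p, n); followed by the plain two-state null check, with no errno inspection needed.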
-- <https://www.alejandro-colomar.es/>