[CC -= Laurent, since it bounces]

On Fri, Jun 20, 2025 at 11:26:55PM +0200, Alejandro Colomar wrote:
> Hi!
> 
> After the useful discussion with Eric and Paul, I've rewritten a draft
> of a proposal I had for realloc(3) for C2y.  Here it is (see below).
> 
> I'll present it here before presenting it to the C Committee (although
> several members are CCd).
> 
> This time, I opted for an all-in-one change that puts us in the end
> goal, since some people were concerned that step-by-step might be less
> feasible.  Also, the wording is more consistent doing this at once, and
> people know what to expect from the beginning.
> 
> 
> Have a lovely day!
> Alex
> 
> ---
> Name
>       alx-0029r1 - Restore the traditional realloc(3) specification
> 
> Principles
>       -  Uphold the character of the language
>       -  Keep the language small and simple
>       -  Facilitate portability
>       -  Avoid ambiguities
>       -  Pay attention to performance
>       -  Codify existing practice to address evident deficiencies.
>       -  Avoid quiet changes
>       -  Enable secure programming
> 
> Category
>       Remove UB.
> 
> Author
>       Alejandro Colomar <a...@kernel.org>
> 
>       Cc: <bug-gnulib@gnu.org>
>       Cc: <m...@lists.openwall.com>
>       Cc: <libc-al...@sourceware.org>
>       Cc: наб <nabijaczlew...@nabijaczleweli.xyz>
>       Cc: Douglas McIlroy <douglas.mcil...@dartmouth.edu>
>       Cc: Paul Eggert <egg...@cs.ucla.edu>
>       Cc: Robert Seacord <rcseac...@gmail.com>
>       Cc: Elliott Hughes <e...@google.com>
>       Cc: Bruno Haible <br...@clisp.org>
>       Cc: JeanHeyd Meneide <phdoftheho...@gmail.com>
>       Cc: Rich Felker <dal...@libc.org>
>       Cc: Adhemerval Zanella Netto <adhemerval.zane...@linaro.org>
>       Cc: Joseph Myers <josmy...@redhat.com>
>       Cc: Florian Weimer <fwei...@redhat.com>
>       Cc: Laurent Bercot <ska-dietl...@skarnet.org>
>       Cc: Andreas Schwab <sch...@suse.de>
>       Cc: Thorsten Glaser <t...@mirbsd.de>
>       Cc: Eric Blake <ebl...@redhat.com>
>       Cc: Vincent Lefevre <vinc...@vinc17.net>
>       Cc: Mark Harris <mark....@gmail.com>
>       Cc: Collin Funk <collin.fu...@gmail.com>
>       Cc: Wilco Dijkstra <wilco.dijks...@arm.com>
>       Cc: DJ Delorie <d...@redhat.com>
>       Cc: Cristian Rodríguez <crist...@rodriguez.im>
>       Cc: Siddhesh Poyarekar <siddh...@gotplt.org>
>       Cc: Sam James <s...@gentoo.org>
>       Cc: Mark Wielaard <m...@klomp.org>
>       Cc: "Maciej W. Rozycki" <ma...@redhat.com>
>       Cc: Martin Uecker <ma.uec...@gmail.com>
>       Cc: Christopher Bazley <chris.bazley.w...@gmail.com>
>       Cc: <es...@obsession.se>
> 
> History
>       <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0029.git/>
> 
>       r0 (2025-06-17):
>       -  Initial draft.
> 
>       r1 (2025-06-20):
>       -  Full rewrite after the recent glibc discussion.
> 
> See also
>       <https://nabijaczleweli.xyz/content/blogn_t/017-malloc0.html>
>       <https://sourceware.org/pipermail/libc-alpha/1999-April/000956.html>
>       <https://inbox.sourceware.org/libc-alpha/20241019014002.3684656-1-siddh...@sourceware.org/T/#u>
>       <https://inbox.sourceware.org/libc-alpha/qukfe5yxycbl5v7ooskvqdnm3au3orohbx4babfltegi47iyly@or6dgf7akeqv/T/#u>
>       <https://github.com/bminor/glibc/commit/7c2b945e1fd64e0a5a4dbd6ae6592a7314dcd4b5>
>       <https://www.austingroupbugs.net/view.php?id=400>
>       <https://www.austingroupbugs.net/view.php?id=526>
>       <https://www.austingroupbugs.net/view.php?id=688>
>       <https://sourceware.org/bugzilla/show_bug.cgi?id=12547>
>       <https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_400.htm>
>       <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n868.htm>
>       <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2438.htm>
>       <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf>
>       <https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/functions/realloc.html>
>       <https://pubs.opengroup.org/onlinepubs/9699919799.2013edition/functions/realloc.html>
> 
> Description
>       Let's start by quoting the author of realloc(3).
> 
>       On 2024-10-18 05:30, Douglas McIlroy wrote:
>       > The discussion has taken a turn that's astonishing to one who
>       > doesn't know the inside details of real compilers.
>       >
>       > Regardless of the behavior of malloc(0), one expects this
>       > theorem to hold:
>       >
>       >       Given that p = malloc(n) is not NULL,
>       >       that 0<=m<=n,
>       >       and that malloc(m) could in some circumstance
>       >       return a non-null pointer,
>       >       then realloc(p,m) will return a non-null pointer.
>       >
>       > REALLOC_ZERO_BYTES_FREES flies in the face of this rational
>       > expectation about dynamic storage allocation.  A diabolical
>       > invention.
>       >
>       > Doug
> 
>       The specification of realloc(3) has been problematic since the
>       very first standards, even before ISO C.  The wording has
>       changed significantly, in repeated attempts to permit
>       implementations to return a null pointer when the requested
>       size is zero.  This originated from the intent of banning
>       zero-sized objects from the language in C89, but in retrospect
>       that never worked well, as we can see from the fallout.
> 
>       None of the specifications have been good, and C23 finally gave
>       up and made it undefined behavior.
> 
>       However, this doesn't need to be like that.  The traditional
>       implementation of realloc(3), present in Unix V7, inherited by
>       the BSDs, and currently available in a range of systems, including
>       musl libc, doesn't have any issues.
> 
>       Code written for platforms returning a null can be migrated to
>       platforms returning non-null, without significant issues.
> 
>       There are two kinds of code that call realloc(p,0).  One
>       hard-codes the 0, and is used as a replacement of free(p).  This
>       code ignores the return value, since it's unimportant.  This
>       code currently produces a leak of 0 bytes plus associated
>       metadata on platforms such as musl libc, where it returns a
>       non-null pointer.  However, assuming that there are programs
>       written with the knowledge that they won't ever be run on such
>       platforms, we should take care of that, and make sure they don't
>       leak.  A way of accomplishing this would be to recommend that
>       implementations issue a diagnostic when realloc(3) is called
>       with a hardcoded zero.  This is only an informal recommendation
>       made by this proposal, as this is a matter of QoI, and the
>       standard shouldn't say anything about it.  This would prevent
>       this class of minor leaks.
> 
>       Moreover, in glibc, realloc(p,0) may return non-null, in the
>       case where p is NULL, so code must already take that into
>       account, and thus code that simply takes realloc(p,0) as a
>       synonym of free(p) is already leaky, as free(NULL) is a no-op,
>       but realloc(NULL,0) allocates 0 bytes.
> 
>       The other kind of code is in algorithms that realloc(3) an
>       arbitrary size, which might eventually be zero.  This gets more
>       complex.
> 
>       Here's the code that should be written for AIX or glibc:
> 
>               errno = 0;
>               new = realloc(old, size);
>               if (new == NULL) {
>                       if (errno == ENOMEM)
>                               free(old);
>                       goto fail;
>               }
>               ...
>               free(new);
> 
>       Failing to check for ENOMEM on these platforms before freeing
>       the old pointer would result in a double-free.  If the program
>       decides to continue using the old pointer instead of freeing it,
>       it would result in a use-after-free.
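
A minimal sketch of that pattern as a self-contained function, for illustration only.  The name resize_buf() and its pointer-to-pointer interface are hypothetical and not part of the proposal.  It assumes, as the text above does, that a null return without ENOMEM means the old block has already been released.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the proposal): resize *bufp to
 * size bytes.  Returns 0 on success, and -1 when realloc(3) returns
 * a null pointer.
 */
static int
resize_buf(void **bufp, size_t size)
{
        void  *new;

        assert(bufp != NULL);

        errno = 0;
        new = realloc(*bufp, size);
        if (new == NULL) {
                /*
                 * ENOMEM: the old block is still valid; keep it in *bufp.
                 * Otherwise: assume the old block was already released
                 * (zero size on AIX or glibc), so drop the stale pointer
                 * to prevent a double-free or use-after-free.
                 */
                if (errno != ENOMEM)
                        *bufp = NULL;
                return -1;
        }
        *bufp = new;
        return 0;
}
```

After a -1 return the caller can always call free(*bufp): it either frees the still-valid old block or is a no-op on NULL.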
> 
>       On the platforms where realloc(p,0) returns non-null, such as
>       the BSDs or musl libc, it is simpler to handle it:
> 
>               new = realloc(old, size);
>               if (new == NULL) {  // errno is ENOMEM
>                       free(old);
>                       goto fail;
>               }
>               ...
>               free(new);
> 
>       Whenever the result is a null pointer, these platforms are
>       reporting an ENOMEM error, and thus it is superfluous to check
>       errno there.
> 
>       Most code is written in this way, even if run on platforms
>       returning a null pointer.  This is because most programmers are
>       just unaware of this problem.
> 
>       If the realloc(3) specification were changed to require that
>       realloc(p,0) returns non-null on success, and that realloc(p,0)
>       only fails when out-of-memory, and to require that it sets
>       errno to ENOMEM, then code written for AIX or glibc would
>       continue working just fine, since the errno check would be
>       redundant with the null check.  Simply, the conditional
>       (errno == ENOMEM) would always be true when (new == NULL).
> 
>       This makes handling of realloc(3) as straightforward as one
>       would expect, with only two states: success or error.
> 
>       The resulting wording in the standard is also much simpler, as
>       it doesn't need to define so many special cases.
> 
>       For consistency, all the other allocation functions are updated
>       to both return a null pointer and set errno to ENOMEM on failure.
> 
> Prior art
>     gnulib
>       gnulib provides the realloc-posix module, which aims to wrap the
>       system realloc(3) and reallocarray(3) functions so that they
>       behave in a POSIX-complying manner.
> 
>       It previously behaved like glibc.  After I reported that it was
>       non-conforming to POSIX, we discussed the best way forward,
>       which we agreed was the same direction that this paper is
>       proposing now for C2y.  The implementation was changed in
> 
>               gnulib.git d884e6fc4a60 (2024-11-04;
>               "realloc-posix: realloc (..., 0) now returns nonnull")
> 
>       There have been no regression reports since then, as we
>       expected.
> 
>     Unix V7
>       The proposed behavior is the one endorsed by Doug McIlroy, the
>       author of the original implementation of realloc(3) in Unix V7,
>       and also present in the BSDs.
> 
> Design decisions
>       This change consists of three steps, which can be applied all at
>       once, or separately.
> 
>       The first step would make realloc(p,s) be consistent with
>       free(p) and malloc(s), including when p is a null pointer, when
>       s is zero, and also when both corner cases happen at the same
>       time.  This change would already turn the implementations where
>       malloc(0) returns non-null into the end goal we have.
> 
>       The first step would require changes to (at least) the following
>       implementations: glibc, Bionic, Windows.
> 
>       The second step would be to require that malloc(0) returns a
>       non-null pointer.
> 
>       The second step would require changes to (at least) the
>       following implementations: AIX.
> 
>       The third step would be to require that on error, errno is set
>       to ENOMEM.
> 
>       This proposal has merged all steps into a single proposal.
> 
>       This proposal also needs to add ENOMEM to the standard, since it
>       hasn't been standardized yet.
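
The combined effect of the three steps can be modeled in portable C, as a rough sketch.  The name realloc_model() and the explicit old_size parameter are hypothetical (portable code cannot query the size of an allocation), and the model always allocates a fresh block, whereas a real realloc(3) may resize in place.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical model of the proposed semantics, not an implementation.
 * old_size must be the size of the object pointed to by ptr (0 if ptr
 * is a null pointer).
 */
static void *
realloc_model(void *ptr, size_t old_size, size_t size)
{
        void  *new;

        assert(ptr != NULL || old_size == 0);

        /* Step 2: a zero-size request yields a non-null pointer;
           emulated here by asking malloc(3) for one byte. */
        new = malloc(size ? size : 1);
        if (new == NULL) {
                errno = ENOMEM;  /* Step 3: report the error in errno. */
                return NULL;     /* The old object is left untouched. */
        }

        /* Step 1: behave as malloc(size) followed by free(ptr),
           preserving the contents up to the lesser of both sizes. */
        if (ptr != NULL) {
                memcpy(new, ptr, old_size < size ? old_size : size);
                free(ptr);
        }
        return new;
}
```

In this model a null return has exactly one meaning (out of memory, old object intact), which is the two-state behavior described earlier.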
> 
> Future directions
>       This proposal, by specifying realloc(3) as-if by calling
>       free(3) and malloc(3), makes several mentions of realloc(3)
>       next to either free(3) or malloc(3) in the standard redundant.
>       We could remove them in this proposal, or clean that up in a
>       separate (mostly editorial) proposal.  Let's keep it for a
>       future proposal for now.
> 
> Caveats
>       Code written today should be careful, in case it can run on
>       older systems that are not fixed to comply with this stricter
>       specification.  Thus, code written today should call realloc(3)
>       similar to this:
> 
>               realloc(p, n?n:1);
> 
>       When all existing implementations are fixed to comply with this
>       stricter specification, that workaround can be removed.
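
That workaround could be wrapped once instead of being repeated at every call site.  This is only a sketch: the names realloc_nonnull() and demo() are hypothetical, and the explicit errno assignment is an assumption that mirrors what the proposal would require of realloc(3) itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: never passes a zero size to realloc(3), so a
 * null return can only mean that the allocation failed.
 */
static void *
realloc_nonnull(void *p, size_t n)
{
        void  *new;

        new = realloc(p, n ? n : 1);
        if (new == NULL)
                errno = ENOMEM;  /* ISO C does not require this yet. */
        return new;
}

/* Usage: shrinking to zero still yields a valid pointer to free(3). */
static void
demo(void)
{
        char  *p, *q;

        p = realloc_nonnull(NULL, 8);
        assert(p != NULL);
        q = realloc_nonnull(p, 0);
        assert(q != NULL);
        free(q);
}
```

The cost is one byte of payload for zero-sized requests, in exchange for uniform error handling on every platform.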
> 
> Proposed wording
>       Based on N3550.
> 
>     7.5  Errors <errno.h>
>       ## Add ENOMEM in p2.
> 
>     7.25.4.1  Memory management functions :: General
>       @@ p1
>       ...
>        If the size of the space requested is zero,
>       -the behavior is implementation-defined:
>       -either
>       -a null pointer is returned to indicate the error,
>       -or
>        the behavior is as if the size were some nonzero value,
>        except that the returned pointer shall not be used
>        to access an object.
> 
>     7.25.4.2  The aligned_alloc function
>       @@ Returns, p3
>        The <b>aligned_alloc</b> function returns
>       -either
>       -a null pointer
>       -or
>       -a pointer to the allocated space.
>       +a pointer to the allocated space
>       +on success.
>       +If
>       +the space cannot be allocated,
>       +a null pointer is returned,
>       +and the value of the macro <b>ENOMEM</b>
>       +is stored in <b>errno</b>.
> 
>     7.25.4.3  The calloc function
>       @@ Returns, p3
>        The <b>calloc</b> function returns
>       -either
>        a pointer to the allocated space
>       +on success.
>       -or a null pointer
>       -if
>       +If
>        the space cannot be allocated
>        or if the product <tt>nmemb * size</tt>
>       -would wraparound <b>size_t</b>.
>       +would wraparound <b>size_t</b>,
>       +a null pointer is returned,
>       +and the value of the macro <b>ENOMEM</b>
>       +is stored in <b>errno</b>.
> 
>     7.25.4.7  The malloc function
>       @@ Returns, p3
>        The <b>malloc</b> function returns
>       -either
>       -a null pointer
>       -or
>       -a pointer to the allocated space.
>       +a pointer to the allocated space
>       +on success.
>       +If
>       +the space cannot be allocated,
>       +a null pointer is returned,
>       +and the value of the macro <b>ENOMEM</b>
>       +is stored in <b>errno</b>.
> 
>     7.25.4.8  The realloc function
>       @@ Description, p2
>        The <b>realloc</b> function
>        deallocates the old object pointed to by <tt>ptr</tt>
>       +as if by a call to <b>free</b>,
>        and returns a pointer to a new object
>       -that has the size specified by <tt>size</tt>.
>       +that has the size specified by <tt>size</tt>
>       +as if by a call to <b>malloc</b>.
>        The contents of the new object
>        shall be the same as that of the old object prior to deallocation,
>        up to the lesser of the new and old sizes.
>        Any bytes in the new object
>        beyond the size of the old object
>        have unspecified values.
> 
>       @@ p3
>        If <tt>ptr</tt> is a null pointer,
>        the <b>realloc</b> function behaves
>        like the <b>malloc</b> function for the specified size.
>        Otherwise,
>        if <tt>ptr</tt> does not match a pointer
>        earlier returned by a memory management function,
>        or
>        if the space has been deallocated
>        by a call to the <b>free</b> or <b>realloc</b> function,
>       -or
>       -if the size is zero,
>       ## We're defining the behavior.
>        the behavior is undefined.
>        If
>       -memory for the new object is not allocated,
>       +the space cannot be allocated,
>       ## Editorial; for consistency with the wording of the other functions.
>        the old object is not deallocated
>        and its value is unchanged.
> 
>       @@ Returns, p4
>        The <b>realloc</b> function returns
>        a pointer to the new object
>        (which can have the same value
>       -as a pointer to the old object),
>       +as a pointer to the old object)
>       +on success.
>       -or
>       +If
>       +space cannot be allocated,
>        a null pointer
>       +is returned
>       +and the value of the macro <b>ENOMEM</b>
>       +is stored in <b>errno</b>.
> 
> -- 
> <https://www.alejandro-colomar.es/>



-- 
<https://www.alejandro-colomar.es/>
