> What regex did you use for searching?

I went cheap and easy rather than trying to narrow down:
https://sourcegraph.com/search?q=context:global+lang:C+lengthof&patternType=regexp&sm=0

> I was thinking of renaming the proposal to elementsof(), to avoid confusion 
> between length of an array and length of a string.  Would you mind checking 
> if elementsof() is ok?

From what I saw, elementsof() appears to be used more uniformly as a 
function-like macro accepting a single argument.

~Aaron

-----Original Message-----
From: Alejandro Colomar <a...@kernel.org> 
Sent: Wednesday, August 14, 2024 8:58 AM
To: Jens Gustedt <jens.gust...@inria.fr>; Ballman, Aaron 
<aaron.ball...@intel.com>
Cc: Xavier Del Campo Romero <xavi....@tutanota.com>; Gcc Patches 
<gcc-patches@gcc.gnu.org>; Daniel Plakosh <dplak...@cert.org>; Martin Uecker 
<uec...@tugraz.at>; Joseph Myers <josmy...@redhat.com>; Gabriel Ravier 
<gabrav...@gmail.com>; Jakub Jelinek <ja...@redhat.com>; Kees Cook 
<keesc...@chromium.org>; Qing Zhao <qing.z...@oracle.com>; David Brown 
<david.br...@hesbynett.no>; Florian Weimer <fwei...@redhat.com>; Andreas Schwab 
<sch...@linux-m68k.org>; Timm Baeder <tbae...@redhat.com>; A. Jiang 
<d...@live.cn>; Eugene Zelenko <eugene.zele...@gmail.com>
Subject: Re: v2.1 Draft for a lengthof paper

Hi Aaron, Jens,

On Wed, Aug 14, 2024 at 02:17:52PM GMT, Jens Gustedt wrote:
> Am 14. August 2024 13:31:19 MESZ schrieb "Ballman, Aaron" 
> <aaron.ball...@intel.com>:
> > Sorry for top-posting, my work account is stuck on Outlook. :-/
> > 
> > > For a WG14 paper you should add these findings to support that choice.
> > > Another option would be for WG14 to standardize the then existing 
> > > implementation with the double underscores.
> > 
> > +1, it's always good to explain prior art and existing uses as part
> > of the paper. However, please also point out that C++ has a prior 
> > art as well which is slightly different and very much worth
> > considering: they have one API for getting the array's rank, and 
> > another for getting a specific rank's extent. This is a general 
> > solution that doesn't require the programmer to have deep knowledge 
> > of C's declarator syntax and how it relates to multidimensional 
> > arrays.

I have added that to my draft.  I'll publish it soon as a reply to the GCC 
mailing list.  See below for details of what I have added for now.

> > 
> > That said, I suspect WG14 would not be keen on standardizing 
> > `lengthof` without an ugly keyword given that there are plenty of other 
> > uses of it that would break:
> > 
> > https://sourcegraph.com/github.com/illumos/illumos-gate/-/blob/usr/s
> > rc/cmd/mailx/names.c?L53-55
> > https://sourcegraph.com/github.com/Rockbox/rockbox/-/blob/tools/ipod
> > _fw.c?L292-294
> > https://sourcegraph.com/github.com/OpenSmalltalk/opensmalltalk-vm/-/
> > blob/src/spur64.stack/validImage.c?L7014-7018
> > (and many, many others)

What regex did you use for searching?

I was thinking of renaming the proposal to elementsof(), to avoid confusion 
between length of an array and length of a string.  Would you mind checking if 
elementsof() is ok?

> > >> > As for the parentheses, I personally think lengthof should 
> > >> > follow similar rules compared to sizeof.
> > >> 
> > >> I think most people agree with this.
> > >
> > > I still don't, in particular not for standardisation.
> > > 
> > > We have to remember that there are many small C compilers out there.
> > 
> > Those compilers already have to handle parsing this for sizeof, so 
> > that's not particularly compelling

Agree.  I suspect it will be simpler for existing compilers to follow sizeof 
than to have new syntax.  However, it's easy to keep it as a QoI detail, so 
I've temporarily changed the wording to require parentheses, and let 
implementations lift that restriction.

> > (even if we wanted to design C
> > for the lowest common denominator of implementation effort, which 
> > I'm not convinced is a good approach these days).

Off-topic, but I wish that had been the approach when a few implementations (I 
suspect proprietary vendors; this was never disclosed) rejected redefining NULL 
as the right thing: (void *) 0.

I fixed one of the last free-software implementations of NULL that expanded to 
0, and nullptr would probably never have been added if WG14 had not accepted 
the pressure from such horrible implementations.

<https://github.com/cc65/cc65/issues/1823>

> > That said, if we went with a rank/extent design, I think we'd *have* 
> > to use parens because the extent interface would take two operands 
> > (the array and the rank you're interested in getting the extent of) 
> > and it would be inconsistent for the rank interface to then not 
> > require parens.

   Prior art
     C
            It is common in C programs to get the number of elements of
            an array via the usual sizeof division and  wrap  it  in  a
            macro.  Common names include:

            •  ARRAY_SIZE()
            •  NELEM()
            •  NELEMS()
            •  NITEMS()
            •  NELTS()
            •  elementsof()
            •  lengthof()
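
For reference, a minimal sketch of the "usual sizeof division" these macros wrap. The macro name NITEMS and the helper count_primes_table() are placeholders chosen for illustration, not taken from any particular project:

```c
#include <stddef.h>

/* Number of elements of array a.  a must be an array, not a pointer:
 * with a pointer this silently computes a meaningless value. */
#define NITEMS(a)  (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper, for illustration only. */
size_t
count_primes_table(void)
{
	static const int  primes[] = {2, 3, 5, 7, 11};

	return NITEMS(primes);
}
```

Projects differ mainly in the name and in whether they add a compile-time check that rejects pointers.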

     C++
            In  C++,  there  are several standard features to determine
            the number of elements of an array:

            std::size()   (since C++17)
            std::ssize()  (since C++20)
                   The syntax of these is  identical  to  the  usual  C
                   macros named above.

                   They are a bit different, since they are general‐purpose
                   sizing templates, which work on non‐array types too,
                   with different semantics.

                   But when applied to an array, it has the same seman‐
                   tics as the macros above.

            std::extent  (since C++11)
                   The syntax of this is quite different.   It  uses  a
                   numeric index as a second parameter to determine the
                   dimension  in which the number of elements should be
                   counted.

                   C arrays are much simpler than C++’s many array‐like
                   types, and I don’t see a reason why  we  would  need
                   something  as  complex  as  std::extent  in C.  Cer‐
                   tainly, existing projects have not developed such  a
                   macro, even if it is technically possible:

                       #define DEREFERENCE(a, n) DEREFERENCE_ ## n (a)
                       #define DEREFERENCE_9(a)  (*********(a))
                       #define DEREFERENCE_8(a)  (********(a))
                       #define DEREFERENCE_7(a)  (*******(a))
                       #define DEREFERENCE_6(a)  (******(a))
                       #define DEREFERENCE_5(a)  (*****(a))
                       #define DEREFERENCE_4(a)  (****(a))
                       #define DEREFERENCE_3(a)  (***(a))
                       #define DEREFERENCE_2(a)  (**(a))
                       #define DEREFERENCE_1(a)  (*(a))
                       #define DEREFERENCE_0(a)  ((a))
                       #define extent(a, n)      nitems(DEREFERENCE(a, n))

                   If any project needs that syntax, they can implement
                   their  own  trivial  wrapper  macro, as demonstrated
                   above.
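
As a usage sketch of such a wrapper (trimmed to three dimensions; the array shape and the helper extent_of_dim() are assumptions made up for this example), the dimension index has to be a literal digit, because it is pasted into the macro name:

```c
#include <stddef.h>

#define nitems(a)          (sizeof(a) / sizeof((a)[0]))
#define DEREFERENCE(a, n)  DEREFERENCE_ ## n (a)
#define DEREFERENCE_2(a)   (**(a))
#define DEREFERENCE_1(a)   (*(a))
#define DEREFERENCE_0(a)   ((a))
#define extent(a, n)       nitems(DEREFERENCE(a, n))

/* Hypothetical helper: number of elements in dimension dim of an
 * int[3][5][7].  Nothing is evaluated at run time here, since every
 * use of a sits inside sizeof and a is not a VLA. */
size_t
extent_of_dim(int dim)
{
	int  a[3][5][7];

	switch (dim) {
	case 0:  return extent(a, 0);
	case 1:  return extent(a, 1);
	case 2:  return extent(a, 2);
	default: return 0;
	}
}
```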

            Existing prior art in C seems to favour a design that  fol‐
            lows the syntax of other operators like sizeof.

> I think that this argument falls short.  E.g., implementations that 
> already have compound expressions (or lambdas ;-) may provide a 
> quality implementation using `static_assert` and `typeof` alone, and 
> don't have to touch their compiler at all.
> 
> We should not impose an implementation in the language where doing it 
> in a header can be completely sufficient.

I have concerns about a libc (or a predefined macro) implementation:
the sizeof division causes double evaluation with any VLAs, while my 
implementation for GCC has fewer cases of evaluation, and when it needs to 
evaluate, it does so only once.  It would be hard to find a good wording that 
would allow an implementation to implement this as a macro.

   constexpr
     The  usual  sizeof division evaluates the operand and results in a
     run‐time value in cases where it wouldn’t be  necessary.   If  the
     top‐level  array  number  of  elements is determined by an integer
     constant expression, but an internal array is a VLA,  sizeof  must
     evaluate:

            int  a[7][n];
            int  (*p)[7][n];

            p = &a;
            nitems(*p++);

     With an elementsof operator, this would result in an integer
     constant expression of value 7.

   Double evaluation
     With the usual sizeof‐based implementation, the example above
     evaluates *p++ twice, once for each sizeof in the division.
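
A small sketch that makes the double evaluation observable. It deliberately replaces *p++ with a call to a hypothetical helper, touch(), because two unsequenced modifications of p in the same expression would be undefined behaviour; function calls are at least indeterminately sequenced. All names here are assumptions for illustration:

```c
#include <stddef.h>

#define nitems(a)  (sizeof(a) / sizeof((a)[0]))

static int  calls;  /* Counts evaluations of the operand. */

/* Hypothetical helper: records each call and returns its argument. */
static void *
touch(void *p)
{
	calls++;
	return p;
}

/* Returns how many times the operand of nitems() was evaluated, and
 * stores the computed number of elements in *len.  n must be > 0. */
int
count_evaluations(int n, size_t *len)
{
	int  a[7][n];

	calls = 0;
	/* Both sizeof operands have VLA type (int[7][n] and int[n]),
	 * so both are evaluated, even though the result is always 7. */
	*len = nitems(*(int (*)[7][n]) touch(&a));

	return calls;
}
```

With the sizeof division, one would expect count_evaluations() to report two evaluations; a dedicated operator would not need to evaluate the operand at all in this case, since the top-level bound is the constant 7.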

> Plus, implementing it as a macro in a header (probably <stddef.h>) also 
> provides a feature test, for those applications that already have 
> something similar.

This is interesting.  But I think an implementation could just

        #define lengthof lengthof

to provide a feature-test macro.
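
A sketch of how an application could consume such a self-referential definition, assuming (hypothetically) that the implementation defines the macro in <stddef.h>. The fallback and the helper buffer_length() are illustrations only:

```c
#include <stddef.h>

/* If the implementation advertises lengthof (for example through
 * "#define lengthof lengthof"), use it as is.  Otherwise fall back
 * to the sizeof division, with its known limitations. */
#ifndef lengthof
# define lengthof(a)  (sizeof(a) / sizeof((a)[0]))
#endif

/* Hypothetical helper, for illustration only. */
size_t
buffer_length(void)
{
	char  buf[64];

	return lengthof(buf);
}
```

Either way the call site stays the same, which is the point of using the macro name itself as the feature test.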

> this was basically what we did for `unreachable` and I think it worked 
> out fine.
> 
> Jens

Have a lovely day!
Alex

--
<https://www.alejandro-colomar.es/>
