Update on SVE/sizeless types for C and C++

Richard Sandiford Tue, 12 Nov 2019 08:06:55 -0800

Last year I posted a series of patches that added the concept of
"sizeless" types to the C and C++ frontends, in order to support
the SVE vector and predicate types:


  https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00868.html

That thread generated a lot of useful discussion and feedback, thanks.
This message is a summary of where things stand, with the later part
of the message dealing specifically with some of the points raised
during that thread and on the WG14 reflector.

Before getting to that though, I just wanted to emphasise that the
sizeless type extension isn't needed to enable frontends to compile
correct SVE ACLE code.  It was/is only about disallowing invalid code.
In fact this was supposed to be one of its main selling points: the
frontends can just pass the types through using their natural (native)
representation, and all the type system needs to do is prevent uses that
would make that representation problematic.  (See "Things in favour of (1)"
in the message above for more details.)

To help make this point (perhaps too forcibly), I committed the target
side of the SVE ACLE support a couple of weeks ago.  This means that the
C++ frontend can (as far as I know) already compile correct ACLE code,
even in the default length-agnostic mode.  The C frontend just needs a
simple one-line change to do the same, since currently it rejects valid
variable-length initialisers.  After that, all we're missing is the
diagnostics for invalid code.
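
As a rough illustration of what "correct ACLE code" means here, the sketch
below shows a length-agnostic loop that uses the SVE types only as local
values.  It is not taken from the patches: add_bytes is a hypothetical helper,
and the scalar fallback is only there so that the file also builds when SVE is
not enabled.  The comment at the end lists uses that, as I understand the ACLE
rules, the missing diagnostics are meant to reject.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __ARM_FEATURE_SVE
#include <arm_sve.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for 0 <= i < n.  */
void add_bytes(int8_t *dst, const int8_t *a, const int8_t *b, int64_t n)
{
    assert(n >= 0);
#ifdef __ARM_FEATURE_SVE
    /* svcntb() is the number of bytes in a vector, known only at run time.  */
    for (int64_t i = 0; i < n; i += (int64_t)svcntb()) {
        /* Predicate that is active for the elements still in range.  */
        svbool_t pg = svwhilelt_b8_s64(i, n);
        svint8_t va = svld1_s8(pg, a + i);
        svint8_t vb = svld1_s8(pg, b + i);
        svst1_s8(pg, dst + i, svadd_s8_x(pg, va, vb));
    }
    /* Uses that the sizeless-type rules are meant to diagnose:

         sizeof(svint8_t);
         svint8_t array[4];
         struct s { svint8_t v; };
         static svint8_t v;  */
#else
    /* Scalar fallback for targets without SVE; the narrowing conversion
       is implementation-defined for out-of-range sums.  */
    for (int64_t i = 0; i < n; ++i)
        dst[i] = (int8_t)(a[i] + b[i]);
#endif
}
```

Nothing in the SVE branch needs the frontend to know the size of svint8_t or
svbool_t, which is why the types can be passed through unchanged.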

When I started implementing sizeless types in GCC, I thought about
doing it in two ways: directly in the frontends, or via target hooks.
The latter approach was inspired by existing target hooks like
TARGET_INVALID_CONVERSION.

Implementing them directly in the frontend seemed best at first, since
it would also allow us to support user-defined sizeless aggregates.
E.g. we could support a "sizeless struct" that can contain any mixture
of sized and sizeless fields.  (Again, this would be about giving
diagnostics for invalid code.  The frontends could treat valid uses
of sizeless structs in just the same way as normal structs.)

However, we've had significant pushback on the idea of user-defined
sizeless aggregates and (on the clang and LLVM side) variable-length
aggregates in general, so I'm happy to drop sizeless structs for now.
The only sizeless types we need to support are therefore the built-in
SVE ones.

That being the case, using a target hook is less invasive than teaching
the frontends about sizeless types.  It arguably gives better error
messages too, since the target can talk specifically about SVE types.

That's what the new implementation does.  I'll post it to gcc-patches
shortly.

If the use of sizeless types does expand beyond SVE built-in types
in future, the places that call the hook are the places that would
need to deal directly with sizeless types.

As mentioned above, I also wanted to summarise where things stand with
the original sizeless type discussion, even though it's hopefully moot
for now.  The rest of this message therefore summarises the status of
the following topics from the original thread:

- Other proposals on the WG14 reflector
- Using a variable sizeof for C
- Using a variable sizeof for C and a wrapper class for C++

Other proposals on the WG14 reflector
-------------------------------------

When I posted the message last year, Joseph asked how sizeless types
would interact with the bignum type that Martin Uecker had proposed
on the WG14 reflector.  When I raised this on the reflector, bignum
seemed to be at a relatively early stage, with the semantics not yet
nailed down, but we talked about three general possibilities:

  (1) a single fixed-size type that refers to separately-allocated storage.
      Properties:
      - "bignum *" is a valid type
      - "sizeof(bignum)" is constant
      - there are (at least) three approaches to storage management:

        - The separate storage is always on the stack, with stack
          deallocation acting as garbage collection.  bignum objects are
          only guaranteed to live as long as the function call that created
          them (or that last assigned to them).

        - The separate storage can be on the stack or heap, with something
          like C++ constructor, destructor, assignment and move semantics
          to manage lifetimes.

        - The types use dynamic garbage-collection (which I don't think was
          specifically discussed on-list at the time).

  (2) a single self-contained variable-size type that can hold any value
      (i.e. all bignum objects still have the same type).  Properties:
      - "bignum *" is a valid type
      - "sizeof(bignum)" is invalid (because the size depends on the value)
      - the size of a bignum type with given properties can be measured
        by a library function but not by sizeof
      - the size of a bignum object can be measured by a library function
        but not by sizeof (since sizeof shouldn't access the object)
      - it needs an extension like the sizeless type one; it doesn't
        fit the existing VLA model, which creates multiple types

  (3) multiple self-contained variable-length types (e.g. one for each
      bit width), with "bignum" being an auto-like keyword that selects
      the appropriate type for a given value.  Properties:
      - "bignum *" is an invalid type; you would need to pick one of the
        underlying types instead
      - "sizeof(bignum)" is invalid; you would need to pick one of the
        underlying types instead
      - the size of an underlying type with given properties could be
        measured by a library function or by sizeof
      - the size of a bignum object could be measured by a library function
        or by sizeof
      - it fits the existing VLA model

(2) is the most similar to what we want for SVE and should fit the sizeless
type model quite well.  (1) isn't a good fit for SVE because the types
should be self-contained, even at function boundaries.  (3) isn't a good
fit for SVE because there's only ever one valid vector type for a given
element type, and the size is determined by the environment rather than
the program itself.

See also:

   https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01092.html

for a summary of other reflector proposals at around the same time.
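
To make the difference between (2) and (3) a little more concrete, here is a
loose analogy in today's C.  Standard C has no bignum type, so every name
below is hypothetical: bignum2 uses a flexible array member to imitate a single
type whose objects vary in size, with a library-style function to measure an
object, while bignum3_object_size uses a VLA, where each width is a distinct
type and sizeof works directly.  It is only a sketch of the two models, not of
any actual proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for model (2): one type for every value, with the
   payload stored inline after a fixed header.  */
struct bignum2 {
    size_t nlimbs;          /* number of limbs that follow */
    unsigned long limbs[];  /* flexible array member */
};

/* Hypothetical allocator; returns NULL on failure.  */
struct bignum2 *bignum2_alloc(size_t nlimbs)
{
    struct bignum2 *x = calloc(1, sizeof(struct bignum2)
                                  + nlimbs * sizeof(unsigned long));
    if (x)
        x->nlimbs = nlimbs;
    return x;
}

/* Hypothetical "library function" for model (2): the size depends on the
   value, so it has to read the object, which sizeof never does.  */
size_t bignum2_object_size(const struct bignum2 *x)
{
    return sizeof(struct bignum2) + x->nlimbs * sizeof(unsigned long);
}

/* Model (3): each width is its own variably-modified type, so sizeof can
   report the size without a library call.  */
size_t bignum3_object_size(size_t nlimbs)
{
    assert(nlimbs > 0);            /* a VLA needs a positive length */
    unsigned long limbs[nlimbs];   /* a distinct array type per width */
    return sizeof(limbs);          /* evaluated at run time */
}
```

Note that sizeof(struct bignum2) is still accepted here and simply reports the
header size; a real model (2) type would reject it, which is the part that
needs something like the sizeless type extension.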

Using a variable sizeof for C
-----------------------------

I think a lot of the resistance so far to sizeless types has been
that disallowing "sizeof" just isn't C (or C++).  And since C already
supports variable sizeof, the argument is that we might as well just
make sizeof return the (runtime) size of the SVE vector.

One theoretical objection to that is that for:

    svint8_t *ptr;

the size of the object at *ptr is whatever size svint8_t had when the
object was created.  So if:

    sizeof (*ptr)

returns the current size of svint8_t, the value might not be accurate.
Accessing *ptr would be undefined in those cases (at least if *ptr is
smaller than svint8_t is now).  But to me it seems stranger for sizeof
to be "wrong" for a correctly-typed object than for it not to be defined
at all.  I don't think this situation could happen for VLAs.

Another point is that C ensures sizeof(T) is consistent wherever it
appears.  It does this by evaluating the size of variable-length types
at particular points and then carrying that size forward.  Thus the
source language is effectively dictating what the size is and when it
should be evaluated.  In contrast, the size of SVE types is dictated by
the target.  Fixing the value of sizeof at particular evaluation points
would be an artificial construction.
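
The "evaluate once and carry forward" behaviour can be seen with an ordinary
variably-modified typedef.  This is just a small sketch in standard C (C99, or
later where VLAs are supported); the function name is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns sizeof(row_t), where row_t was declared while n still had its
   original value.  */
size_t size_after_changing_n(int n)
{
    assert(n > 0);
    typedef int row_t[n];   /* the size expression is evaluated here */
    n *= 2;                 /* later changes to n do not affect row_t */
    return sizeof(row_t);   /* still the original n * sizeof(int) */
}
```

Here the program chose both the size and the point at which it was captured.
There is no equivalent point for svint8_t, whose size comes from the target
rather than from any expression in the source.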

However, my main objection to using variable sizeof for SVE types
is that we can't do that for C++.  Which brings us to...

Using a variable sizeof for C and a wrapper class for C++
---------------------------------------------------------

Joseph suggested that we could get around the C++ problem by turning
the SVE types into C++ classes that have a fixed size and conceptually
refer to separate storage for the vector contents.  We could then map
these types to the ABI vector types during code generation.  This would
be similar to how decimal floats have different representations in C
and C++ but map to the same underlying ABI type.

I think this would be very difficult to do in practice though.
The C++ classes would end up being similar to std::vector, and would
need to have similar memory management.  We'd then need to ensure that
there are no memory leaks when the C++ class is folded to the ABI type
on returning from a function.  We'd also need to ensure that memory is
successfully reallocated when the caller converts the ABI type back
into the C++ class when receiving the returned value.

There's also the problem that sizeof then becomes different between
C and C++, which isn't true for decimal floats.

This kind of encapsulation also isn't necessary for a correct
implementation.  The C++ frontend already copes with correct SVE code,
letting the built-in vector types pass through without modification.
So I think it would be better to avoid the overhead of a wrapper class
if we can.

Thanks,
Richard
