On Tue, 10 May 2022 17:01:26 PDT (-0700), Palmer Dabbelt wrote:
[Sorry for cross-posting to a bunch of lists, I figured it'd be best to
have all the discussions in one thread.]
We currently only support what is defined by official RISC-V
specifications in the various GNU toolchain projects. There's certainly
some grey areas there, but in general that means not taking code that
relies on drafts or vendor defined extensions, even if that would result
in higher performance or more featured systems for users.
The original goal of these policies was to steer RISC-V implementers
towards a common set of specifications, but over the last year or so
it's become abundantly clear that this is causing more harm than good.
All extant RISC-V systems rely on behaviors defined outside the official
specifications, and while that's technically always been the case we've
gotten to the point where trying to ignore that fact is impacting real
users on real systems. There's been consistent feedback from users that
we're not meeting their needs, which can clearly be seen in the many out
of tree patch sets in common use.
There's been a handful of discussions about this, but we've yet to have
a proper discussion on the mailing lists. From the various discussions
I've had it seems that folks are broadly in favor of supporting vendor
extensions, but the devil's always in the details with this sort of
thing so I thought it'd be best to write something up so we can have a
concrete discussion.
The idea is to start taking code that depends on vendor-defined behavior
into the core GNU toolchain ports, as long as it meets the following
criteria:
* An ISA manual is available that can be redistributed/archived, defines
the behaviors in question as one or more vendor-specific extensions,
and is clearly versioned. The RISC-V foundation is setting various
guidelines around how vendor-defined extensions and instructions
should be named; we strongly suggest that vendors follow those
conventions whenever possible (this is all new, though, so exactly
what's necessary from vendor specifications will likely evolve as we
learn).
* There is a substantial user base that depends on the behavior in
question, which probably means there is hardware in the wild that
implements the extensions and users that require those extensions in
order for that hardware to be useful for common applications. This is
always going to be a grey area, but it's essentially the same spot
everyone else is in.
* There is a mechanism for testing the code in question without direct
access to hardware, which in practice means a QEMU port (or whatever
simulator is relevant in the space and that folks use for testing) or
some community commitment to long-term availability of the hardware
for testing (something like the GCC compile farm, for example).
* It is possible to produce binaries that are compatible with all
upstream vendors' implementations. That means we'll need mechanisms
to allow extensions from multiple vendors to be linked together and
then probed at runtime. That's not to say that all binaries will be
compatible, as users are always free to skip the compatibility code
and there will be conflicting definitions of instruction encodings,
but we can at least provide users with the option of compatibility.
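To make the last criterion a bit more concrete, here is a minimal sketch of
what "linked together and then probed at runtime" could look like from user
code. Everything in it is an assumption made up for illustration: the
EXT_VENDOR_A/EXT_VENDOR_B flags, probe_vendor_exts(), and the per-vendor copy
routines are hypothetical names, not an existing toolchain or kernel
interface, and the probe is stubbed out so the example builds on any host.
The vendor routines just call memcpy() where real code would use
vendor-specific instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature flags, one bit per vendor extension. */
enum vendor_ext {
    EXT_NONE     = 0,
    EXT_VENDOR_A = 1 << 0,
    EXT_VENDOR_B = 1 << 1
};

typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Baseline path: relies only on the standard ISA. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Stand-ins for routines built with vendor A's / vendor B's extensions.
 * In a real binary these would contain vendor-specific instructions. */
static void *copy_vendor_a(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

static void *copy_vendor_b(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Hypothetical probe. Stubbed to report no vendor extensions; a real one
 * would ask the platform what the running hardware implements. */
static unsigned probe_vendor_exts(void)
{
    return EXT_NONE;
}

/* Pick an implementation from the probed flags, falling back to the
 * generic path so the binary still runs on every implementation. */
static copy_fn select_copy(unsigned exts)
{
    if (exts & EXT_VENDOR_A)
        return copy_vendor_a;
    if (exts & EXT_VENDOR_B)
        return copy_vendor_b;
    return copy_generic;
}

/* Entry point: all three variants are linked in, one is chosen at runtime. */
void *vendor_copy(void *dst, const void *src, size_t n)
{
    copy_fn fn = select_copy(probe_vendor_exts());
    assert(fn != NULL);
    return fn(dst, src, n);
}
```

The point of the fallback is that skipping the probe and calling a vendor
routine directly would still work on that vendor's hardware, but only the
dispatched version stays compatible across implementations.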
These are pretty loosely written on purpose, both because this is all
new and because each project has its own set of contribution
requirements so it's going to be all but impossible to have a single
concrete set of rules that applies everywhere -- that's nothing specific
to the vendor extensions (or even RISC-V), it's just life. Specifically,
a major goal here is to balance the needs of users, both in the short
term (i.e., getting new hardware to work) and the long term (i.e., the
long-term stability of their software). We're not talking about taking code
that can't be tested, hasn't been reviewed, isn't going to be supported
long-term, or doesn't have a stable ABI; just dropping the specific
requirement that a specification must be furnished by the RISC-V
foundation in order to accept code.
Nothing is decided yet, so happy to hear any thoughts folks have. This
is certainly a very different development methodology than what we've
done in the past and isn't something that should be entered into
lightly, so any comments are welcome.
I'm going back to the start of the thread as this led to some heated
discussion, both here and in private. Clearly there are lots of opinions
here and everyone wants something different, but the nature of
compromise is that nobody gets exactly what they want and it looks like
this is as good as we're going to get any time soon. So I'm going to
propose that we go with this.
This was all purposefully a bit vague so we'll have to go sort out exactly
how to move forward as patches go by. Hopefully we'll be able to have
more constructive discussions on the specific patch sets, as at least
the issues will be a bit more focused. The C906 is a widely available
chip that needs vendor extensions to function in some very basic ways;
it's been blocked on a policy change for way too long, and at least this
way we can get moving on that front.
Happy to continue the discussion if anyone has concrete concerns here;
either way, let's give it at least a few days to get through everyone's
inbox before doing anything that depends on the policy change.