Re: Fwd: LLVM collaboration?

Joseph S. Myers Fri, 07 Feb 2014 15:31:39 -0800

On Fri, 7 Feb 2014, Renato Golin wrote:

> For a long time already I've been hearing on the LLVM list people
> saying: "oh, ld should not accept this deprecated instruction, but we
> can't change that", "that would be a good idea, but we need to talk to
> the GCC guys first", and to be honest, nobody ever does.


I think there are other closely related issues.  GCC people try to work 
around issues with glibc, or vice versa, rather than coordinating what 
might be the best solution involving changes to both components.  People 
in the glibc context complain about some Linux kernel decision but have 
difficulty reaching any conclusion with Linux kernel people about the 
right way forward (or, historically, have difficulty getting agreement 
that there is a problem at all - the Linux kernel community has tended to 
have less interest in supporting the details of standards than the people 
now trying to do things in GCC and glibc).  Linux kernel people complain 
about any compiler that optimizes C as a high-level language in ways 
conflicting with its use as something more like a portable assembler for 
kernel code.  And people from the various communities complain about 
issues with underlying standards such as ISO C and POSIX but rather less 
reliably engage with the standards process to solve those issues.

Maybe the compiler context is sufficiently separate from the others 
mentioned that there should be multiple collaboration routes for 
(compilers), (libraries / kernel), ... - but people need to be aware that 
just because something is being discussed in a compiler context doesn't 
mean that a C language extension is the right solution; it's possible 
that something involving both language and library elements is right, or 
that collaboration with the standards process at an early stage is right.

(The libraries / kernel collaboration venue exists - the linux-api list, 
which was meant to be for anything about the kernel/userspace interface.  
Unfortunately, it's rarely used - most relevant kernel discussion doesn't 
go there - and I don't have time to follow linux-kernel.  We have recently 
seen several feature requests from the Linux kernel side reported to GCC 
Bugzilla, which is good - at least if there are people on the GCC side 
working on getting such things of use to the kernel implemented in a 
suitably clean way that works for what the kernel wants.)

> 2. There are decisions that NEED to be shared.
> 
> In the past, GCC implemented a lot of extensions because the standards
> weren't good enough. This has changed, but the fact that there will
> always be things that don't belong on any other standard, and are very
> specific to the toolchain inner workings, hasn't.

There are also lots of things where either (a) it would make sense to get 
something in a standard - it can be defined sensibly at the level ISO C / 
C++ deals with, or (b) the standard exists, but what gets implemented 
ignores the standard.  Some of this may be because economic incentives 
seem to get things done one way rather than another way that would 
ultimately be better for users of the languages.

To expand on (a): for a recent GCC patch there was a use for having 
popcount on the host, and I noted 
<http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00305.html> how that's one 
of many integer manipulation operations lacking any form of standard C 
bindings.  Sometimes for these things we do at least have 
target-independent GCC extensions - but sometimes just target-specific 
ones, with multiple targets having built-in functions for similar things, 
nominally mapping to particular instructions, when it would be better to 
have a standard C binding for a given operation.
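
To illustrate point (a), here is a minimal sketch of what portable code 
tends to look like in the absence of a standard binding.  The helper name 
portable_popcount64 is hypothetical; __builtin_popcountll is the 
target-independent GCC extension (Clang accepts it too), and the fallback 
loop is just one common way to count bits, not necessarily what any 
particular project uses.

```c
#include <stdint.h>

/* Hypothetical helper name; not part of any standard or of GCC itself. */
static unsigned portable_popcount64(uint64_t x)
{
#if defined(__GNUC__)
    /* Target-independent GCC built-in; the compiler picks a popcount
       instruction where the target has one, or a software sequence.  */
    return (unsigned)__builtin_popcountll(x);
#else
    /* Fallback for other compilers: clear the lowest set bit until
       no bits remain, counting the iterations.  */
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
#endif
}
```

Either path gives portable_popcount64(0xF0F0) == 8.  With a standard C 
binding for the operation, the #if and the hand-written fallback would not 
be needed in user code at all.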

To expand on (b): consider the recent AVX512 GCC patches.  As is typical 
for patches enabling support for new instruction set features, they added 
a large pile of intrinsics (intrinsic headers, mapping to built-in 
functions).  The intrinsics implement a standard of sorts - shared with 
Intel's compiler, at least.  But are they really the right approach in all 
cases?  The features of AVX512 include, in particular, floating-point 
operations with rounding modes embedded in the instruction (in support of 
IEEE 754-2008 saying language standards should support a way to apply 
rounding modes to particular blocks, not just dynamic rounding modes).

There's a proposed specification for C bindings to such a feature - draft 
TS 18661-1 (WG14 N1778; JTC1 ballot ends 2014-03-05, so may well be 
published later this year).  There was some discussion of this starting 
with <http://gcc.gnu.org/ml/gcc/2012-12/msg00129.html> (discussion 
continued into Jan 2013), which I presume was motivated by the AVX512 
feature, but in the end the traditional intrinsics were the approach taken 
for supporting this feature, not anything that would allow 
architecture-independent source code to be written.  (The AVX512 feature 
combines constant rounding modes with disabling exceptions, which isn't 
ideal, but could still be used for FENV_ROUND with -fno-trapping-math.)

Now, implementation and standards can go in either order.  When you start 
with the implementation, even with a couple of implementations working 
together, there's a risk of some cases ending up as just what the 
implementation accidentally does.  When you start with a specification, 
there's a risk of other corner cases not being considered if they are only 
apparent when you try to implement something - and also that the result of 
a design-by-committee is too removed from user requirements, or otherwise 
impractical.  So it can be helpful to have feedback as features are 
implemented during standardization - but if the features go in mainline of 
the relevant tool / library, then you have compatibility issues when the 
standard changes but people were using earlier versions.  So there isn't 
one clear right answer there.

It can also sometimes be useful to have multiple implementation approaches 
providing input for a standards committee later trying to adopt the best 
from both (consider the current CPLEX work, trying to draw on both OpenMP 
and Cilk Plus).  So if you do coordinate on some feature, that doesn't mean 
you need to reach agreed conclusions - it may be perfectly reasonable for 
different preferences to arise in different communities, resulting in 
different features for the same purpose being implemented (hopefully not 
actually using the same syntax for different incompatible things), so user 
experience can help determine what a future standard solution might look 
like.

As for the economic incentives point: given a new CPU feature, there's an 
obvious incentive for the CPU company to make the feature available to 
their users from C.  Doing rather more work to make such a language / 
library feature available on all architectures - but optimized on theirs 
to take advantage of the CPU feature - may be less attractive.  But it's 
probably better long-term for the C users for their programs to be more 
portable, and if there's a standard involved that can mean portability 
among implementations as well as among processor architectures.

So: we could discuss what some feature in C should look like, whether as a 
pure extension, or something to end up as a standards proposal; having 
more people looking at the feature earlier might well help spot potential 
issues.  But until someone actually tries implementing it and people start 
using it, it's quite likely the discussion could go off in directions that 
are not particularly helpful.  And if only one tool implements it, 
feedback from the other may well be of limited value.

I think an important question then is how to get more of the larger pieces 
of work done where there aren't the immediate incentives.  It would be 
useful to get something (say integer manipulation) into ISO C, but there's 
a lot of work involved, and ad hoc extensions already exist in compilers.  
I'd like 
to get more of the floating-point features (both C99/C11 Annex F/G, and 
the new draft IEEE 754-2008 bindings) implemented in GCC and glibc (and 
would be doubtful about integrating the new bindings in a future version 
of the C standard without some implementation experience) - again, it's a 
lot of work, both in time and in the range of expertise required, which 
would probably need multiple people doing different bits.  I'd like 
more future ISO C features implemented as they enter the draft standard 
rather than afterwards - again, significant work to be done at particular 
times, along with compatibility risks.

One thing that may limit collaboration taking place (at present, and in 
future) is simply the time involved to keep on top of the various 
discussions.  Many, perhaps most, people can barely keep up with the 
discussions for the projects they are actively involved in.  Keeping up 
with GCC, glibc, ISO C, ..., skimming things for POSIX, Linux kernel, 
binutils, GDB, ... takes up a lot of my time, even without implementing 
things for the various projects (predominantly glibc over the past couple 
of years) and other responsibilities.  I'd like to do more in various of 
these areas, to get missing standard / proposed standard features 
implemented in GCC and glibc, to ensure proposed standards and extensions 
are high quality, to get the standards to better address areas in which 
they are weak.  But there are limits to both total time available and time 
I can actually justify spending on these things.

-- 
Joseph S. Myers
jos...@codesourcery.com
