Robert Dewar wrote:
About the time Clang does because GCC now has to compete."
How about that? Clang is currently slightly ahead and GCC really needs
to change if it is to continue to be the best.
Best is measured by many metrics, and it is unrealistic to expect
any product to be best in all respects.
Anyway, it still comes down to figuring out how to find the resources.
Not clear that there is commercial interest in rapid implementation
of c++11, we certainly have not heard of any such interest, and in the
absence of such commercial interest, we do indeed come down to hoping
to find the volunteer help that is needed.
Some months ago, I had a short conversation with a professor from the
computer science department of a local university. Some of the research
there is about (auto-)vectorization, and I was curious what he would say
about compilers.
He said that most students are interested in "cool" topics like
networking, artificial intelligence, robotics, image recognition, etc.
and that compiler technology is regarded as "uncool", "boring".
He also said that the few students who do start on compilers typically
give up on GCC in frustration after several weeks or even months,
because they cannot find a place to start, do not understand the general
structure, or are blocked by nasty details from a completely different
part of the compiler that they are not aware of or don't understand and
that nobody will explain to them.
He said that the students decide to work on LLVM instead because it
takes them only about a week until they can add their own small
extension. They find many examples, a responsive and friendly mailing
list, and comprehensible source documentation, and they appreciate the
better modularity and structure, things of that kind.
When I started with GCC, something was unclear to me and I asked a
question on the GCC lists. The answer was basically:
"From your question it is obvious that you don't understand anything.
Hire a contract GCC developer to implement what you want. You will
never contribute anything useful to GCC."
My volunteering for GCC is of a private nature. Being familiar with GCC
as a user, and with the compiler producing reasonable code for the ternary
"what the fuck is -- what?" target supported by GCC, I wanted to have a
look under the surface, fix bugs, add mini-optimizations for which I am
reasonably sure that they don't break too much, etc.
About 1 1/2 years elapsed between my first request for a copyright
assignment form (as a private individual!) and the FSF finally sending
me this document...
It is true that the features of a piece of software are important, and
the same goes for stability and usability. But what is just as essential
for a project like GCC is that its internals are comprehensible, that
the software is well designed, and that potential contributors are
welcome and don't stumble around alone in the fog. Otherwise, the
software will die sooner or later because it will run out of developers
and / or become unmaintainable.
The success of LLVM shows that there is a market or need or whatever you
call it for a compiler, and if GCC does not address its shortcomings, it
will lose, IMO.
GCC carries around 25 years of historical ballast by now. The state of
the internals documentation and of the internal structure (SSA,
reasonable(!) subset of C++, pass manager, RTL iterators, ...) has
improved a lot over the last years, and I completely agree with:
Richard Kenner wrote:
compilers are extremely complex programs and there's a limit to how
much even the best-written internals documentation can explain
Learning takes effort -- both for the learner /and/ for the teacher, or
whoever (tries to) share their knowledge. The more complex a matter is,
and the more pitfalls and traps for the intuition there are, the more
important this transfer of knowledge by the spoken word becomes.
And GCC could get easier to read. One example:
The implementation language of GCC is not well suited to expressing code
transformations. C vs. C++ does not make a big difference with code like
XEXP (XEXP (XEXP (a, 0), 0), 1)
vs.
a->xexp(0)->xexp(0)->xexp(1)
What's needed is a "language" that can neatly represent this, and that's
the reason why RTL is there: Nobody would even think about writing
insn-recog, insn-emit, insn-attrtab etc. by hand. It's all written in
RTL and transformed to the implementation language.
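To make the comparison of the two spellings concrete, here is a minimal
sketch on a toy node type. Everything in it (Node, TOY_XEXP, walk_macro,
walk_method) is a hypothetical stand-in invented for illustration, not
GCC's real rtx type or accessors. It only shows that the nested macro
calls and the method chain walk the same path, innermost operand first,
and that neither form tells the reader what shape of expression is
expected.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an RTL expression; not GCC's real rtx type.
struct Node {
    int code;                 // toy expression code
    std::vector<Node*> ops;   // operands of this expression

    // Method-style operand accessor (assumed spelling, as in the text above).
    Node* xexp(int i) const { return ops.at(i); }
};

// Macro-style operand accessor, loosely modelled on the XEXP spelling.
#define TOY_XEXP(n, i) ((n)->ops.at(i))

// Operand 1 of operand 0 of operand 0 of `a`, written with nested macros.
Node* walk_macro(Node* a) {
    return TOY_XEXP(TOY_XEXP(TOY_XEXP(a, 0), 0), 1);
}

// The same path written as a method chain: innermost access comes first.
Node* walk_method(Node* a) {
    return a->xexp(0)->xexp(0)->xexp(1);
}
```

Both functions return the same node for any tree that is deep enough;
the difference is purely one of spelling.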
The C / C++ sources that transform / match / analyze trees and rtxes are
plain C. When you read these sources, nothing reminds you of the
structure of the code that is to be transformed / matched / analyzed.
It's all hand-coded in C and looks considerably different from a tree or
RTL dump.
Describing transformations like "specific if-else" to "min" in a more
appropriate representation than big clauses would greatly increase
legibility, and maybe also stability and robustness. It could add
checking, and most of these transformations would be self-explanatory to
the reader.
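As a rough illustration of what such a representation could look like,
here is a small sketch on a toy expression tree. It is not how GCC does
or should do it, and all names (Expr, mk, match, subst, Rule, rewrite,
show, and the operator names "cond", "lt", "min") are made up for the
example. The point is only that the rule itself is written as a pair of
trees with "?name" placeholders, so it reads much like a dump, while the
matching code is generic and written once.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression tree: a hypothetical stand-in, not GCC's tree or rtx.
struct Expr;
using ExprPtr = std::shared_ptr<Expr>;
struct Expr {
    std::string op;             // operator or variable name; "?x" is a placeholder
    std::vector<ExprPtr> kids;  // operands (empty for leaves)
};
using Bindings = std::map<std::string, ExprPtr>;

ExprPtr mk(std::string op, std::vector<ExprPtr> kids = {}) {
    return std::make_shared<Expr>(Expr{std::move(op), std::move(kids)});
}

bool is_placeholder(const ExprPtr& e) {
    return !e->op.empty() && e->op[0] == '?';
}

// Structural equality of two trees.
bool same(const ExprPtr& a, const ExprPtr& b) {
    if (a->op != b->op || a->kids.size() != b->kids.size()) return false;
    for (std::size_t i = 0; i < a->kids.size(); ++i)
        if (!same(a->kids[i], b->kids[i])) return false;
    return true;
}

// Match `pat` against `e`, recording what each placeholder stands for.
// A placeholder that occurs twice must match structurally equal subtrees.
bool match(const ExprPtr& pat, const ExprPtr& e, Bindings& b) {
    if (is_placeholder(pat)) {
        auto r = b.emplace(pat->op, e);
        return r.second || same(r.first->second, e);
    }
    if (pat->op != e->op || pat->kids.size() != e->kids.size()) return false;
    for (std::size_t i = 0; i < pat->kids.size(); ++i)
        if (!match(pat->kids[i], e->kids[i], b)) return false;
    return true;
}

// Instantiate the replacement template with the recorded bindings.
ExprPtr subst(const ExprPtr& tmpl, const Bindings& b) {
    if (is_placeholder(tmpl)) return b.at(tmpl->op);
    std::vector<ExprPtr> kids;
    for (const auto& k : tmpl->kids) kids.push_back(subst(k, b));
    return mk(tmpl->op, std::move(kids));
}

struct Rule { ExprPtr from, to; };

// Apply `r` at the root of `e`; return `e` unchanged if it does not match.
ExprPtr rewrite(const Rule& r, const ExprPtr& e) {
    Bindings b;
    return match(r.from, e, b) ? subst(r.to, b) : e;
}

// Print a tree in a dump-like prefix form.
std::string show(const ExprPtr& e) {
    if (e->kids.empty()) return e->op;
    std::string s = "(" + e->op;
    for (const auto& k : e->kids) s += " " + show(k);
    return s + ")";
}

// The rule reads like the trees it describes:
//   (cond (lt ?a ?b) ?a ?b)  ->  (min ?a ?b)
Rule if_else_to_min() {
    return {mk("cond", {mk("lt", {mk("?a"), mk("?b")}), mk("?a"), mk("?b")}),
            mk("min", {mk("?a"), mk("?b")})};
}
```

Applied to a toy tree for "x < y ? x : y", rewrite() yields a tree that
show() prints as (min x y). With the two arms swapped (which would be a
max), the repeated placeholders no longer agree, so the rule does not
fire and the tree comes back unchanged. That kind of consistency check
comes for free from the notation instead of being hand-coded for each
transformation.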
Johann
--
"We could, of course, use any notation we want; do not laugh
at notations; invent them, they are powerful. In fact,
mathematics is, to a large extent, invention of better notations."
- R. Feynman