Re: Switching to C++ by default in 4.8

Chiheng Xu Thu, 12 Apr 2012 02:28:35 -0700

On Wed, Apr 11, 2012 at 10:24 AM, Lawrence Crowl <cr...@google.com> wrote:
> On 4/10/12, Jakub Jelinek <ja...@redhat.com> wrote:
>> That when stepping through code in the debugger you keep
>> entering/exiting these one liner inlines, most of them really
>> should be at least by default considered just as normal statements
>> (e.g. glibc heavily uses artificial attribute for those, still
>> gdb doesn't hide those by default).
>
> You do want to step into those inline functions, except when you do.
> In the short term, we can make the debugger behave as though they did
> not exist.  In the longer term, we really want debugging tools that
> help C++ programmers.  One way to get there is to use C++ ourselves.
>
>> > The above is just quickly cooked up examples. A carefully
>> > designed C++ based API can be self documenting and make the
>> > client code very readable. It is hard to believe that there is
>> > no room for improvement in GCC.
>>
>> Do you have examples?  E.g. I haven't touched gold, because,
>> while it is a new C++ codebase, looks completely unreadable to
>> me, similarly libdw C++ stuff.  A carefully designed C based API
>> can be self documenting and make the code very readable as well,
>> often more so.
>
> If you just look at any decently sized code base, it'll look pretty
> much unreadable.  The question is how quickly someone who learns
> the base vocabulary can produce reasonable modifications.
>
> There are many places where C++ can help substantially.  For example:
>
> () The C++ postfix member function call syntax means that following
> a chain of attributes is a linear read of the expression.  With C
> function call syntax, you need to read the expression inside out.
>
> () C++ has both overloaded functions and member functions, so you can
> use the same verb to talk about several different kinds of objects.
> With C function names, we have to invent a new function name for
> each type.  Such names are longer and burden both the author and
> the reader of the code.
>
> () Standard C++ idioms enable mashing program components with ease.
> The C++ standard library is based on mixing and matching algorithms
> and data structures, via the common idiom of iterators.
>
> () The overloadable operator new means that memory can be
> _implicitly_ allocated in the right place.
>
> () Constructors and destructors reduce the number of places in the
> code where you need to do explicit memory management. Without garbage
> collection, leaks are less frequent.  With garbage collection, you
> have much less active garbage, and can run longer between collection
> runs.  Indeed, a conservative collector would be sufficient.
>
> () Constructors and destructors also neatly handle actions that
> must occur in pairs.  The classic example is mutex lock and unlock.
> Within GCC, timevar operations need to happen in pairs.
>
> () Class hierarchies (even without virtual functions) can directly
> represent type relationships, which means that a debugger dump of
> a C++ type has little unnecessary information, as opposed to the
> present union of structs approach with GCC trees.
>
> () Class hierarchies also mean that programmers can distinguish
> in the pointer types that a function needs a decl parameter,
> without having to say 'all trees' versus 'a very specific tree'.
> The static type checking avoids run-time bugs.
>
> I have written compilers in both C and C++.  I much prefer the
> latter.
>
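
To make the point about paired actions concrete, here is a minimal
sketch of a scope guard. All names below (timevar_push_sketch,
timevar_pop_sketch, auto_timevar_sketch) are hypothetical stand-ins
and not GCC's actual timevar interface; the stubs only record calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for a push/pop timer pair; they only log calls.
static std::vector<std::string> timevar_log;

void timevar_push_sketch(const std::string &name) {
  timevar_log.push_back("push " + name);
}

void timevar_pop_sketch(const std::string &name) {
  timevar_log.push_back("pop " + name);
}

// The constructor pushes and the destructor pops, so every exit path
// from the enclosing scope performs the matching pop.
class auto_timevar_sketch {
public:
  explicit auto_timevar_sketch(const std::string &name) : name_(name) {
    timevar_push_sketch(name_);
  }
  ~auto_timevar_sketch() { timevar_pop_sketch(name_); }

  // Copying would pop twice, so it is disabled.
  auto_timevar_sketch(const auto_timevar_sketch &) = delete;
  auto_timevar_sketch &operator=(const auto_timevar_sketch &) = delete;

private:
  std::string name_;
};

// An early return still triggers the pop via the destructor.
bool run_pass_sketch(bool bail_out_early) {
  auto_timevar_sketch timer("pass");
  if (bail_out_early)
    return false;
  return true;
}
```

In the C style, each return statement in run_pass_sketch would need
its own explicit pop call.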


What you said sounds mostly correct to me. But I think the big
benefit of C++ (or any other modern language that supports OO design)
is that it is more consistent with modern software engineering
practice: high cohesion and low coupling. C++ allows you to write
excellent code more easily than C. Actually, you don't need to write
C++ code to use C++.  I think you compiler guys should know very well
how each line of C++ code is translated to C code, just as C
programmers normally know very well how each line of C code is
translated to assembly code. So which language you use is not a big
deal. It is all about the methodology, the style. You can think in
C++, imagining the classes and objects in your mind, and use your
brain to translate this "in-brain" code to C++ or C code, whatever
you like.
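
As a rough illustration of that translation, the sketch below shows
one small class next to a hand-written C style equivalent. The names
Point and point_c are made up for the example, and the C style half
is only one plausible way to do the translation by hand.

```cpp
#include <cassert>

// C++ form: data and operation bundled in a class.
class Point {
public:
  Point(int x, int y) : x_(x), y_(y) {}
  int manhattan() const {
    return (x_ < 0 ? -x_ : x_) + (y_ < 0 ? -y_ : y_);
  }

private:
  int x_;
  int y_;
};

// C style form: a plain struct plus free functions that take an
// explicit "self" pointer where the C++ version has an implicit "this".
struct point_c {
  int x;
  int y;
};

void point_c_init(point_c *self, int x, int y) {
  self->x = x;
  self->y = y;
}

int point_c_manhattan(const point_c *self) {
  return (self->x < 0 ? -self->x : self->x) +
         (self->y < 0 ? -self->y : self->y);
}
```

Both halves keep the same cohesion: everything that touches the
coordinates lives next to the type. Only the C++ half has the
compiler enforce it through private members.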

The reason why GCC's code is very hard to hack is not simple. In part,
this is because GCC uses a very old, extremely hard to understand build
system. In part, this is because GCC developers are more focused on
fixing bugs or adding new features than on refactoring GCC's code
itself.  For example, for a .c file that is 15 years old, people tend
to patch its bugs, making it uglier and uglier, rather than rewrite
it.

But I think the big reason is that GCC tends to have extremely large
.c files, typically > 6000 LOC. If you look at LLVM, there are
rarely source files that are > 2000 LOC.  Typical LLVM source files
have 1000~2000 LOC.  Just splitting a source file of 6000 LOC into
several small files or file sections of 1000 LOC can improve the code
significantly.  Why has this not been done before?  That GCC
developers are reluctant to refactor their code may be the reason.
And, as a .c file grows, it becomes even harder to refactor.
Thinking in C++ can help you write smaller, easier to understand,
easier to maintain code (C or C++), which has high cohesion and low
coupling.

And I think the file names of GCC's sources could also be made more
friendly to newbies; using some notion of FQN (fully qualified name)
may be good.


As for the plug-in API, I think using a C style API is OK. Think of
the Win32 API: the API is C, but it supports the C++ notions of
object/encapsulation/polymorphism, so you can easily write a wrapper
API in C++, namely MFC. I mean, provide a C style API and provide a
C++ wrapper library for this API; then you can use both C and C++ in
your plug-in.
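
A minimal sketch of that layering, assuming a made-up "counter" API
(none of these names come from GCC or Win32): the C style API exposes
an opaque handle and free functions, and a thin C++ class wraps the
handle so that creation and destruction are paired automatically.

```cpp
#include <cassert>

// ---- C style API (hypothetical): opaque handle plus free functions ----
extern "C" {
typedef struct counter counter;  // layout is hidden from clients
counter *counter_create(int start);
void counter_add(counter *c, int amount);
int counter_value(const counter *c);
void counter_destroy(counter *c);
}

// ---- Implementation, normally kept in a separate source file ----
struct counter {
  int value;
};

extern "C" counter *counter_create(int start) { return new counter{start}; }
extern "C" void counter_add(counter *c, int amount) { c->value += amount; }
extern "C" int counter_value(const counter *c) { return c->value; }
extern "C" void counter_destroy(counter *c) { delete c; }

// ---- C++ wrapper: owns the handle and forwards to the C API ----
class Counter {
public:
  explicit Counter(int start) : handle_(counter_create(start)) {}
  ~Counter() { counter_destroy(handle_); }

  // The wrapper owns the handle, so copying is disabled.
  Counter(const Counter &) = delete;
  Counter &operator=(const Counter &) = delete;

  void add(int amount) { counter_add(handle_, amount); }
  int value() const { return counter_value(handle_); }

private:
  counter *handle_;
};

// Usage through the wrapper; the handle is released at scope exit.
int wrapper_demo() {
  Counter c(40);
  c.add(2);
  return c.value();
}
```

A plug-in written in C calls the counter_* functions directly, while
a plug-in written in C++ can use the Counter class on top of the same
API.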

As for experimenting with C++ in GCC, I suggest, at first, using C++
only in the internals of a pass implementation or a module, not
exposing C++ interfaces to other parts of the code. Namely, the
interfaces between modules are still C, but the implementations can
be written in either C or C++ or both.
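
One way this could look, as a sketch only: the function name
count_distinct_sketch is invented for the example. The declaration
uses plain C types and C linkage, so C callers are unaffected, while
the body is free to use the C++ standard library.

```cpp
#include <cassert>
#include <stddef.h>
#include <set>

// Hypothetical module interface: C types only, C linkage.
extern "C" int count_distinct_sketch(const int *values, size_t count);

// Implementation: C++ is used internally but does not leak out.
extern "C" int count_distinct_sketch(const int *values, size_t count) {
  try {
    // std::set keeps one copy of each value, so its size is the answer.
    std::set<int> seen(values, values + count);
    return static_cast<int>(seen.size());
  } catch (...) {
    // C callers cannot handle C++ exceptions, so report failure as -1.
    return -1;
  }
}
```

The catch-all matters here: an exception that escapes into C code
has no handler to land in, so the boundary function turns it into an
error code.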

And I predict that C++ will not have any positive impact on
performance (compile time or run time).

-- 
Chiheng Xu
