Hi,

  I spent the week-end trying to get GCC -- mainline -- compilable
(i.e. those components written in C) with a C++ compiler (e.g. g++).

My summary is:  It is largely doable and it is within our reach at this
point of development.  More specifically, I successfully got all the
files necessary to build a native GNU C compiler on i686-pc-linux-gnu
to compile as C++.  An attempt to get the GNU C++ compiler through the
same massage is underway (but I'm going to bed shortly ;-)).

I think this project is beneficial to GCC for several reasons:

  (1) for testing purposes, we can use a compiler with stricter type
      checking.

  (2) there have been lots of discussions about more static typing in
      the data structures, but so far we haven't made anything
      concrete, partly because we need this sort of preliminary
      preparation of the source tree.  We can have infinite debates
      about the merits of such approaches, but I think the way to know
      is to do actual experiments, and we had better start making that
      possible now.

  (3) It might open the door for more contributions and foster more
      free software based on GCC.

  (4) <insert your favorite reasons why you would like to see this happen>.


What have I learnt from this little experience?  Well, the source code
seems to have been carefully written to make sure that no lunatic
(e.g. the author of this writing) will succeed in feeding a C++
compiler with GCC :-)

The first resistance seems to come from the pervasive use of the implicit
conversion void* -> T*, mostly with storage allocating functions.
We've recently introduced C++ friendly macros in libiberty, but we
have yet to take advantage of them.  We should start now.
(I also noted a happy confusion about the argument order of the
function [x]calloc(), but it is mostly harmless as everything
"multiplies nicely" in the end and we don't get burned by strict alignment
issues).  We should generalize the notation for GGC allocators and
alloca(). 
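
A minimal sketch of the typed-allocation idea, for illustration only.  The
macro names ALLOC_ONE, ALLOC_VEC and ALLOC_ZERO_VEC, the struct and the
function are hypothetical and are not the libiberty macros themselves; they
wrap plain malloc()/calloc() so that the cast lives in one place and the
call sites compile as both C and C++:

```cpp
#include <stdlib.h>

/* Hypothetical typed-allocation macros (names invented for this sketch).
   The cast from void* is written once, inside the macro.  */
#define ALLOC_ONE(T)          ((T *) malloc (sizeof (T)))
#define ALLOC_VEC(T, N)       ((T *) malloc (sizeof (T) * (N)))
/* calloc takes (number of elements, element size), in that order.  */
#define ALLOC_ZERO_VEC(T, N)  ((T *) calloc ((N), sizeof (T)))

struct point { int x; int y; };

/* Returns a zero-initialized array of N points, or NULL on failure.
   The caller releases it with free().  */
struct point *
make_points (size_t n)
{
  /* In C, `struct point *p = calloc (n, sizeof *p);` is accepted.
     In C++ the same line is rejected, because void* does not convert
     to struct point* implicitly.  */
  return ALLOC_ZERO_VEC (struct point, n);
}
```

The same pattern could presumably be applied to wrappers around the GGC
allocators and alloca(), which is what generalizing the notation would
amount to.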


The second resistance is the pervasive use of C++ keywords (e.g. new,
class, template, try, catch, ...).  The first three are quite
frequent in the middle-end.


Third, there is some "type-punning" with enums, int and unsigned int,
where the middle-end (mostly) relies on implicit conversion from int
to enums.  That is a bit annoying but could be avoided as most of the
time, we do have names for those integer constants.  For example, we
should be using EXPAND_NORMAL instead of 0, or VOIDmode, instead of
0, TV_TOTAL instead of 0, etc.  At this point, I should also note that
implicit conversions between enums (c_tree_code <-> tree_code, or
rtx_code  <-> reg_note, etc.) are not supported in C++.  So, we should
probably arrange to make the relationship (mostly subsetting) between
them more explicit, as opposed to throwing in casts.  Also, there are a
few cases where we want to iterate over all the values of enumerations.
I've shamelessly used the following macros:

   #define NEXT(E)  ((__typeof__(E)) (E + 1))
   #define PREV(E)  ((__typeof__(E)) (E - 1))
   #define DECR(E)  (E = (__typeof__(E)) (E - 1))
   #define INCR(E)  (E = (__typeof__(E)) (E + 1))
   #define IOR(A,B) ((__typeof__(A)) (A | B))
   #define AND(A,B) ((__typeof__(A)) (A & B))
   #define XOR(A,B) ((__typeof__(A)) (A ^ B))

but I'm not suggesting that as a real replacement; just reporting the
dirty tricks I did and I'm looking for better suggestions.
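
One possible alternative that avoids __typeof__, offered only as a sketch:
a small typed helper per enumeration, so the cast is written once and the
call sites stay cast-free.  The enums and function names below are made up
for the example and do not exist in GCC:

```cpp
/* Hypothetical enumeration with a sentinel marking the end.  */
enum color { RED, GREEN, BLUE, COLOR_LAST };

/* Step to the next enumerator; the int -> enum cast is explicit,
   which is what C++ requires.  */
static inline enum color
color_next (enum color c)
{
  return (enum color) (c + 1);
}

/* Hypothetical bit-flag enumeration.  */
enum flags { F_NONE = 0, F_A = 1, F_B = 2 };

/* Bitwise OR that yields the enum type again instead of int.  */
static inline enum flags
flags_or (enum flags a, enum flags b)
{
  return (enum flags) (a | b);
}

/* Iterates over every real enumerator of `enum color' and counts them.
   Note the loop starts at RED, not at the literal 0.  */
int
count_colors (void)
{
  int n = 0;
  enum color c;
  for (c = RED; c < COLOR_LAST; c = color_next (c))
    n++;
  return n;
}
```

This costs one helper per enumeration, but it works with any C or C++
compiler and keeps the unchecked conversion in a single, greppable place.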


Fourth, it appears that we're implicitly using C99's semantics of
"extern inline" in our source -- when we have a pure C90 compiler that
does not understand "inline", we just #define inline to nothing so we
don't get into trouble.  With a C++ compiler, we're in trouble because
an inline function needs to be defined in every translation unit where
it is used.  So, I either changed the affected functions to "static
inline" or just made them non-inline (cases are in hashtable.c and
toplev.c).
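
As a rough illustration of the "static inline" route (the function names
are invented for the example, not taken from hashtable.c or toplev.c): a
helper that would normally sit in a header is given internal linkage, so
every translation unit that includes it gets its own definition and both
C and C++ compilers are satisfied.

```cpp
/* Would typically live in a header.  `static' gives each translation
   unit its own private copy, so no out-of-line definition is needed
   anywhere else.  */
static inline unsigned int
hash_combine (unsigned int h, unsigned int v)
{
  return h * 31u + v;
}

/* An ordinary external function that uses the helper.  */
unsigned int
hash_pair (unsigned int a, unsigned int b)
{
  return hash_combine (hash_combine (17u, a), b);
}
```

The price is possible code duplication when the compiler chooses not to
inline, which is why making a large function plain non-inline can be the
better choice.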


Fifth, there is a slight difference between "const" in C and in C++.
In C++, a const variable at namespace scope implicitly has internal
linkage; so a C++ compiler tends to optimize it out when its address
is not taken (so no storage is wasted).  This is an issue for the
objects automatically generated by the gengtype support machinery.
They are supposed to have external linkage, so we need to explicitly
say "extern" in their definitions.
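
A small sketch of the linkage point.  The struct and table names are
hypothetical and only mimic the shape of a generated table; they are not
what gengtype actually emits:

```cpp
/* Hypothetical table entry type.  */
struct root_entry { const char *name; int size; };

/* Declaration, as it would appear in a header.  */
extern const struct root_entry example_roots[];

/* Definition.  Without `extern' (and without the declaration above in
   scope), C++ would give this const object internal linkage, and other
   translation units could not see it.  */
extern const struct root_entry example_roots[] = {
  { "first", 1 },
  { "second", 2 },
  { 0, 0 }   /* terminator */
};

/* Counts entries up to the terminator.  */
int
count_roots (void)
{
  int n = 0;
  while (example_roots[n].name != 0)
    n++;
  return n;
}
```

Note that GCC in C mode warns about an initialized object declared
"extern", so having the extern declaration visible before a plain const
definition is another way to get external linkage in both languages.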


Sixth, there is a real "mess" about name spaces.  It is true that
every C programmer knows the rule saying tags inhabit a different name
space than variables or functions.  However, all the C coding standards
I've read so far usually suggest

   typedef struct foo foo;

but *not*

   typedef struct foo *foo;

i.e. "bringing" the tag-name into the normal name space to name the
structure or enumeration type is OK, but not naming a different type!
The latter practice will be flagged by a C++ compiler.  I guess we may
need some discussion about the naming of structures (POSIX reserves
anything ending with "_t", so we might want to choose something so
that we don't run into problems.  However, I do not expect this issue
to dominate the discussion :-))
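
To make the distinction concrete, here is a sketch with made-up names
(node, node_ptr and list_length are purely illustrative, not a naming
proposal):

```cpp
#include <stddef.h>

/* Accepted by both C and C++: the typedef names the same type as the tag.  */
typedef struct node node;

struct node
{
  int value;
  node *next;
};

/* `typedef struct node *node;' would be legal C, but a C++ compiler
   rejects it because `node' would then name two different types.
   Giving the pointer type its own name avoids the clash.  */
typedef struct node *node_ptr;

/* Returns the number of elements in a NULL-terminated list.  */
int
list_length (node_ptr head)
{
  int n = 0;
  for (; head != NULL; head = head->next)
    n++;
  return n;
}
```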


So, if the various component maintainers (e.g. C and C++, middle-end,
ports, etc.) are willing to help by quickly reviewing patches, we can
have this done this week (assuming mainline is unslushed soon).
And, of course, everybody can help :-)

Thanks,


-- 
                                                        Gabriel Dos Reis
                                                         [EMAIL PROTECTED]
        Texas A&M University -- Department of Computer Science
        301, Bright Building -- College Station, TX 77843-3112
