Hi Philip,

Thanks for writing up your experiences; I found them very useful. I certainly like the idea of having such a thread every once in a while, just to keep everyone updated about our projects. I'm also curious to learn about the experiences of other students who are writing code for GCC for the first time. Here are mine:

I am Dimitrios Apostolou (jimis on IRC) and I live in Heraklion-Crete, Greece. My project concerns making GCC leaner and faster.

Reading the GCC codebase has been a hard exercise for me. In fact it's the only project I know of that becomes more and more difficult as time passes... I will try to describe some of the major hurdles I've faced so far.

I started by profiling the execution of cc1, the C compiler. In general I could find no big hot spot; it was in good shape. Still, I could see 3-4 areas that could make some difference if improved (for example hash tables, assembly output, the C parser, bitmaps). But diving in and trying to change things is a completely different story.

Minor tweaks are easy to make, but usually have minor impact. If you want to see a bigger speedup you have to break the interface of functions that are used in hundreds of places, and that is hard. Sometimes it was impossible for me: I was getting crashes in places far away from the code I had changed, so I ended up reverting to the original versions.

Spending some time with a specific part of GCC's codebase gives you the ability to dive deeper and work more efficiently. But that is usually the point at which I have achieved something and must move on to some other part. And the whole GCC codebase is so huge that understanding one part means nothing when you move to another. My advice here is that, if your project permits it, you touch as little of GCC's code as possible and become really proficient with that. Treat the rest as a black box, or you'll spend too much time trying to understand everything.

Another hurdle is the heavy use of macros. Even if they exist to make the code easier to read, I can't see how they achieve this in a few extreme cases. I have had gdb expand macros that fill 20 full lines on a wide screen. Plus the profiler can't actually profile code in macros, so the performance impact of some data structures is hidden that way. My moments of greatest awe/horror so far have been while changing things in vectors (vec.[ch]), which is actually a fully templated structure implemented in the C preprocessor (CPP)!
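
To give an idea of what "a templated structure implemented in CPP" looks like, here is a minimal sketch of the general technique. It is not the real vec.h: DEFINE_VEC, int_vec and int_vec_demo_sum are names I made up for illustration, and the real thing is far more elaborate. Note that the whole body of each generated function comes from a single macro invocation, which is part of why line-based tools have a hard time with this style.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical macro "template": stamps out a growable vector of T
   together with its push and free functions.  All names here are
   invented for this sketch and are not GCC's vec.h API.  */
#define DEFINE_VEC(T, NAME)                                   \
  typedef struct { T *data; size_t len, cap; } NAME;          \
  static int NAME##_push (NAME *v, T x)                       \
  {                                                           \
    if (v->len == v->cap)                                     \
      {                                                       \
        size_t ncap = v->cap ? v->cap * 2 : 4;                \
        T *p = (T *) realloc (v->data, ncap * sizeof (T));    \
        if (!p)                                               \
          return -1;                                          \
        v->data = p;                                          \
        v->cap = ncap;                                        \
      }                                                       \
    v->data[v->len++] = x;                                    \
    return 0;                                                 \
  }                                                           \
  static void NAME##_free (NAME *v)                           \
  {                                                           \
    free (v->data);                                           \
    v->data = NULL;                                           \
    v->len = v->cap = 0;                                      \
  }

/* One "instantiation": defines int_vec, int_vec_push, int_vec_free.  */
DEFINE_VEC (int, int_vec)

/* Pushes 1..10 and returns their sum, or -1 if allocation fails.  */
int
int_vec_demo_sum (void)
{
  int_vec v = { NULL, 0, 0 };
  int sum = 0;

  for (int i = 1; i <= 10; i++)
    if (int_vec_push (&v, i) != 0)
      {
        int_vec_free (&v);
        return -1;
      }
  for (size_t i = 0; i < v.len; i++)
    sum += v.data[i];
  int_vec_free (&v);
  return sum;
}
```

Every new element type needs its own DEFINE_VEC line, and changing anything inside the macro silently changes every instantiation at once, which is roughly the kind of thing that made my changes there so nerve-racking.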

Finally, I believe that some parts of the compiler should have a big NO-ENTRY flag for beginners. In my case, after having improved small things in assembly output and hash tables, I decided, driven by the profiler's output, to try improving things in the dataflow analysis part of GCC. It's true that there is much to be improved there, but it requires a good understanding of this complex part. Three weeks later I am still struggling to change simple things and jump to the next part, but the regressions I've introduced don't allow me to do so yet. My understanding of this part is still basic; I have only scratched the surface of dataflow analysis. Had I known this at the beginning, I would probably have left that part for the end of the summer, if at all. My plans included visiting IRA (the register allocator) next, but I think I'll skip directly to the C parser, which I understand better.


These are my major difficulties with GCC; I'm curious to learn about other students' experiences so far. Of course, don't get the wrong impression: my general feeling about GCC development is positive, and the community is helpful and really friendly in spite of my daily spamming on IRC. :-p

In the end I feel the fact that GCC is a multi-headed monster makes it even more exciting to try and tame it.


Good luck with everyone's projects,
Dimitris
