Re: LLVM/GCC Integration Proposal

Chris Lattner Sat, 19 Nov 2005 14:22:30 -0800


Kenneth Zadeck writes:

This quickly becomes
difficult and messy, which is presumably why the link-time proposal
allows the linker to "give up" linking two translation units.

The reason for the complexity of the type system handling in our
proposal was motivated primarily by two concerns:

1) We need to keep track of the types so that they are available for
   debugging.

...

losing all the type information before you start did not seem the
correct plan.

This is exactly my point. The problem here is NOT the fact that the
optimization representation can't represent everything that the debug
information does. The problem is that your approach conflates two
completely separate pieces of information. Consider:


1. Debug information must represent the source-level program with 100%
   fidelity.  At link time, the debug information must be merged (and
   perhaps optimized for size), but you do not need to merge declarations
   or types across language boundaries, or in non-trivial cases.
2. An optimization representation need not (and in fact does not want to)
   represent the program at the source-level.  However, it *must* be able
   to link declarations and types across modules and across languages
   without exception (otherwise, you will miscompile the program).
   Designing a representation where this is not practically possible
   requires a back-off mechanism as your proposal has outlined.

To me, the correct solution to this problem is to not try to combine the
representations. Instead, allow the debug information to capture the
important information that it does well (e.g. types and declarations in a
language-specific way) and allow the optimization representation to
capture the semantics of the program in a way that is as useful for
optimization and codegen purposes as possible.
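
As a rough illustration of this split (not LLVM's or GCC's actual data
structures), here is a minimal Python sketch. The names SourceType,
LowType and lower are hypothetical, and the mapping of Fortran INTEGER to
a 32-bit integer is an assumption made only for the example. It shows two
declarations that debug information must keep distinct, but that a
layout-only optimization representation can treat as the same type:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SourceType:
    """Debug-info view (hypothetical): language, source name, named fields."""
    language: str
    name: str
    fields: Tuple[Tuple[str, str], ...]  # (field name, source-level type spelling)


@dataclass(frozen=True)
class LowType:
    """Optimizer view (hypothetical): layout only, no names, no language."""
    kind: str                            # "int", "float", or "struct"
    bits: int = 0                        # width for scalar kinds
    members: Tuple["LowType", ...] = ()  # member layouts for "struct"


# Assumed scalar lowering table; a real front end would consult the target ABI.
SCALARS = {
    "int": LowType("int", 32),
    "INTEGER": LowType("int", 32),
    "double": LowType("float", 64),
    "DOUBLE PRECISION": LowType("float", 64),
}


def lower(src: SourceType) -> LowType:
    """Discard names and language; keep only the member layout."""
    return LowType("struct", 0, tuple(SCALARS[spelling] for _, spelling in src.fields))


c_point = SourceType("C", "struct point", (("x", "int"), ("weight", "double")))
f_point = SourceType("Fortran", "POINT", (("IX", "INTEGER"), ("W", "DOUBLE PRECISION")))

# The debug-info records differ and never need to be merged across languages.
assert c_point != f_point
# The lowered types compare equal, so the two modules can be linked on this type.
assert lower(c_point) == lower(f_point)
```

The frozen dataclasses compare by value, so equality on LowType is
structural: two types match whenever their layouts match, regardless of
what the source languages called them.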

This approach is the one we have always taken with LLVM (except of course
that we have been missing debug info, because no one got around to
implementing it), which might explain some of the confusion around
"lacking high-level information".

I personally cannot guarantee that GCC (or for that matter any
optimizing compiler) can correctly cross inline and compile a program if
the types in one module are not consistent with the types in another
module. Just because the program happens to work correctly when
separately compiled is not enough.

This is a direct result of the representation that you are proposing to
use for IPA. LLVM is *always* capable of merging two translation units
correctly, no matter where they came from. We do this today. If you look
back to my 2003 GCC summit paper (Sec. 4.4), I mention the fact that this
is not a trivial problem. :)

When Mark and I started working on this proposal (and later the
rest of the volunteers) we decided that this was not going to be
either an academic exercise or just something to run benchmarks.

I'm glad. While IMA is an interesting step in the right direction, it has
not seen widespread adoption for this reason. I'm glad that your goal is
to design something like LLVM, which always works.

What that means to me is that the link time optimizer needs to be
able to either generate correct code or give up in some predictable
manner.  Having the compiler push forward and hope everything
turns out OK is not enough.  Discretion is the better part
of valor.

I prefer to design the compiler so that neither 'giving up' nor 'hope' is
required. This is an easily solvable problem, one that LLVM has had right
for several years now.

I think that taking advantage of mixed C, C++ or C and Fortran
programs is going to be hard.


I don't agree.

But it is what the GCC customers want and there is a desire to
accommodate them if possible.

Outside benchmarks, many programs are made up of different language
components. There are of course the trivial cases (such as optimizing
across JNI/CNI/Java and C/C++ code), but many programs, particularly large
ones, have pieces written in multiple languages. I believe Toon was
recently talking about his large weather program written in Fortran and C
(though I could be confusing Toon's program with another one).


-Chris

--
http://nondot.org/sabre/
http://llvm.org/
