Kenneth Zadeck writes:
This quickly becomes
difficult and messy, which is presumably why the link-time proposal
allows the linker to "give up" linking two translation units.

The reason for the complexity of the type system handling in our
proposal was motivated primarily by two concerns:

1) We need to keep track of the types so that they are available for
   debugging.
...
losing all the type information before you start did not seem the correct plan.

This is exactly my point. The problem here is NOT the fact that the optimization representation can't represent everything that the debug information does. The problem is that your approach conflates two completely separate pieces of information. Consider:

1. Debug information must represent the source-level program with 100%
   fidelity.  At link time, the debug information must be merged (and
   perhaps optimized for size), but you do not need to merge declarations
   or types across language boundaries, or in non-trivial cases.
2. An optimization representation need not (and in fact does not want to)
   represent the program at the source-level.  However, it *must* be able
   to link declarations and types across modules and across languages
   without exception (otherwise, you will miscompile the program).
   Designing a representation where this is not practically possible
   requires a back-off mechanism as your proposal has outlined.

To me, the correct solution to this problem is to not try to combine the representations. Instead, allow the debug information to capture the important information that it does well (e.g. types and declarations in a language-specific way) and allow the optimization representation to capture the semantics of the program in a way that is as useful for optimization and codegen purposes as possible.
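
To make the split concrete, here is a minimal sketch of the idea, not LLVM's actual implementation. All names in it (IntType, StructType, DebugType, lower_record) are hypothetical and exist only for illustration. It assumes the optimization representation describes a record purely by its layout, while the debug record keeps the language-specific names on the side:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    """Optimizer-level integer type: only the bit width matters."""
    bits: int


@dataclass(frozen=True)
class StructType:
    """Optimizer-level record type: field types only, no source names."""
    fields: tuple


@dataclass(frozen=True)
class DebugType:
    """Source-level description, kept separately for the debugger."""
    language: str
    name: str
    field_names: tuple


def lower_record(language, name, fields):
    """Split a source record into (optimizer type, debug type).

    `fields` is a list of (field_name, optimizer_type) pairs.
    """
    ir_type = StructType(tuple(ty for _, ty in fields))
    debug_type = DebugType(language, name, tuple(n for n, _ in fields))
    return ir_type, debug_type


# The same layout declared in two languages with unrelated names.
c_ir, c_dbg = lower_record(
    "C", "struct point", [("x", IntType(32)), ("y", IntType(32))])
f_ir, f_dbg = lower_record(
    "Fortran", "point_t", [("px", IntType(32)), ("py", IntType(32))])

# The optimizer sees one type, so cross-module linking has nothing to
# reconcile; the debug records stay distinct and language-specific.
assert c_ir == f_ir
assert c_dbg != f_dbg
```

Because the optimizer-level types compare by structure alone in this sketch, the two modules merge without any language-aware reconciliation, and the debug records never need to be unified across the language boundary.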

This approach is the one we have always taken with LLVM (except of course that we have been missing debug info, because no one got around to implementing it), which might explain some of the confusion around "lacking high-level information".

I personally cannot guarantee that GCC (or for that matter any optimizing compiler) can correctly cross inline and compile a program if the types in one module are not consistent with the types in another module. Just because the program happens to work correctly when separately compiled is not enough.

This is a direct result of the representation that you are proposing to use for IPA. LLVM is *always* capable of merging two translation units correctly, no matter where they came from. We do this today. If you look back to my 2003 GCC summit paper (Sec. 4.4), I mention the fact that this is not a trivial problem. :)

When Mark and I started working on this proposal (and later the
rest of the volunteers) we decided that this was not going to be
either an academic exercise or just something to run benchmarks.

I'm glad. While IMA is an interesting step in the right direction, it has not seen widespread adoption for this reason. I'm glad that your goal is to design something like LLVM, which always works.

What that means to me is that the link time optimizer needs to be
able to either generate correct code or give up in some predictable
manner.  Having the compiler push forward and hope everything
turns out OK is not enough.  Discretion is the better part
of valor.

I prefer to design the compiler so that neither 'giving up' nor 'hope' is required. This is an easily solvable problem, one that LLVM has had right for several years now.

I think that taking advantage of mixed C, C++ or C and Fortran
programs is going to be hard.

I don't agree.

But it is what the GCC customers want and there is a desire to accommodate them if possible.

Outside benchmarks, many programs are made up of different language components. There are of course the trivial cases (such as optimizing across JNI/CNI/Java and C/C++ code), but many programs, particularly large ones, have pieces written in multiple languages. I believe Toon was recently talking about his large weather program written in Fortran and C (though I could be confusing Toon's program with another one).

-Chris

--
http://nondot.org/sabre/
http://llvm.org/
