Kenneth Zadeck writes:
This quickly becomes
difficult and messy, which is presumably why the link-time proposal
allows the linker to "give up" linking two translation units.
The complexity of the type system handling in our proposal was
motivated primarily by two concerns:
1) We need to keep track of the types so that they are available for
debugging.
...
losing all the type information before you start did not seem the
correct plan.
This is exactly my point. The problem here is NOT the fact that the
optimization representation can't represent everything that the debug
information does. The problem is that your approach conflates two
completely separate pieces of information. Consider:
1. Debug information must represent the source-level program with 100%
fidelity. At link time, the debug information must be merged (and
perhaps optimized for size), but you do not need to merge declarations
or types across language boundaries, or in non-trivial cases.
2. An optimization representation need not (and in fact does not want to)
represent the program at the source-level. However, it *must* be able
to link declarations and types across modules and across languages
without exception (otherwise, you will miscompile the program).
Designing a representation where this is not practically possible
requires a back-off mechanism as your proposal has outlined.
To me, the correct solution to this problem is to not try to combine the
representations. Instead, allow the debug information to capture the
important information that it does well (e.g. types and declarations in a
language-specific way) and allow the optimization representation to
capture the semantics of the program in a way that is as useful for
optimization and codegen purposes as possible.
This approach is the one we have always taken with LLVM (except of course
that we have been missing debug info, because no one got around to
implementing it), which might explain some of the confusion around
"lacking high-level information".
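As a rough illustration of the split described above, here is a toy
sketch (not LLVM's actual data structures or API). It keeps the
source-level declaration, which is what debug info would describe,
separate from a lowered structural type that an optimizer could compare
across modules and languages. Every name here (SourceDecl, lower,
same_layout, the PRIMITIVES table) is hypothetical, and the bit widths
assume a typical target where a C int is 32 bits.

```python
from dataclasses import dataclass

# Lowered, language-neutral types: only what the optimizer needs.
@dataclass(frozen=True)
class Int:
    bits: int

@dataclass(frozen=True)
class Float:
    bits: int

@dataclass(frozen=True)
class Struct:
    fields: tuple  # lowered field types; source names are dropped

# Source-level view: what a debugger would want to show the user.
@dataclass(frozen=True)
class SourceDecl:
    language: str
    name: str
    fields: tuple  # ((field_name, source_type_name), ...)

# Hypothetical lowering tables (assumed target: 32-bit int, IEEE floats).
PRIMITIVES = {
    "c": {"int": Int(32), "float": Float(32), "double": Float(64)},
    "fortran": {"integer(4)": Int(32), "real(4)": Float(32),
                "real(8)": Float(64)},
}

def lower(decl):
    """Map a source-level declaration to its structural type."""
    table = PRIMITIVES[decl.language]
    return Struct(tuple(table[type_name] for _field, type_name in decl.fields))

def same_layout(a, b):
    """Declarations link if their lowered types are structurally equal."""
    return lower(a) == lower(b)

c_point = SourceDecl("c", "struct point",
                     (("x", "int"), ("y", "double")))
f_point = SourceDecl("fortran", "type(point_t)",
                     (("ix", "integer(4)"), ("y", "real(8)")))
c_other = SourceDecl("c", "struct other",
                     (("x", "float"), ("y", "double")))
```

In this sketch c_point and f_point differ as source-level declarations
but lower to the same structural type, so they can be merged without
any language-specific knowledge, while c_other cannot be merged with
either of them.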
I personally cannot guarantee that GCC (or for that matter any
optimizing compiler) can correctly cross inline and compile a program if
the types in one module are not consistent with the types in another
module. Just because the program happens to work correctly when
separately compiled is not enough.
This is a direct result of the representation that you are proposing to
use for IPA. LLVM is *always* capable of merging two translation units
correctly, no matter where they came from. We do this today. If you look
back to my 2003 GCC summit paper (Sec. 4.4), I mention that this is not a
trivial problem. :)
When Mark and I started working on this proposal (and later the
rest of the volunteers) we decided that this was not going to be
either an academic exercise or just something to run benchmarks.
I'm glad. While IMA is an interesting step in the right direction, it has
not seen widespread adoption for this reason. I'm glad that your goal is
to design something like LLVM, which always works.
What that means to me is that the link time optimizer needs to be
able to either generate correct code or give up in some predictable
manner. Having the compiler push forward and hope everything
turns out OK is not enough. Discretion is the better part
of valor.
I prefer to design the compiler so that neither 'giving up' nor 'hope' is
required. This is an easily solvable problem, one that LLVM has had right
for several years now.
I think that taking advantage of mixed C, C++ or C and Fortran
programs is going to be hard.
I don't agree.
But it is what the GCC customers want and there is a desire to
accommodate them if possible.
Outside benchmarks, many programs are made up of different language
components. There are of course the trivial cases (such as optimizing
across JNI/CNI/Java and C/C++ code), but many programs, particularly large
ones, have pieces written in multiple languages. I believe Toon was
recently talking about his large weather program written in Fortran and C
(though I could be confusing Toon's program with another one).
-Chris
--
http://nondot.org/sabre/
http://llvm.org/