Several people have asked for an update on the status of the LTO project, so Kenny and I have put together a summary of what we believe the status and remaining issues to be.
The short story is that, unfortunately, we have not had as much time as we would have liked to make progress on LTO. Kenny has been working on the dataflow project, and I have had a lot of other things on my plate as well. So -- as always! -- we would be delighted to have other people helping out. (One kind person asked me if contributing to LTO would hurt CodeSourcery by potentially depriving us of contracts. I doubt that very much, and I certainly don't think that should stop anybody from contributing to LTO!)

I still think that LTO is a very important project, and that the design outline we have is sound. I think that a relatively small amount of work (measured in terms of person-months) is required to get us to being able to handle most of C.

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
Introduction

This document summarizes work remaining in the LTO front end to achieve the initial goal of correct operation on single-file C programs.

Changes to the DWARF Reader Required

The known limitations of the DWARF reader are as follows:

* 'LTO_READ_TYPE' must be modified to byte-swap data, as required when cross compiling for targets with different endianness.
* The DWARF reader skips around in the DWARF tree to read types. It's possible that this will not work in situations with complex nesting, or that a fixup will be required later when the DIE is encountered again, during the normal walk.
* Function-scope static variables are not handled.
* Once more information about types is saved (see below), references to layout_type, etc., should be removed or modified, so that the saved data is not overwritten.
* Unprototyped functions are not handled.

DWARF Extensions Required

The following sections summarize augmentations we must make to the DWARF generated by GCC.

GNU Attributes

Semantic GNU attributes (e.g., dllimport) are not recorded in DWARF. Therefore, this information is lost.

Type Information

At present, the LTO front end recomputes some type attributes (like machine modes). However, there is no guarantee that the type attributes that are computed will match those in the original program. Because there is presently no method for encoding this information in DWARF, we need to take advantage of DWARF's extensibility features to add these representations. The type attributes which require DWARF extensions are:

* Type alignment
* Machine mode

Declaration Flags

There are some flags on 'FUNCTION_DECL' and 'VAR_DECL' nodes that may need to be preserved. Some may be implied by GNU attributes, but others are not. Here are the flags that should be preserved.

Functions and Variables:

* 'DECL_SECTION_NAME'
* 'DECL_VISIBILITY'
* 'DECL_ONE_ONLY'
* 'DECL_COMDAT'
* 'DECL_WEAK'
* 'DECL_DLLIMPORT_P'
* 'DECL_ASSEMBLER_NAME'

Functions:

* 'DECL_UNINLINABLE'
* 'DECL_IS_MALLOC'
* 'DECL_IS_RETURNS_TWICE'
* 'DECL_IS_PURE'
* 'DECL_IS_NOVOPS'
* 'DECL_STATIC_CONSTRUCTOR'
* 'DECL_STATIC_DESTRUCTOR'
* 'DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT'
* 'DECL_NO_LIMIT_STACK'
* 'DECL_NO_STATIC_CHAIN'
* 'DECL_INLINE'

Variables:

* 'DECL_HARD_REGISTER'
* 'DECL_HAS_INIT_PRIORITY'
* 'DECL_INIT_PRIORITY'
* 'DECL_TLS_MODEL'
* 'DECL_THREAD_LOCAL_P'
* 'DECL_IN_TEXT_SECTION'
* 'DECL_COMMON'

Gimple Reader and Writer

Current Status

All gimple forms except for those related to gomp are now handled. It is believed that this code is mostly correct. The lto reader and the writer logically work after the ipa_cp pass. At this point, the program has been fully gimplified and is in fact in "low gimple".

The reader is currently able to read in and recreate gimple and the control flow graph. Much of the EH handling code has been written but not tested.

The reader and writer can be compiled in a self-checking mode, in which the writer writes a text log of what it is serializing into the object file. The lto reader uses the same logging library to produce a log of what it is reading. During reading, the process aborts if the logs get out of sync.

Much of the code to serialize the cfun has not been written or tested. Without this part of the code, nothing can be executed downstream of the reader without hitting an internal compiler error (ICE).

Remaining Work

The ipa reader works as if it were just another front end. While this is clearly the correct approach, it will require some additional engineering to actually work. Front ends are expected to connect to the system much earlier in the stack of passes than low gimple. Some mechanism needs to be added so that the passes up to pass ipa_cp are skipped by the lto front end.
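
To make the self-checking mode described under Current Status more concrete, here is a minimal sketch of the lock-step comparison. It is not taken from the LTO sources: the function names and the record strings are hypothetical, and it compares two complete logs after the fact, whereas the real reader checks as it goes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the LTO sources: return the 1-based
   number of the first record on which the two logs disagree, or 0 if
   they match record for record.  A length mismatch counts as a
   disagreement at the first record that only one log has.  */
size_t
first_log_mismatch (const char *const *writer_log, size_t n_writer,
                    const char *const *reader_log, size_t n_reader)
{
  size_t n = n_writer < n_reader ? n_writer : n_reader;

  for (size_t i = 0; i < n; i++)
    if (strcmp (writer_log[i], reader_log[i]) != 0)
      return i + 1;

  return n_writer == n_reader ? 0 : n + 1;
}

/* Abort when the logs diverge, mirroring the behavior described for
   the reader in self-checking mode.  */
void
check_logs_in_sync (const char *const *writer_log, size_t n_writer,
                    const char *const *reader_log, size_t n_reader)
{
  size_t bad = first_log_mismatch (writer_log, n_writer,
                                   reader_log, n_reader);
  if (bad != 0)
    {
      fprintf (stderr, "logs out of sync at record %lu\n",
               (unsigned long) bad);
      abort ();
    }
}
```

The point of failing at the first divergent record is that it names the construct the reader and writer disagree about, rather than leaving a corrupted tree to fail somewhere downstream.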

When work on the lto reader and writer was paused, only a small amount of the type system was actually being gimplified. This meant that it was impossible to test much of the machinery on anything except the simplest examples. There are certainly bugs that have not been detected because of this.

The front ends still generate some rtl. I do not know which front ends are still doing this, but there are no plans to serialize the rtl. Any front end that does this will have to be fixed before it can participate in lto.

Likewise, fields in the cfun like "struct language_function * language;" are extremely problematic. There are still lang hooks that must be replaced with explicit semantics in the IL.
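
As a footnote to the endianness item in the DWARF reader list: the byte-swapping itself is simple. Below is a minimal sketch, assuming fixed-width 16-bit and 32-bit fields. The helper names are hypothetical, and how they would be wired into 'LTO_READ_TYPE' is not shown, since that macro's definition is not reproduced here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not from the LTO sources.  */

/* Reverse the two bytes of V.  */
uint16_t
swap_u16 (uint16_t v)
{
  return (uint16_t) ((v << 8) | (v >> 8));
}

/* Reverse the four bytes of V.  */
uint32_t
swap_u32 (uint32_t v)
{
  return ((v & 0x000000FFu) << 24)
         | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)
         | ((v & 0xFF000000u) >> 24);
}

/* Read a 32-bit value from BUF in host byte order, then swap it if
   NEED_SWAP is nonzero (that is, if the data was written for a target
   whose endianness differs from the host's).  memcpy avoids unaligned
   access.  */
uint32_t
read_u32 (const unsigned char *buf, int need_swap)
{
  uint32_t v;

  memcpy (&v, buf, sizeof v);
  return need_swap ? swap_u32 (v) : v;
}
```

A reader built this way would decide need_swap once per object file, from the file's recorded byte order, rather than per field.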