Much of the documentation from design documents, papers and presentations needs to be ported to the internals manual.
Also, from http://gcc.gnu.org/ml/gcc-patches/2009-09/msg02134.html

(A) Details in the patch.
(B) Details missing from the patch.
(C) Lack of general explanation of how to use LTO and what should or should not work.

Comments (A):

> +Enable link-time optimization (LTO). This is enabled by default if a

You mean, build that support into the compiler (not enable it by default once built in).

> +working libelf implemetnation is found (see @option{--with-libelf}).

"implementation"

> diff -rdupN --exclude=.svn --exclude=.git --exclude='*.diff*'
> --exclude='autom4te*' --exclude=tags --exclude=ChangeLog.lto
> --exclude=configure
> /usr/local/google/homedirs/dnovillo/gcc/trunk/gcc/doc/invoke.texi
> /usr/local/google/homedirs/dnovillo/gcc/trunk.lto/gcc/doc/invoke.texi
> --- /usr/local/google/homedirs/dnovillo/gcc/trunk/gcc/doc/invoke.texi
> 2009-09-25 15:23:18.000000000 -0400
> +++ /usr/local/google/homedirs/dnovillo/gcc/trunk.lto/gcc/doc/invoke.texi
> 2009-09-25 11:21:16.000000000 -0400
> @@ -349,7 +349,7 @@ Objective-C and Objective-C++ Dialects}.
> -fno-ira-share-spill-slots -fira-verbo...@var{n} @gol
> -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
> -floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
> --floop-parallelize-all @gol
> +-floop-parallelize-all -fltrans -fltrans-output-list @gol
> -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
> -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
> -fmudflapir -fmudflapth -fno-branch-count-reg -fno-default-inline @gol
> @@ -389,7 +389,7 @@ Objective-C and Objective-C++ Dialects}.
> -funit-at-a-time -funroll-all-loops -funroll-loops @gol
> -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
> -fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
> --fwhole-program @gol
> +-fwhole-program -fwpa @gol
> --param @var{nam...@var{value}
> -O -O0 -O1 -O2 -O3 -Os}

The options -flto -fwhopr -flto-compression-level need adding to the summary list of options.

> +...@item -fwhopr
> +...@opindex fwhopr
> +This option is similar to @option{-flto} but it differs in how
> +the final link stage is executed. Instead of loading all the
> +function bodies in memory, the callgraph is analyzed and
> +optimization decisions are made (whole program analysis or WPA).
> +Once optimization decisions are made, the callgraph is
> +partitioned and the different sections are compiled separately
> +(local transformations or LTRANS). This process allows

"LTRANS)@." for correct spacing.

> +Disabled by default. This option is only supported by the LTO frontend.

"front end"

> +Disabled by default. This option is only supported by the LTO frontend.

"front end"

> +...@item -flto-report
> +This option is only useful when processing object files in LTO
> +mode (via -fwhopr or -flto).

@option{-fwhopr}, @option{-flto}.

Comments (B):

There are three new configure options, --with-libelf, --with-libelf-include and --with-libelf-lib, added by patch 2. All of these need documenting alongside the other configure options; a cross-reference to --with-libelf from the list of prerequisites is not sufficient, the option itself needs documenting.

--enable-gold is effectively a new configure option to enable the plugin, and needs documenting as such. (My understanding is that the plugin might in principle be usable with other linkers rather than being strongly tied to gold only, so that isn't necessarily the best spelling of the configure option.)

I would expect the new directories to be documented in sourcebuild.texi.
I would expect information about LTO contributors to be added to contrib.texi.

I would think the LTO functionality needs one or more maintainers or reviewers appointed by the SC, if they haven't already been appointed, who should be added to MAINTAINERS.

There should be some explicit statement in install.texi that this functionality is only supported for ELF targets.

The documentation of -flto-report in invoke.texi is clearly inadequate. Documentation should say what the option does; "only useful when" information may be appropriate as part of the documentation, but not the whole as it is at present.

Comments (C):

The user documentation fails to address LTO from a user's perspective. It describes implementation details, but does not explain how or when to use or not to use the functionality. Let's consider the options documented:

* -flto, -fwhopr, -fwpa, -fltrans: the documentation at least says something about what the options do. It says nothing about whether they are to be passed at compile time, link time, or both, and how passing different combinations of options at different times interacts.

What does "only supported by the LTO frontend" mean from the user's perspective? The very idea that LTO is a front end is the implementation perspective. The user compiles C, or C++, or Fortran, or some other language, using the respective front ends for those languages (maybe with special options, if they are compile-time options, but if an LTO front end is involved that's an implementation detail). They then run the compiler (driver) to link the objects (maybe with special options, if they are link-time options) and the compiler and linker do something with the previously generated objects. Or they run a single command to compile and link. In any case, the concept of an LTO front end is irrelevant to the user. If you mean that the option is only supported when linking, say that.
If you mean that the option is only used internally by the compiler and should not be passed directly by the user, say that. In any case, make it clear when the user might wish to pass each option.

* -fltrans-output-li...@var{file}: why would the user want a file "to which the names of LTRANS output files are written"? How would they use such a file after generating it? What format is it? Where does it go if this option is not specified and why would the user need to change this? If it's an implementation detail and the user doesn't generally need to care, explain that. And what are "LTRANS output files"? Where do they go? Does the compiler clean them up or does the user need to do that?

* -flto-compression-lev...@var{n}: at least a user can reasonably see there is a speed/space trade-off in compression. But this should not be referenced to a particular host-side library that happens to be used by GCC right now. The documentation should explain the semantics to the user directly: 1 for fastest, 9 for smallest, 0 for no compression.

* -flto-report: as noted above, the documentation of this option is completely semantics-free and so says nothing whatever of use to a GCC user.

So much for the individual options. What about the story for users? Users should be able to read the user manual, and from it get a clear idea of how to use LTO and what cases will work, what will give an error (or sorry ()) as not being supported, what will ICE and what will quietly behave incorrectly.

Here are some examples of questions about LTO. They certainly don't all need to have the answer "this will work perfectly" - though giving an error is always strongly to be preferred to an ICE or quiet wrong code - but it should be clear to users from the manual what will or will not work. And if something won't work but is intended to in future, there should be clear PRs in Bugzilla or todo list that include those issues.

* Say I wish to build a program using LTO.
What options should I use when compiling objects for that program? LTO options? Normal CPU selection and optimization options? What options should I use when linking? How do I choose between the several different LTO options listed? What combinations of them at different stages are valid? Are there any issues with linking with non-LTO libraries?

* Similarly, building a shared library using LTO.

* What happens if some objects are built with LTO information and some without (including those without being built with non-GCC compilers)?

* What if objects with LTO information (or a mixture of those with and without) are put in a .a archive - either linked as a library, or linked with --whole-archive?

* What happens if I (or my program's build system) does a partial link with gcc -r? Will this work OK with objects with LTO information? A mixture of objects with and without LTO information? What about if it does a partial link with ld -r, bypassing the compiler driver - will the resulting object still work with LTO optimizations?

* Will an object with LTO information still contain normal object code? Normal object code fully optimized with the single-file optimizations specified when compiling? This is needed if the object is to be usable with a non-LTO or non-GCC compiler.

* How portable are objects with LTO information? What happens when linking together objects built with different versions of GCC? Objects built with the same major version but different minor versions/patches (e.g. built on different GNU/Linux distributions)? Will incompatibilities be reliably detected? Will objects built for the same target on different hosts, including different endiannesses of host and some hosts being 32-bit, some being 64-bit, interoperate properly? What about if objects - that would be compatible but for LTO - are built with differently configured compilers?
For example, I can take an object built with an i686-pc-linux-gnu compiler, no special options, and one built with an x86_64-unknown-linux-gnu compiler, -m32, and link them together using either compiler. Will this work with LTO as long as the options passed to the compiler when linking do select 32-bit mode? All these things are relevant to how feasible it will be to distribute libraries that include LTO information.

* Are there particular things about objects in the program that will inhibit LTO optimizations, either globally or for a particular object (as if it did not have LTO information)? Toplevel asms? -fno-toplevel-reorder? Particular combinations of options or declarations in different objects?

* When can I use different options for different objects being optimized together? Can I build just one object with -frounding-math and have that work? Different objects with -fwrapv and -ftrapv? Suppose my program has multiple versions of a function built for different CPUs and a dispatching function - or use of STT_GNU_IFUNC - to select one at runtime based on the CPU in use. Will compiling different files with different CPU options, or using the "target" attribute, work properly with LTO, at least as long as the CPU-specific functions are marked noinline so the compiler can know not to move CPU-specific functionality into the caller before the CPU checks?

I would expect some changes to be needed to passes.texi as well to discuss LTO. Certainly, it would be good to have some sort of overview of the workings of LTO and how data is arranged in object files that is checked into the GCC sources and branched along with GCC, whether in the internals manual or in comments in the sources (I haven't yet looked at the patches with the bulk of the LTO sources to see if there are suitable comments there); a wiki page that has moved on to describing new arrangements for 4.6 is of less use when fixing a bug on 4.5 branch.
But this is secondary to getting a proper description for users of how to use LTO.

--
Summary: LTO needs better internal and user documentation
Product: gcc
Version: lto
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
AssignedTo: dnovillo at gcc dot gnu dot org
ReportedBy: dnovillo at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41528