[Bug lto/41528] New: LTO needs better internal and user documentation

dnovillo at gcc dot gnu dot org Wed, 30 Sep 2009 13:28:19 -0700

Much of the documentation from design documents, papers and presentations needs
to be ported to the internals manual.


Also, from http://gcc.gnu.org/ml/gcc-patches/2009-09/msg02134.html

(A) Details in the patch.

(B) Details missing from the patch.

(C) Lack of general explanation of how to use LTO and what should or 
should not work.

Comments (A):

> +Enable link-time optimization (LTO).  This is enabled by default if a

You mean, build that support into the compiler (not enable it by default 
once built in).

> +working libelf implemetnation is found (see @option{--with-libelf}).

"implementation"

> diff -rdupN --exclude=.svn --exclude=.git --exclude='*.diff*' 
> --exclude='autom4te*' --exclude=tags --exclude=ChangeLog.lto 
> --exclude=configure 
> /usr/local/google/homedirs/dnovillo/gcc/trunk/gcc/doc/invoke.texi 
> /usr/local/google/homedirs/dnovillo/gcc/trunk.lto/gcc/doc/invoke.texi
> --- /usr/local/google/homedirs/dnovillo/gcc/trunk/gcc/doc/invoke.texi 
> 2009-09-25 15:23:18.000000000 -0400
> +++ /usr/local/google/homedirs/dnovillo/gcc/trunk.lto/gcc/doc/invoke.texi     
> 2009-09-25 11:21:16.000000000 -0400
> @@ -349,7 +349,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fno-ira-share-spill-slots -fira-verbo...@var{n} @gol
>  -fivopts -fkeep-inline-functions -fkeep-static-consts @gol
>  -floop-block -floop-interchange -floop-strip-mine -fgraphite-identity @gol
> --floop-parallelize-all @gol
> +-floop-parallelize-all -fltrans -fltrans-output-list @gol
>  -fmerge-all-constants -fmerge-constants -fmodulo-sched @gol
>  -fmodulo-sched-allow-regmoves -fmove-loop-invariants -fmudflap @gol
>  -fmudflapir -fmudflapth -fno-branch-count-reg -fno-default-inline @gol
> @@ -389,7 +389,7 @@ Objective-C and Objective-C++ Dialects}.
>  -funit-at-a-time -funroll-all-loops -funroll-loops @gol
>  -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
>  -fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
> --fwhole-program @gol
> +-fwhole-program -fwpa @gol
>  --param @var{nam...@var{value}
>  -O  -O0  -O1  -O2  -O3  -Os}

The options -flto -fwhopr -flto-compression-level need adding to the 
summary list of options.

> +...@item -fwhopr
> +...@opindex fwhopr
> +This option is similar to @option{-flto} but it differs in how
> +the final link stage is executed.  Instead of loading all the
> +function bodies in memory, the callgraph is analyzed and
> +optimization decisions are made (whole program analysis or WPA).
> +Once optimization decisions are made, the callgraph is
> +partitioned and the different sections are compiled separately
> +(local transformations or LTRANS).  This process allows

"LTRANS)@." for correct spacing.

> +Disabled by default.  This option is only supported by the LTO frontend.

"front end"

> +Disabled by default.  This option is only supported by the LTO frontend.

"front end"

> +...@item -flto-report
> +This option is only useful when processing object files in LTO
> +mode (via -fwhopr or -flto).

@option{-fwhopr}, @option{-flto}.

Comments (B):

There are three new configure options, --with-libelf, --with-libelf-include 
and --with-libelf-lib, added by patch 2.  All of these need documenting 
alongside the other configure options; a cross-reference to --with-libelf 
from the list of prerequisites is not sufficient, the option itself needs 
documenting.  --enable-gold is effectively a new configure option to 
enable the plugin, and needs documenting as such.  (My understanding is 
that the plugin might in principle be usable with other linkers rather 
than being strongly tied to gold only, so that isn't necessarily the best 
spelling of the configure option.)

I would expect the new directories to be documented in sourcebuild.texi.  
I would expect information about LTO contributors to be added to 
contrib.texi.  I would think the LTO functionality needs one or more 
maintainers or reviewers appointed by the SC, if they haven't already been 
appointed, who should be added to MAINTAINERS.

There should be some explicit statement in install.texi that this 
functionality is only supported for ELF targets.

The documentation of -flto-report in invoke.texi is clearly inadequate.  
Documentation should say what the option does; "only useful when" 
information may be appropriate as part of the documentation, but not the 
whole as it is at present.

Comments (C):

The user documentation fails to address LTO from a user's perspective.  It 
describes implementation details, but does not explain how or when to use 
or not to use the functionality.  Let's consider the options documented:

* -flto, -fwhopr, -fwpa, -fltrans: the documentation at least says 
something about what the options do.  It says nothing about whether they 
are to be passed at compile time, link time, or both, and how passing 
different combinations of options at different times interacts.  What does 
"only supported by the LTO frontend" mean from the user's perspective?  
The very idea that LTO is a front end is the implementation perspective.  
The user compiles C, or C++, or Fortran, or some other language, using the 
respective front ends for those languages (maybe with special options, if 
they are compile-time options, but if an LTO front end is involved that's 
an implementation detail).  They then run the compiler (driver) to link 
the objects (maybe with special options, if they are link-time options) 
and the compiler and linker do something with the previously generated 
objects.  Or they run a single command to compile and link.  In any case, 
the concept of an LTO front end is irrelevant to the user.  If you mean 
that the option is only supported when linking, say that.  If you mean 
that the option is only used internally by the compiler and should not be 
passed directly by the user, say that.  In any case, make it clear when 
the user might wish to pass each option.

* -fltrans-output-list=@var{file}: why would the user want a file "to 
which the names of LTRANS output files are written"?  How would they use 
such a file after generating it?  What format is it?  Where does it go if 
this option is not specified and why would the user need to change this?  
If it's an implementation detail and the user doesn't generally need to 
care, explain that.  And what are "LTRANS output files"?  Where do they 
go?  Does the compiler clean them up or does the user need to do that?

* -flto-compression-level=@var{n}: at least a user can reasonably see 
there is a speed/space trade-off in compression.  But this should not be 
referenced to a particular host-side library that happens to be used by 
GCC right now.  The documentation should explain the semantics to the user 
directly: 1 for fastest, 9 for smallest, 0 for no compression.

* -flto-report: as noted above, the documentation of this option is 
completely semantics-free and so says nothing whatever of use to a GCC 
user.
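
To illustrate the speed/space trade-off mentioned for the compression level above, here is a minimal sketch. It uses Python's zlib module purely as a stand-in compressor whose levels happen to follow the same convention (0 for no compression, 1 for fastest, 9 for smallest); it says nothing about how GCC compresses LTO data. The sample input and the helper name compressed_size are made up for the example.

```python
import zlib

# Hypothetical sample input: repetitive text standing in for any
# compressible data. This is not real LTO data.
data = b"int foo (int x) { return x + 1; }\n" * 2000


def compressed_size(level):
    """Return the size in bytes of `data` compressed at the given level."""
    return len(zlib.compress(data, level))


if __name__ == "__main__":
    print("input size:", len(data))
    # Level 0 stores the data uncompressed, so the output is slightly
    # larger than the input; levels 1 and 9 trade time against size.
    for level in (0, 1, 9):
        print("level", level, "->", compressed_size(level), "bytes")
```

A user-facing description along these lines (what each level means, not which library provides it) is all the manual needs.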

So much for the individual options.  What about the story for users?  
Users should be able to read the user manual, and from it get a clear idea 
of how to use LTO and what cases will work, what will give an error (or 
sorry ()) as not being supported, what will ICE and what will quietly 
behave incorrectly.  Here are some examples of questions about LTO.  They 
certainly don't all need to have the answer "this will work perfectly" - 
though giving an error is always strongly to be preferred to an ICE or 
quiet wrong code - but it should be clear to users from the manual what 
will or will not work.  And if something won't work but is intended to in 
future, there should be clear PRs in Bugzilla or a todo list that include 
those issues.

* Say I wish to build a program using LTO.  What options should I use when 
compiling objects for that program?  LTO options?  Normal CPU selection 
and optimization options?  What options should I use when linking?  How do 
I choose between the several different LTO options listed?  What 
combinations of them at different stages are valid?  Are there any issues 
with linking with non-LTO libraries?

* Similarly, building a shared library using LTO.

* What happens if some objects are built with LTO information and some 
without (including objects without it that were built by non-GCC compilers)?

* What if objects with LTO information (or a mixture of those with and 
without) are put in a .a archive - either linked as a library, or linked 
with --whole-archive?

* What happens if I (or my program's build system) do a partial link 
with gcc -r?  Will this work OK with objects with LTO information?  A 
mixture of objects with and without LTO information?  What about if it 
does a partial link with ld -r, bypassing the compiler driver - will the 
resulting object still work with LTO optimizations?

* Will an object with LTO information still contain normal object code?  
Normal object code fully optimized with the single-file optimizations 
specified when compiling?  This is needed if the object is to be usable 
with a non-LTO or non-GCC compiler.

* How portable are objects with LTO information?  What happens when 
linking together objects built with different versions of GCC?  Objects 
built with the same major version but different minor versions/patches 
(e.g. built on different GNU/Linux distributions)?  Will incompatibilities 
be reliably detected?  Will objects built for the same target on different 
hosts, including different endiannesses of host and some hosts being 
32-bit, some being 64-bit, interoperate properly?  What about if objects - 
that would be compatible but for LTO - are built with differently 
configured compilers?  For example, I can take an object built with an 
i686-pc-linux-gnu compiler, no special options, and one built with an 
x86_64-unknown-linux-gnu compiler, -m32, and link them together using 
either compiler.  Will this work with LTO as long as the options passed to 
the compiler when linking do select 32-bit mode?  All these things are 
relevant to how feasible it will be to distribute libraries that include 
LTO information.

* Are there particular things about objects in the program that will 
inhibit LTO optimizations, either globally or for a particular object (as 
if it did not have LTO information)?  Toplevel asms?  
-fno-toplevel-reorder?  Particular combinations of options or declarations 
in different objects?

* When can I use different options for different objects being optimized 
together?  Can I build just one object with -frounding-math and have that 
work?  Different objects with -fwrapv and -ftrapv?  Suppose my program has 
multiple versions of a function built for different CPUs and a dispatching 
function - or use of STT_GNU_IFUNC - to select one at runtime based on the 
CPU in use.  Will compiling different files with different CPU options, or 
using the "target" attribute, work properly with LTO, at least as long as 
the CPU-specific functions are marked noinline so the compiler can know 
not to move CPU-specific functionality into the caller before the CPU 
checks?


I would expect some changes to be needed to passes.texi as well to discuss 
LTO.  Certainly, it would be good to have some sort of overview of the 
workings of LTO and how data is arranged in object files that is checked 
into the GCC sources and branched along with GCC, whether in the internals 
manual or in comments in the sources (I haven't yet looked at the patches 
with the bulk of the LTO sources to see if there are suitable comments 
there); a wiki page that has moved on to describing new arrangements for 
4.6 is of less use when fixing a bug on 4.5 branch.  But this is secondary 
to getting a proper description for users of how to use LTO.


-- 
           Summary: LTO needs better internal and user documentation
           Product: gcc
           Version: lto
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
        AssignedTo: dnovillo at gcc dot gnu dot org
        ReportedBy: dnovillo at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41528
