On 2016.09.26 at 09:42 +0200, Richard Biener wrote:
> On Sat, Sep 24, 2016 at 10:52 AM, Markus Trippelsdorf
> <mar...@trippelsdorf.de> wrote:
> > On 2016.09.23 at 15:29 +0200, Richard Biener wrote:
> >> >
> >> > So 50000 looks too big to me.
> >>
> >> I think the issue is that the default number of partitions is too high
> >> (32) which pessimizes 4-core machines if the units are too small.
> >
> > The more partitions are used the less memory is required at LTRANS time.
> >
> > If for example you limit partitions to 4 on a 4-core machine with 8GB
> > memory, you would start swapping when building Firefox.
> >
> > And even lto-partitions=8 is slower than the default of 32:
> >
> > (Firefox libxul build times with gcc-6.)
> >
> > --param=lto-partitions=8 -flto=4:
> > 1670.19s user 23.39s system 305% cpu 9:14.13 total
> >
> > default -flto=4:
> > 1668.94s user 32.51s system 320% cpu 8:50.36 total
> >
> > If someone wants fewer partitions he can use -flto-partition=one/none
> > or --param=lto-partitions=1.
>
> I know all this.  But then we seem to be stuck at 32 partitions from
> an input size of 32 * lto-partition-min up to 32 * lto-partition-max
> which is currently two orders of magnitude of difference in input size!
>
> That can't be a good heuristic.
>
> It's also about temporary disk space of which we use more the more
> partitions we use (because we essentially duplicate the whole global
> types/decls section for each partition).
>
> I'm not saying increasing lto-partition-min is the best solution but it
> certainly looks like the most appealing one to me.

I think the current lto-partition-min value of 10000 is reasonable, and
the proposed value of 50000 seems excessive.
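
To put rough numbers on this, here is a minimal Python sketch of the expected-size formula from the lto-partition.c comment quoted further down in this message, together with the partition count it implies. The function names are made up for this illustration. The count is only an estimate, because it ignores the greedy growth and undo steps. max_partition_size=1000000 is an assumption chosen to match the "two orders of magnitude" mentioned above, not a value taken from the source.

```python
def expected_partition_size(total_size, lto_partitions=32,
                            min_partition_size=10000):
    """max (total_size / lto_partitions, min_partition_size), as in the comment."""
    return max(total_size // lto_partitions, min_partition_size)


def estimated_partition_count(total_size, lto_partitions=32,
                              min_partition_size=10000,
                              max_partition_size=1000000):
    """Rough estimate: total size divided by the (capped) expected size.

    The cap at max_partition_size is an assumption of this sketch.
    """
    size = expected_partition_size(total_size, lto_partitions,
                                   min_partition_size)
    size = min(size, max_partition_size)
    # Ceiling division, with at least one partition.
    return max(1, -(-total_size // size))


if __name__ == "__main__":
    for total in (100000, 320000, 3200000, 32000000, 64000000):
        print(total,
              estimated_partition_count(total, min_partition_size=10000),
              estimated_partition_count(total, min_partition_size=50000))
```

Under these assumptions an input of size 320000 is split into 32 partitions with lto-partition-min=10000, but only about 7 with 50000, which would leave little slack on a machine with more than a few cores.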

Also see the comment in gcc/lto/lto-partition.c:

   We compute the expected size of a partition as:

     max (total_size / lto_partitions, min_partition_size)

   We use dynamic expected size of partition so small programs are partitioned
   into enough partitions to allow use of multiple CPUs, while large programs
   are not partitioned too much.  Creating too many partitions significantly
   increases the streaming overhead.
...
   The function implements a simple greedy algorithm.  Nodes are being added
   to the current partition until after 3/4 of the expected partition size is
   reached.  Past this threshold, we keep track of boundary size (number of
   edges going to other partitions) and continue adding functions until after
   the current partition has grown to twice the expected partition size,
   or is bigger than max_partition_size.
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ : this sentence should be added.

   Then the process is undone to the point where the minimal ratio of boundary size
   and in-partition calls was reached.  */
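
For readers who prefer code, the greedy step described in that comment can be sketched roughly as follows. This is a simplified Python illustration, not the code in lto-partition.c. It assumes nodes are visited in a fixed linear order, that `edges` is a plain list of call pairs, and that the name `greedy_partitions` is invented for the example. It includes the max_partition_size condition from the sentence I suggest adding.

```python
def greedy_partitions(sizes, edges, expected_size, max_partition_size):
    """Split nodes 0..n-1 (visited in order) into partitions.

    sizes: size of each node; edges: list of (u, v) call pairs.
    Hypothetical simplification of the algorithm described in the comment.
    """
    partitions = []
    start, n = 0, len(sizes)
    while start < n:
        current, total = set(), 0
        best_end, best_ratio = None, float("inf")
        end, stopped = start, False
        while end < n and not stopped:
            current.add(end)
            total += sizes[end]
            end += 1
            # Past 3/4 of the expected size, track boundary vs. internal calls.
            if total > 0.75 * expected_size:
                internal = sum(1 for u, v in edges
                               if u in current and v in current)
                boundary = sum(1 for u, v in edges
                               if (u in current) != (v in current))
                ratio = boundary / internal if internal else float("inf")
                if best_end is None or ratio <= best_ratio:
                    best_end, best_ratio = end, ratio
            # Stop at twice the expected size, or past max_partition_size.
            stopped = (total > 2 * expected_size
                       or total > max_partition_size)
        # Undo to the best point only if a limit was actually exceeded.
        cut = best_end if stopped and best_end is not None else end
        partitions.append(list(range(start, cut)))
        start = cut
    return partitions


if __name__ == "__main__":
    # Two clusters of four nodes each, joined by the single call (3, 4).
    edges = [(0, 1), (1, 2), (2, 3), (0, 3), (3, 4),
             (4, 5), (5, 6), (6, 7), (4, 7)]
    print(greedy_partitions([10] * 8, edges, 30, 1000))
```

In the toy input the first partition grows past the cluster boundary until it exceeds twice the expected size, and is then cut back to the point with the lowest boundary-to-internal ratio, which is the edge between the two clusters.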


--
Markus
