> On Thu, 14 Apr 2016, Ramana Radhakrishnan wrote:
> 
> > 
> > > 
> > > What happens in practice?  GCC doesn't put functions in random
> > > partitions.
> > > 
> > 
> > The data goes into a separate partition AFAIU - it means that all data 
> > accesses are as though they are extern references which means there's 
> > not necessarily any CSE'ing ability that's available with section 
> > anchors.
> 
> No, they are added to partitions referencing them.  Actually they
> end up in the first partition that references them.

Yeah, this can be easily tuned to distribute variables last and put them
into the partition referencing them most often.
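
To illustrate the heuristic, here is a minimal sketch, not GCC's actual
partitioner code. It assumes functions have already been assigned to
partitions and that references are available as (function, variable)
pairs; place_variables and both of its parameters are hypothetical names
made up for this example.

```python
from collections import Counter


def place_variables(func_partition, references):
    """Assign each variable to the partition that references it most often.

    func_partition: dict mapping function name -> partition index
                    (hypothetical input; functions are distributed first).
    references:     iterable of (function, variable) pairs, one per reference.
    """
    counts = {}
    for func, var in references:
        # Count references to each variable per partition.
        counts.setdefault(var, Counter())[func_partition[func]] += 1
    # Pick the partition with the most references; ties go to the lowest
    # partition index so the result is deterministic.
    return {var: min(c, key=lambda p: (-c[p], p)) for var, c in counts.items()}


if __name__ == "__main__":
    funcs = {"f": 0, "g": 1, "h": 1}
    refs = [("f", "x"), ("g", "x"), ("h", "x"), ("f", "y")]
    print(place_variables(funcs, refs))  # {'x': 1, 'y': 0}
```

Here x is referenced once from partition 0 and twice from partition 1, so
it lands in partition 1, whereas a first-referencer rule would have put it
in partition 0.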

Balanced partitioning tries to preserve source order (unless FDO tells
otherwise), and thus we could add anchors into the cost metrics it uses to
split partitions.
> 
> > >> If it's not desired by default could we gate it on an option ?
> > >> AFIAU, section anchors optimization is important for ARM and AArch64.
> > > 
> > > For code size or for performance?  I wonder why section anchors cannot
> > > be "implemented" using some special relocations and thus as linker
> > > optimization.
> > 
> > For performance (and probably code size too) as you can end up CSE'ing 
> > the anchor point. the difference in performance with -flto-partitions=1 
> > is visible on quite a few of the spec2k benchmarks. I don't remember 
> > which ones immediately but yeah it makes a difference.
> 
> Yeah, as said elsewhere for things like spec2k we should have been
> less aggressive with the goal to parallelize and build larger
> partitions for small programs.  Thus increase lto-min-partition
> up to a point where all benchmarks end up in a single partition
> (small benchmarks, so spec2k6 doesn't count at all).  After all
> our inlining limits only start with large-unit-insns as well,
> which is 10 times of lto-min-partition...

Well, I don't think large-unit-insns is particularly related to
lto-min-partition. But lto-min-partition was not really tuned at all
(I remember I dumped the value for some small spec2k benchmark and based
it on that).  We could do some experiments. But for larger programs we
want a more stable solution. Read/write globals are not that common,
but read-only globals are common in modern and large codebases, too.

One possibility is to assign anchors at WPA time, disabling any chance for
variables to be optimized out later.

Honza
> 
> Richard.
