On Thu, Jan 11, 2018 at 03:35:37PM +0000, Wilco Dijkstra wrote:
> Segher Boessenkool wrote:
>   
> > Of course I see that ldp is useful.  I don't think that this particular
> > way of forcing more pairs is a good idea.  Needs testing / benchmarking /
> > instrumentation, and we haven't seen any of that.
> 
> I wouldn't propose a patch if it caused slowdowns. In fact I am seeing
> speedups, particularly benchmarks which benefit from shrinkwrapping (eg.
> povray). Int is flat, and there is an overall gain on fp.
> 
> > Forcing pairs before separate shrink-wrapping reduces the effectiveness
> > of the latter by a lot.
> 
> That doesn't appear to be the case - or at least any reduction in
> effectiveness is more than mitigated by the lower number of memory
> accesses and I-cache misses. To do better still you'd need to compute an
> optimal set of pairs, and that is quite difficult in the current
> infrastructure. I could dynamically create pairs just in the backend
> but that won't be optimal either.

Right, I certainly believe forming more pairs before separate
shrink-wrapping (sws), as you do, improves the code -- but I think forming
the pairs only after sws will work even better.

But yes, that is more work to implement, and the benefit (if any) is
unknown.  I hoped I could convince you to try ;-)


Segher
