On 12/08/10 01:40, Frederic Riss wrote:
On 8 December 2010 00:12, Jeff Law <l...@redhat.com> wrote:
On 12/07/10 12:29, Frédéric RISS wrote:
On Tuesday, 7 December 2010 at 06:18 -0700, Jeff Law wrote:
On 12/06/10 15:07, Ian Lance Taylor wrote:
Given the two loads don't have a def-use data dependency, combine won't
ever get the opportunity to do anything with them.  In general there is
no pass which combines insns without a true data dependency, and targets
which have such insns have had to handle those combinations in machine
dependent reorg.  In fact, it was the combination of independent insns
which led to the introduction of the machine dependent reorg pass eons
ago.
The issue with this approach is that reorg runs very late. I suppose
that if one wants to combine 2 SI loads into a DI load, it needs to be
done before IRA to satisfy the generated register constraints.
Constraints aren't checked until after register allocation is complete --
they're going to be of no help in performing this optimization.  Right now
the machine dependent reorg pass or a peephole are the only places this
optimization can be performed.  However, I believe it would be possible to
make the scheduler perform this optimization with some work.
Sorry, I think I wasn't clear. I didn't mean constraints in terms of
RTL template constraints, but 'constraints' coming from the new DI
destination of the load. More specifically: 2 SI loads can target
totally independent registers whereas a standard DI load must target a
contiguous SI register pair. If you don't do that before IRA, it will
most likely be impossible to do cleanly, won't it?
I tend to look at it the other way -- prior to allocation & reload you're going to have two SImode pseudos and there's no way to guarantee they'll end up in consecutive hard registers. You'd have to create a new DImode pseudo as the destination of the memory load, then copy from the DImode pseudo into the two SImode pseudos and rely on the register allocator to allocate the DImode pseudo to the same hard registers as the two SImode pseudos. There's no guarantee that'll happen (it often will, but in the cases where it doesn't you end up with useless copies).

With that in mind, I tend to think the right way to handle this is a pass which runs *after* register allocation and reload, when we know the precise set of hard registers used. At that point we can determine whether two SImode loads target a pair of consecutive registers and are therefore candidates for merging into a single DImode load. The difficulty here is the data dependency analysis, hence my suggestion that the scheduler's dependency analysis be used to drive this optimization.
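
As a rough illustration of the post-reload check described above, here is a toy sketch in C. It is not GCC code and uses no GCC internals: `struct si_load`, `is_di_pair` and `can_merge_si_loads` are hypothetical names, and the even/odd register pairing, the word order (lower address goes to the lower-numbered register) and the 8-byte offset alignment are assumptions about an imaginary target. It also assumes the two loads are back to back with nothing in between, so it sidesteps the dependency analysis, which is the hard part.

```c
#include <stdbool.h>

/* Toy model of a 32-bit load after register allocation:
   dest_reg <- [base_reg + offset].  Register numbers are hard registers. */
struct si_load {
    int  dest_reg;   /* hard register written by the load */
    int  base_reg;   /* hard register holding the base address */
    long offset;     /* byte offset from the base address */
};

/* Assumed target rule (hypothetical): a 64-bit value must live in an
   even-numbered register and the register immediately after it. */
static bool is_di_pair(int lo_reg, int hi_reg)
{
    return lo_reg % 2 == 0 && hi_reg == lo_reg + 1;
}

/* Return true if two adjacent 32-bit loads could be replaced by one
   64-bit load under the assumptions stated above. */
bool can_merge_si_loads(const struct si_load *first,
                        const struct si_load *second)
{
    /* Both loads must address memory through the same base register. */
    if (first->base_reg != second->base_reg)
        return false;

    /* If the first load overwrites the base register, the second load
       sees a different address, so the pair is not a single access. */
    if (first->dest_reg == first->base_reg)
        return false;

    /* The two words must be adjacent in memory, first one lower. */
    if (second->offset != first->offset + 4)
        return false;

    /* Assumed alignment rule: the 64-bit access starts at an offset that
       is a multiple of 8 (the base itself is assumed to be aligned). */
    if (first->offset % 8 != 0)
        return false;

    /* Finally, the destinations must form a valid register pair. */
    return is_di_pair(first->dest_reg, second->dest_reg);
}
```

Under these assumptions, loads into r2 and r3 from offsets 0 and 4 of the same base would qualify, while loads into r2 and r5 would not. That second case is the one where merging before allocation would have cost an extra copy.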

jeff
Fred
