On 12/08/10 01:40, Frederic Riss wrote:
On 8 December 2010 00:12, Jeff Law <l...@redhat.com> wrote:
On 12/07/10 12:29, Frédéric RISS wrote:
On Tuesday 07 December 2010 at 06:18 -0700, Jeff Law wrote:
On 12/06/10 15:07, Ian Lance Taylor wrote:
Given that the two loads don't have a def-use data dependency, combine won't
ever get the opportunity to do anything with them. In general there is
no pass which combines insns without a true data dependency, and targets
which have such insns have had to handle those combinations in machine
dependent reorg. In fact, it was the combination of independent insns
which led to the introduction of the machine dependent reorg pass eons
ago.
The issue with this approach is that reorg runs very late. I suppose
that if one wants to combine 2 SI loads into a DI load, it needs to be
done before IRA to satisfy the generated register constraints.
Constraints aren't checked until after register allocation is complete --
they're going to be of no help in performing this optimization. Right now
the machine dependent reorg pass or a peephole are the only places this
optimization can be performed. However, I believe it would be possible to
make the scheduler perform this optimization with some work.
Sorry, I think I wasn't clear. I didn't mean constraints in terms of
RTL template constraints, but 'constraints' coming from the new DI
destination of the load. More specifically: 2 SI loads can target
totally independent registers whereas a standard DI load must target a
contiguous SI register pair. If you don't do that before IRA, it will
most likely be impossible to do cleanly, won't it?
I tend to look at it the other way -- prior to allocation & reload
you're going to have two SImode pseudos and there's no way to guarantee
they'll end up in consecutive hard registers. You'd have to create a
new DImode pseudo as the destination of the memory load, then copy from
the DImode pseudo into the two SImode pseudos and rely on the register
allocator to allocate the DImode pseudo to the same hard registers as
the two SImode pseudos. There's no guarantee that'll happen (it often
will, but in the cases where it doesn't you end up with useless copies).
With that in mind, I think the right place for this optimization is
*after* register allocation and reload, where we know the precise set
of hard registers used and can therefore determine whether two SImode
loads target a pair of consecutive registers, which makes them
candidates for merging into a single DImode load. The difficulty here
is the data dependency analysis, hence my suggestion that the
scheduler's dependency analysis be used to drive this optimization.
jeff
Fred
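As a rough illustration of the post-reload check described above, here is a minimal C sketch. It is not GCC code: `struct insn`, `blocks_merge` and `can_merge_si_loads` are hypothetical names over a toy instruction model. It assumes a target where a DImode load needs an even/odd pair of consecutive hard registers with the lower-numbered register receiving the lower address, and it assumes the merged load is placed at the position of the first SImode load. Memory alignment is not modeled, and the dependency test is a crude stand-in for what the scheduler's dependency analysis would provide.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified post-reload instruction; not a GCC data structure. */
enum insn_kind { INSN_LOAD_SI, INSN_STORE, INSN_OTHER };

struct insn {
    enum insn_kind kind;
    int dest_reg;  /* hard register written, or -1 if none */
    int src_reg;   /* hard register read (address base for loads), or -1 */
    long offset;   /* byte offset from src_reg, used by loads only */
};

/* Does `mid`, sitting between the two loads, prevent moving the second
   load up to the position of the first one? */
static bool blocks_merge(const struct insn *mid, int base_reg, int moved_dest)
{
    if (mid->kind == INSN_STORE)
        return true;                      /* conservative: store may alias */
    if (mid->dest_reg == base_reg)
        return true;                      /* address base changes in between */
    /* The moved load's destination must be neither read nor written. */
    return mid->dest_reg == moved_dest || mid->src_reg == moved_dest;
}

/* Can the SImode loads at seq[i] and seq[j] (i < j) become one DImode load? */
static bool can_merge_si_loads(const struct insn *seq, size_t i, size_t j)
{
    const struct insn *first = &seq[i], *second = &seq[j];

    if (i >= j)
        return false;
    if (first->kind != INSN_LOAD_SI || second->kind != INSN_LOAD_SI)
        return false;
    if (first->src_reg != second->src_reg)
        return false;                     /* must share the address base */
    if (first->dest_reg == first->src_reg)
        return false;                     /* first load clobbers the base */

    /* Order the two loads by address. */
    const struct insn *lo = first->offset < second->offset ? first : second;
    const struct insn *hi = lo == first ? second : first;

    if (hi->offset != lo->offset + 4)
        return false;                     /* words must be adjacent in memory */
    if (lo->dest_reg % 2 != 0 || hi->dest_reg != lo->dest_reg + 1)
        return false;                     /* assumed even/odd register pair rule */

    for (size_t k = i + 1; k < j; k++)
        if (blocks_merge(&seq[k], first->src_reg, second->dest_reg))
            return false;
    return true;
}
```

With this model, two loads from `[r10+0]` into `r2` and `[r10+4]` into `r3` are candidates even with an unrelated insn between them, while the same loads into `r2` and `r5` are rejected because the hard registers are not a consecutive pair. That is the information which is only available once allocation and reload have finished.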