On 03/09/15 05:40, Richard Biener wrote:
On Sun, Mar 8, 2015 at 8:49 PM, Jeff Law <l...@redhat.com> wrote:
On 03/08/15 12:13, Richard Biener wrote:
I see. This basically creates two loop latches and thus will make our
loop detection code turn the loop into a fake loop nest. Not sure if that
is a good idea.
I'd have to sit down and ponder this for a while -- what would be the
register pressure implications of duplicating the contents of the join block
into its predecessors, leaving an empty joiner so that we still have a
single latch?
Good question. Now another question is why we don't choose this
way to disambiguate loops with multiple latches? (create a forwarder
as new single latch block)
Dunno. The RTL jump optimizer ought to eliminate the unnecessary
jumping late in the optimization pipeline and creating the forwarder
allows us to put the loop into a "more canonical" form for the loop
optimizers. Seems like it'd be worth some experimentation.
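
For concreteness, the forwarder idea discussed above can be sketched on a toy CFG. This is only an illustration under stated assumptions, not GCC code: the dict-of-successors representation, the block names, and the helpers find_latches and add_forwarder_latch are all hypothetical. Every back edge into the header is redirected to one new, empty block, which then becomes the only latch.

```python
# Toy CFG: block name -> list of successor block names (hypothetical, not GCC's CFG).
CFG = {
    "entry": ["header"],
    "header": ["body", "exit"],
    "body": ["then", "else"],
    "then": ["header"],   # first latch (back edge to header)
    "else": ["header"],   # second latch (back edge to header)
    "exit": [],
}
LOOP = {"header", "body", "then", "else"}


def find_latches(cfg, header, loop_blocks):
    """Return the loop blocks that have an edge back to the header."""
    return [b for b in sorted(loop_blocks) if header in cfg[b]]


def add_forwarder_latch(cfg, header, loop_blocks, name="latch_fwd"):
    """Return a copy of cfg in which all latches jump to one forwarder block.

    The forwarder is empty and only jumps to the header, so the loop ends up
    with a single latch. The input cfg is left unmodified.
    """
    latches = find_latches(cfg, header, loop_blocks)
    if len(latches) <= 1:
        return {b: list(s) for b, s in cfg.items()}  # already a single latch
    new_cfg = {b: list(s) for b, s in cfg.items()}
    for latch in latches:
        # Redirect only the back edge; other successors are kept as they are.
        new_cfg[latch] = [name if s == header else s for s in new_cfg[latch]]
    new_cfg[name] = [header]
    return new_cfg


canonical = add_forwarder_latch(CFG, "header", LOOP)
```

After the transformation, "then" and "else" both branch to the forwarder, and the forwarder is the only block with an edge back to the header. The extra jump is the cost that a late jump optimization would be expected to clean up.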
I'm having trouble seeing how Ajit's proposal helps reduce register
pressure in any significant way except perhaps by exposing partially
dead code. And if that's the primary benefit, we may be better off
actually building a proper framework for PDCE.
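
As a reminder of what "partially dead" means here, a minimal sketch follows. It is a hand-written toy, not output from any GCC pass, and the names expensive, before, and after are hypothetical. The value x is computed on every path but used on only one, so sinking the computation into the path that uses it removes the work (and the live range of x) from the other path.

```python
calls = {"n": 0}  # counts how often the "expensive" computation runs


def expensive(a, b):
    calls["n"] += 1
    return a * b + 1


def before(a, b, cond):
    x = expensive(a, b)   # partially dead: x is unused when cond is false
    if cond:
        return x
    return 0


def after(a, b, cond):
    if cond:
        x = expensive(a, b)   # sunk into the only path that uses x
        return x
    return 0
```

Both versions return the same values; they differ only in whether the computation is executed on the path where its result is never read.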
I've often pondered if PDCE is really worth it. There are some good PLDI
papers from many years ago. One is closely related to our RTL GCSE
implementation (Knoop). But RTL seems the wrong place to be doing this.
Click's GCM/GVN can achieve similar results by the nature of its code motion
algorithm, IIRC, but as much as I like Click's algorithm, I wasn't ever
able to get it to do anything significantly better than existing bits in
GCC.
Jeff