https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87507

--- Comment #8 from Peter Bergner <bergner at gcc dot gnu.org> ---
So Vlad is hesitant (probably rightly :) on accepting my patch.  Looking
closer, on BE, lower subreg2 is able to break the TImode accesses into 2 DImode
accesses which helps tremendously.  On LE (power8), split1 runs just before
lower subreg2 and inserts swaps on the memory accesses, which confuses lower
subreg, so we keep the TImode accesses and we get register pairs which are hard
to allocate and lead to poor decisions in this particular case.
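
To illustrate the idea (not the actual RTL), here is a toy C model. It assumes
the inserted swap exchanges the two 64-bit halves of the 128-bit value; all
names (ti_t, load_whole, load_swapped, swap_dw, load_halves) are hypothetical
and do not come from GCC. It only shows that a swapped wide load followed by a
swap yields the same value as two independent 64-bit loads, which is the
equivalence a decomposition would have to see through.

```c
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a 128-bit (TImode-like) value: two 64-bit halves. */
typedef struct { uint64_t dw[2]; } ti_t;

/* One wide access: copies all 16 bytes at once. */
static ti_t load_whole(const void *p)
{
    ti_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Exchange the two 64-bit halves (models the inserted swap). */
static ti_t swap_dw(ti_t v)
{
    uint64_t t = v.dw[0];
    v.dw[0] = v.dw[1];
    v.dw[1] = t;
    return v;
}

/* Assumed model of a wide load that delivers its halves swapped. */
static ti_t load_swapped(const void *p)
{
    return swap_dw(load_whole(p));
}

/* Decomposed form: two independent 64-bit (DImode-like) loads. */
static ti_t load_halves(const void *p)
{
    ti_t v;
    memcpy(&v.dw[0], p, 8);
    memcpy(&v.dw[1], (const unsigned char *)p + 8, 8);
    return v;
}

/* Compare two toy 128-bit values half by half. */
static int same(ti_t a, ti_t b)
{
    return a.dw[0] == b.dw[0] && a.dw[1] == b.dw[1];
}
```

In this model, swap_dw(load_swapped(p)) and load_halves(p) always agree, so the
swapped wide access could in principle still be split into two narrow ones.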

As a hack, I moved lower subreg2 before split1 and we get the code we want.  I
don't think we want to do that for real, so I will look at enhancing lower subreg
to recognize our TImode memory ops with swaps to see whether we can still
decompose them.
