On Mon, Apr 6, 2015 at 2:19 PM, Bill Schmidt
<wschm...@linux.vnet.ibm.com> wrote:
> Hi,
>
> The prologue and epilogue code to save/restore Altivec registers uses
> the generic emit_move_insn logic.  This means that when VSX is available
> on a little-endian target, we will generate xxswapd/stxvd2x for prologue
> saves, and lxvd2x/xxswapd for epilogue restores.  This happens too late
> to be cleaned up by swap optimization.  Since the stack save slots are
> aligned, we should always use lvx and stvx for this purpose instead.
> This improves performance on LE targets, is performance-neutral for BE,
> and is always safe.
>
> This change causes the previously failing test
> gcc.target/powerpc/swaps-p8-2.c to succeed again.  This test started
> failing when an unrelated change raised register pressure on the vector
> registers, causing us to generate the above save/restore sequences.  The
> test was testing whether all xxswapd instructions had been removed.
> With this change, we no longer generate them for the save/restores, and
> the test succeeds.
>
> Because we no longer generate the two-instruction sequences, we no
> longer need the code in rs6000_frame_related to identify the true source
> register for the two-instruction store.  I've removed that along with
> the extra argument it required, and adjusted the callers.
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions, and one fixed failure.  Is this ok for trunk after 5.1
> branches?  I'd also like to backport this as far as 4.8 for the
> performance improvement.
>
> Thanks,
> Bill
>
>
> 2015-04-06  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
>
>         * config/rs6000/altivec.md (*altivec_lvx_<mode>_internal): Remove
>         asterisk from name so this can be generated directly.
>         (*altivec_stvx_<mode>_internal): Likewise.
>         * config/rs6000/rs6000.c (rs6000_frame_related): Remove split_reg
>         argument and logic that references it.
>         (emit_frame_save): Remove last parameter from call to
>         rs6000_frame_related.
>         (rs6000_emit_prologue): Remove last parameter from eight calls to
>         rs6000_frame_related.  Force generation of stvx instruction for
>         Altivec register saves.  Remove split_reg handling, which is no
>         longer needed.
>         (rs6000_emit_epilogue): Force generation of lvx instruction for
>         Altivec register restores.

This is okay after GCC 5 branches.  But please insert a space between
(void) and emit_insn when casting the result.
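
For reference, the requested spacing looks like the sketch below. It is only an illustration of the formatting: the `rtx` type, `emit_insn`, `insns_emitted` and `restore_one` here are hypothetical stand-ins so the fragment compiles on its own, not GCC's real definitions or the code from the patch.

```c
/* Hypothetical stand-ins for GCC internals (assumptions, for illustration
   only); the real rtx type and emit_insn are defined inside GCC.  */
typedef struct rtx_def { int code; } *rtx;

/* Counts how many patterns the stand-in emit_insn has been handed.  */
static int insns_emitted;

static rtx
emit_insn (rtx pattern)
{
  insns_emitted++;
  return pattern;
}

/* Hypothetical helper standing in for one iteration of the restore loop.  */
void
restore_one (rtx pattern)
{
  /* Note the space between the (void) cast and the call.  */
  (void) emit_insn (pattern);
}
```

The cast itself is unchanged; only the whitespace after it differs from the posted hunks.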

> @@ -24817,7 +24805,10 @@ rs6000_emit_epilogue (int sibcall)
>                 mem = gen_frame_mem (V4SImode, addr);
>
>                 reg = gen_rtx_REG (V4SImode, i);
> -               emit_move_insn (reg, mem);
> +               /* Rather than emitting a generic move, force use of the
> +                  lvx instruction, which we always want.  In particular
> +                  we don't want lxvd2x/xxpermdi for little endian.  */
> +               (void)emit_insn (gen_altivec_lvx_v4si_internal (reg, mem));
>               }
>         }
>
> @@ -25020,7 +25011,10 @@ rs6000_emit_epilogue (int sibcall)
>                 mem = gen_frame_mem (V4SImode, addr);
>
>                 reg = gen_rtx_REG (V4SImode, i);
> -               emit_move_insn (reg, mem);
> +               /* Rather than emitting a generic move, force use of the
> +                  lvx instruction, which we always want.  In particular
> +                  we don't want lxvd2x/xxpermdi for little endian.  */
> +               (void)emit_insn (gen_altivec_lvx_v4si_internal (reg, mem));
>               }
>         }

Thanks, David
