Hi Jeff,

On 2022/12/12 09:44, Jiufu Guo via Gcc-patches wrote:
> Hi,
> 
> Compared with the previous patch, this patch is updated according to the
> comments, fixes conflicts with trunk, and rechecks bootstrap&regtest.
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607333.html
> 
> For a complicate 64bit constant, blow is one instruction-sequence to
                                   ~~~~ below

> build:
>       lis 9,0x800a
>       ori 9,9,0xabcd
>       sldi 9,9,32
>       oris 9,9,0xc167
>       ori 9,9,0xfa16
> 
> while we can also use below sequence to build:
>       lis 9,0xc167
>       lis 10,0x800a
>       ori 9,9,0xfa16
>       ori 10,10,0xabcd
>       rldimi 9,10,32,0
> This sequence uses 2 registers to build the high and low parts first,
> and then merges them.
> 
> In terms of parallelism, this sequence would be faster. (Of course, it uses
> 1 more register, with potential register pressure.)
> 
> The instruction sequence with two registers for parallel version can be
> generated only if can_create_pseudo_p.  Otherwise, the one register
> sequence is generated.
> 
> Bootstrap and regtest pass on ppc64{,le}.
> Is this ok for trunk?
> 
> 
> BR,
> Jeff(Jiufu)
> 
> 
> gcc/ChangeLog:
> 
>       * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Generate
>       more parallel code if can_create_pseudo_p.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/powerpc/parall_5insn_const.c: New test.
> 
> ---
>  gcc/config/rs6000/rs6000.cc                   | 37 +++++++++++++------
>  .../gcc.target/powerpc/parall_5insn_const.c   | 27 ++++++++++++++
>  2 files changed, 52 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index b3a609f3aa3..3020d9780bc 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10322,19 +10322,32 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>      }
>    else
>      {
> -      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> -
> -      emit_move_insn (temp, GEN_INT (sext_hwi (ud4 << 16, 32)));
> -      if (ud3 != 0)
> -     emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud3)));
> +      if (can_create_pseudo_p ())
> +     {
> +       /* lis H,U4; ori H,U3; lis L,U2; ori L,U1; rldimi L,H,32,0.  */

Nit: It's probably better to update the capitals with the actual variable names
in upper case, since they are also short, ...

> +       rtx high = gen_reg_rtx (DImode);
> +       rtx low = gen_reg_rtx (DImode);
> +       HOST_WIDE_INT num = (ud2 << 16) | ud1;
> +       rs6000_emit_set_long_const (low, sext_hwi (num, 32));
> +       num = (ud4 << 16) | ud3;
> +       rs6000_emit_set_long_const (high, sext_hwi (num, 32));
> +       emit_insn (gen_rotldi3_insert_3 (dest, high, GEN_INT (32), low,
> +                                        GEN_INT (0xffffffff)));
> +     }
> +      else
> +     {
> +       /* lis A,U4; ori A,U3; rotl A,32; oris A,U2; ori A,U1.  */

... and here.  The rest looks good to me.  Thanks!

BR,
Kewen
