Hi Jeff,

on 2022/12/12 09:44, Jiufu Guo via Gcc-patches wrote:
> Hi,
>
> Compare with previous patch, this patch updates accoding to comments; fixes
> conflicts with trunk, and recheck bootstrap&regtest.
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607333.html
>
> For a complicate 64bit constant, blow is one instruction-sequence to
                                   ~~~~ below
> build:
>       lis 9,0x800a
>       ori 9,9,0xabcd
>       sldi 9,9,32
>       oris 9,9,0xc167
>       ori 9,9,0xfa16
>
> while we can also use below sequence to build:
>       lis 9,0xc167
>       lis 10,0x800a
>       ori 9,9,0xfa16
>       ori 10,10,0xabcd
>       rldimi 9,10,32,0
> This sequence is using 2 registers to build high and low part firstly,
> and then merge them.
>
> In parallel aspect, this sequence would be faster. (Ofcause, using 1 more
> register with potential register pressure).
>
> The instruction sequence with two registers for parallel version can be
> generated only if can_create_pseudo_p.  Otherwise, the one register
> sequence is generated.
>
> Bootstrap and regtest pass on ppc64{,le}.
> Is this ok for trunk?
>
>
> BR,
> Jeff(Jiufu)
>
>
> gcc/ChangeLog:
>
>       * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Generate
>       more parallel code if can_create_pseudo_p.
>
> gcc/testsuite/ChangeLog:
>
>       * gcc.target/powerpc/parall_5insn_const.c: New test.
>
> ---
>  gcc/config/rs6000/rs6000.cc                   | 37 +++++++++++++------
>  .../gcc.target/powerpc/parall_5insn_const.c   | 27 ++++++++++++++
>  2 files changed, 52 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index b3a609f3aa3..3020d9780bc 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10322,19 +10322,32 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT
> c)
>      }
>    else
>      {
> -      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> -
> -      emit_move_insn (temp, GEN_INT (sext_hwi (ud4 << 16, 32)));
> -      if (ud3 != 0)
> -        emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud3)));
> +      if (can_create_pseudo_p ())
> +        {
> +          /* lis H,U4; ori H,U3; lis L,U2; ori L,U1; rldimi L,H,32,0.  */

Nit: It's probably better to update the capitals with the actual variable
names in upper case, since they are also short, ...
> +          rtx high = gen_reg_rtx (DImode);
> +          rtx low = gen_reg_rtx (DImode);
> +          HOST_WIDE_INT num = (ud2 << 16) | ud1;
> +          rs6000_emit_set_long_const (low, sext_hwi (num, 32));
> +          num = (ud4 << 16) | ud3;
> +          rs6000_emit_set_long_const (high, sext_hwi (num, 32));
> +          emit_insn (gen_rotldi3_insert_3 (dest, high, GEN_INT (32), low,
> +                                           GEN_INT (0xffffffff)));
> +        }
> +      else
> +        {
> +          /* lis A,U4; ori A,U3; rotl A,32; oris A,U2; ori A,U1.  */

... and here. The others look good to me.

Thanks!

BR,
Kewen
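
For reference, here is a small host-side C sketch (not part of the patch)
that models the two sequences from the description and checks that they
build the same constant.  The helper names (lis, ori, oris, sldi,
rldimi_mb0, build_one_reg, build_two_regs) are made up for illustration
and only model the general five-instruction case; the special cases that
rs6000_emit_set_long_const handles elsewhere are not covered.

```c
#include <assert.h>
#include <stdint.h>

/* lis rD,imm: rD = imm << 16, sign-extended from 32 to 64 bits.  */
static uint64_t lis(uint16_t imm)
{
    uint64_t v = (uint64_t)imm << 16;
    if (v & UINT64_C(0x80000000))
        v |= UINT64_C(0xffffffff00000000);
    return v;
}

/* ori rD,rS,imm: OR the immediate into bits 0..15.  */
static uint64_t ori(uint64_t r, uint16_t imm) { return r | imm; }

/* oris rD,rS,imm: OR the immediate into bits 16..31.  */
static uint64_t oris(uint64_t r, uint16_t imm)
{
    return r | ((uint64_t)imm << 16);
}

/* sldi rD,rS,sh: shift left by sh (0 < sh < 64 assumed).  */
static uint64_t sldi(uint64_t r, unsigned sh) { return r << sh; }

/* rldimi rA,rS,sh,0: rotate rS left by sh and insert the result into rA
   under a mask covering the top 64-sh bits (0 < sh < 64 assumed).  */
static uint64_t rldimi_mb0(uint64_t ra, uint64_t rs, unsigned sh)
{
    uint64_t rot = (rs << sh) | (rs >> (64 - sh));
    uint64_t mask = ~UINT64_C(0) << sh;
    return (rot & mask) | (ra & ~mask);
}

/* One register: lis A,UD4; ori A,UD3; sldi A,32; oris A,UD2; ori A,UD1.  */
uint64_t build_one_reg(uint64_t c)
{
    uint16_t ud1 = c & 0xffff, ud2 = (c >> 16) & 0xffff;
    uint16_t ud3 = (c >> 32) & 0xffff, ud4 = (c >> 48) & 0xffff;
    uint64_t a = lis(ud4);
    a = ori(a, ud3);
    a = sldi(a, 32);
    a = oris(a, ud2);
    return ori(a, ud1);
}

/* Two registers: the high and low halves are built independently and
   then merged with rldimi, so the two lis/ori pairs have no dependency
   on each other.  */
uint64_t build_two_regs(uint64_t c)
{
    uint16_t ud1 = c & 0xffff, ud2 = (c >> 16) & 0xffff;
    uint16_t ud3 = (c >> 32) & 0xffff, ud4 = (c >> 48) & 0xffff;
    uint64_t low = lis(ud2);
    uint64_t high = lis(ud4);
    low = ori(low, ud1);
    high = ori(high, ud3);
    return rldimi_mb0(low, high, 32);
}

/* Check both models against the constant used in the description.  */
void check_sequences_agree(void)
{
    const uint64_t c = UINT64_C(0x800aabcdc167fa16);
    assert(build_one_reg(c) == c);
    assert(build_two_regs(c) == c);
}
```

The sign extension done by lis on the low half is harmless in the
two-register version, because rldimi overwrites the upper 32 bits with
the high half.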