On Wed, Mar 02, 2022 at 06:47:39PM -0600, Segher Boessenkool wrote: > On Wed, Mar 02, 2022 at 03:54:29PM -0500, Michael Meissner wrote: > > Optimize signed DImode -> TImode on power10. > > > On power10, GCC tries to optimize the signed conversion from DImode to > > TImode by using the vextsd2q instruction. However to generate this > > instruction, it would have to generate 3 direct moves (1 from the GPR > > registers to the altivec registers, and 2 from the altivec registers to > > the GPR register). > > > > This patch generates the shift right immediate instruction to do the > > conversion if the target/source registers ares GPR registers like it does > > on earlier systems. If the target/source registers are Altivec registers, > > it will generate the vextsd2q instruction. > > > PR target/104698 > > * config/rs6000/vsx.md (mtvsrdd_diti_w1): Delete. > > (extendditi2): Convert from define_expand to > > define_insn_and_split. Replace with code to deal with both GPR > > registers and with altivec registers. > > > > gcc/testsuite/ > > PR target/104698 > > * gcc.target/powerpc/pr104698-1.c: New test. > > * gcc.target/powerpc/pr104698-2.c: New test. > > > +;; Sign extend DI to TI. We provide both GPR targets and Altivec targets > > on > > +;; power10. On earlier systems, the machine independent code will > > generate a > > +;; shift left to sign extend the 64-bit value to 128-bit. > > +;; > > +;; If the register allocator prefers to use GPR registers, we will use a > > shift > > +;; left instruction to sign extend the 64-bit value to 128-bit. > > +;; > > +;; If the register allocator prefers to use Altivec registers on power10, > > +;; generate the vextsd2q instruction. > > +(define_insn_and_split "extendditi2" > > + [(set (match_operand:TI 0 "register_operand" "=r,r,v,v,v") > > + (sign_extend:TI (match_operand:DI 1 "input_operand" "r,m,r,wa,Z"))) > > + (clobber (reg:DI CA_REGNO))] > > + "TARGET_POWERPC64 && TARGET_POWER10" > > What happens with -m32 -m{no,}-powerpc64?
The __int128_t and __uint128_t types are not defined in 32-bit. So you would never get a DImode to TImode conversion. > > + "#" > > + "&& reload_completed" > > + [(pc)] > > +{ > > + rtx dest = operands[0]; > > + rtx src = operands[1]; > > + int dest_regno = reg_or_subregno (dest); > > + > > + /* Handle conversion to GPR registers. Load up the low part and then do > > + a sign extension to the upper part. */ > > + if (INT_REGNO_P (dest_regno)) > > + { > > + rtx dest_hi = gen_highpart (DImode, dest); > > + rtx dest_lo = gen_lowpart (DImode, dest); > > + > > + emit_move_insn (dest_lo, src); > > + emit_insn (gen_ashrdi3 (dest_hi, dest_lo, GEN_INT (63))); > > Please use src instead of dest_lo. This always works, because you did > the low-part move first. Ok. > > + DONE; > > + } > > + > > + /* For conversion to an Altivec register, generate either a splat > > operation > > + or a load rightmost double word instruction. Both instructions gets > > the > > + DImode value into the lower 64 bits, and then do the vextsd2q > > + instruction. */ > > + > > (trailing whitespace) Ok. > > + else if (ALTIVEC_REGNO_P (dest_regno)) > > + { > > + if (MEM_P (src)) > > + emit_insn (gen_vsx_lxvrdx (dest, src)); > > + else > > + { > > + rtx dest_v2di = gen_rtx_REG (V2DImode, dest_regno); > > + emit_insn (gen_vsx_splat_v2di (dest_v2di, src)); > > + } > > + > > + emit_insn (gen_extendditi2_vector (dest, dest)); > > + DONE; > > + } > > This patch needs testing on BE (and 32-bit as well of course). Will do, but for 32-bit, it will be a NOP. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com