Hi Haochen,

on 2023/2/20 10:04, HAO CHEN GUI wrote:
> Hi,
>   This patch merges two "vsldoi" insns when their sources are the
> same. Particularly, it is simplified to be one move if the total
> shift is multiples of 16 bytes.
>
>   Bootstrapped and tested on powerpc64-linux BE and LE with no
> regressions.
>
> Thanks
> Gui Haochen
>
>
> ChangeLog
> 2023-02-20  Haochen Gui <guih...@linux.ibm.com>
>
> gcc/
>     * config/rs6000/altivec.md (*altivec_vsldoi_dup_<mode>): New
>     insn_and_split to merge two vsldoi.
>
> gcc/testsuite/
>     * gcc.target/powerpc/vsldoi_merge.c: New.
>
>
> patch.diff
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 84660073f32..22e9c4c1fc5 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -2529,6 +2529,35 @@ (define_insn "altivec_vsldoi_<mode>"
>    "vsldoi %0,%1,%2,%3"
>    [(set_attr "type" "vecperm")])
>
> +(define_insn_and_split "*altivec_vsldoi_dup_<mode>"
> +  [(set (match_operand:VM 0 "register_operand" "=v")
> +        (unspec:VM [(unspec:VM [(match_operand:VM 1 "register_operand" "v")
> +                                (match_operand:VM 2 "register_operand" "v")
> +                                (match_operand:QI 3 "immediate_operand" "i")]
> +                               UNSPEC_VSLDOI)
> +                    (unspec:VM [(match_dup 1)
> +                                (match_dup 2)
> +                                (match_dup 3)]
> +                               UNSPEC_VSLDOI)
> +                    (match_operand:QI 4 "immediate_operand" "i")]
> +                   UNSPEC_VSLDOI))]
> +  "TARGET_ALTIVEC"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
> +{
> +  unsigned int shift1 = UINTVAL (operands[3]);
> +  unsigned int shift2 = UINTVAL (operands[4]);
> +
> +  unsigned int shift = (shift1 + shift2) % 16;
> +  if (shift)
> +    emit_insn (gen_altivec_vsldoi_<mode> (operands[0], operands[1],
> +                                          operands[1], GEN_INT (shift)));
> +  else
> +    emit_move_insn (operands[0], operands[1]);
> +  DONE;
> +})

This patch looks wrong to me. I think we need to ensure that operand 1
and operand 2 are the same (match_dup 1 for operand 2)?

One simple counterexample for the proposed pattern: given two vectors
a = {A0, A1} and b = {B0, B1} (A0, A1, B0 and B1 are all doublewords),
on BE:

  a = vec_sld (a, b, 8); // (1) res a = {A1, B0}
  a = vec_sld (a, a, 8); // (2) res a = {B0, A1}

With this patch the two shifts sum to 16, so the split emits a plain
move of operand 1 and we would get the unexpected result a = {A0, A1}.

Since this patch passed bootstrap and regression testing, I think we
don't have enough coverage for this part, so it would be good to add a
dg-do run test case as well. :)

BR,
Kewen
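
P.S. To make the counterexample above easy to check without PowerPC
hardware, here is a rough sketch in plain C. It is only a byte-level
model of vsldoi in BE element order, not the real intrinsic and not the
dg-do run test itself. All function names (model_sld, two_step, merged,
results_match) are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical byte-level model of vsldoi in BE element order:
   concatenate a and b, then take 16 bytes starting at offset sh.
   out may alias a or b because the data is staged in cat first.  */
void
model_sld (unsigned char out[16], const unsigned char a[16],
           const unsigned char b[16], unsigned int sh)
{
  unsigned char cat[32];
  assert (sh < 16); /* vsldoi takes a 4-bit immediate.  */
  memcpy (cat, a, 16);
  memcpy (cat + 16, b, 16);
  memcpy (out, cat + sh, 16);
}

/* Reference result: t = sld (a, b, sh1); out = sld (t, t, sh2).  */
void
two_step (unsigned char out[16], const unsigned char a[16],
          const unsigned char b[16], unsigned int sh1, unsigned int sh2)
{
  unsigned char t[16];
  model_sld (t, a, b, sh1);
  model_sld (out, t, t, sh2);
}

/* What the proposed split would emit: operand 2 (b) is ignored.  */
void
merged (unsigned char out[16], const unsigned char a[16],
        const unsigned char b[16], unsigned int sh1, unsigned int sh2)
{
  unsigned int sh = (sh1 + sh2) % 16;
  (void) b;
  if (sh)
    model_sld (out, a, a, sh);
  else
    memmove (out, a, 16);
}

/* Return 1 if the merged form agrees with the two-step reference.  */
int
results_match (const unsigned char a[16], const unsigned char b[16],
               unsigned int sh1, unsigned int sh2)
{
  unsigned char want[16], got[16];
  two_step (want, a, b, sh1, sh2);
  merged (got, a, b, sh1, sh2);
  return memcmp (want, got, 16) == 0;
}
```

With a holding bytes 0x00..0x0f and b holding 0x10..0x1f,
results_match (a, b, 8, 8) should return 0 in this model, while
results_match (a, a, 8, 8) should return 1, which matches the point
that the merge is only safe when operands 1 and 2 are the same.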