Hi Haochen,

on 2023/2/20 10:04, HAO CHEN GUI wrote:
> Hi,
>   This patch merges two "vsldoi" insns when their sources are the
> same. Particularly, it is simplified to be one move if the total
> shift is multiples of 16 bytes.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no
> regressions.
> 
> Thanks
> Gui Haochen
> 
> 
> ChangeLog
> 2023-02-20  Haochen Gui <guih...@linux.ibm.com>
> 
> gcc/
>       * config/rs6000/altivec.md (*altivec_vsldoi_dup_<mode>): New
>       insn_and_split to merge two vsldoi.
> 
> gcc/testsuite/
>       * gcc.target/powerpc/vsldoi_merge.c: New.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 84660073f32..22e9c4c1fc5 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -2529,6 +2529,35 @@ (define_insn "altivec_vsldoi_<mode>"
>    "vsldoi %0,%1,%2,%3"
>    [(set_attr "type" "vecperm")])
> 
> +(define_insn_and_split "*altivec_vsldoi_dup_<mode>"
> +  [(set (match_operand:VM 0 "register_operand" "=v")
> +     (unspec:VM [(unspec:VM [(match_operand:VM 1 "register_operand" "v")
> +                             (match_operand:VM 2 "register_operand" "v")
> +                             (match_operand:QI 3 "immediate_operand" "i")]
> +                            UNSPEC_VSLDOI)
> +                 (unspec:VM [(match_dup 1)
> +                             (match_dup 2)
> +                             (match_dup 3)]
> +                            UNSPEC_VSLDOI)
> +                 (match_operand:QI 4 "immediate_operand" "i")]
> +                UNSPEC_VSLDOI))]
> +  "TARGET_ALTIVEC"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
> +{
> +  unsigned int shift1 = UINTVAL (operands[3]);
> +  unsigned int shift2 = UINTVAL (operands[4]);
> +
> +  unsigned int shift = (shift1 + shift2) % 16;
> +  if (shift)
> +    emit_insn (gen_altivec_vsldoi_<mode> (operands[0], operands[1],
> +                                       operands[1], GEN_INT (shift)));
> +  else
> +    emit_move_insn (operands[0], operands[1]);
> +  DONE;
> +})

This patch looks wrong: I think we need to ensure that operand 1 and
operand 2 are the same (a match_dup 1 for operand 2)? A simple
counterexample for the proposed pattern is two vectors a = {A0, A1} and
b = {B0, B1} (A0/A1/B0/B1 are all doublewords) on BE:

a = vec_sld (a, b, 8);  // (1) res a = {A1, B0}
a = vec_sld (a, a, 8);  // (2) res a = {B0, A1}

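With this patch the result would be the unexpected a = {A0, A1}: the
total shift is (8 + 8) % 16 = 0, so the split emits a plain move of
operand 1 and operand 2 (b) is dropped.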
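
To make the counterexample easy to check without Power hardware, here is a
rough portable C model of it. It is only a sketch of the BE byte semantics
of vsldoi, not the real instruction, and the helper names (vsldoi_model,
two_step, merged_by_patch, same_result) are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of vsldoi in BE byte order: concatenate a and b
   into 32 bytes and take the 16 bytes starting at byte offset sh.  */
static void
vsldoi_model (uint8_t out[16], const uint8_t a[16], const uint8_t b[16],
              unsigned sh)
{
  uint8_t cat[32];
  assert (sh < 16);
  memcpy (cat, a, 16);
  memcpy (cat + 16, b, 16);
  memcpy (out, cat + sh, 16);
}

/* The source-level sequence: t = sld (a, b, sh1); out = sld (t, t, sh2).  */
static void
two_step (uint8_t out[16], const uint8_t a[16], const uint8_t b[16],
          unsigned sh1, unsigned sh2)
{
  uint8_t t[16];
  vsldoi_model (t, a, b, sh1);
  vsldoi_model (out, t, t, sh2);
}

/* What the split in the patch emits: it only looks at operand 1 (a).  */
static void
merged_by_patch (uint8_t out[16], const uint8_t a[16],
                 unsigned sh1, unsigned sh2)
{
  unsigned sh = (sh1 + sh2) % 16;
  if (sh)
    vsldoi_model (out, a, a, sh);
  else
    memcpy (out, a, 16);
}

/* Return 1 if the merged form matches the two-step sequence, else 0.  */
static int
same_result (const uint8_t a[16], const uint8_t b[16],
             unsigned sh1, unsigned sh2)
{
  uint8_t expect[16], got[16];
  two_step (expect, a, b, sh1, sh2);
  merged_by_patch (got, a, sh1, sh2);
  return memcmp (expect, got, 16) == 0;
}
```

With a[i] = 0xA0 + i and b[i] = 0xB0 + i, same_result (a, b, 8, 8) is 0,
while same_result (a, a, 8, 8) is 1, which is why I think the merge is
only valid when operand 2 is a match_dup of operand 1.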

Since this patch passed bootstrap and regression testing, I think we don't
have enough coverage for this part, so it would be good to add a dg-do run
test case as well. :)

BR,
Kewen
