Re: [PATCH 4/13 ver 3] rs6000, extend the current vec_{un,}signed{e,o} built-ins

Kewen.Lin Tue, 04 Jun 2024 00:20:00 -0700

Hi,

on 2024/5/29 23:58, Carl Love wrote:
> Updated the patch per the feedback comments from the previous version.
> 
>                                  Carl 
> -------------------------------------------------------
> 
> rs6000, extend the current vec_{un,}signed{e,o} built-ins
> 
> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
> convert a vector of floats to signed/unsigned long long ints.  Extend the
> existing vec_{un,}signed{e,o} built-ins to handle the argument
> vector of floats to return the even/odd signed/unsigned integers.
> 
> The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
> vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
> built-ins.
> 
> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
> now for internal use only. They are not documented and they do not
> have testcases.
> 
> The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
> vec_signed{e,o}, remove.
> 
> The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
> vec_unsigned{e,o}, remove.
> 
> The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
> vec_unsigned, remove.
> 
> The built-in __builtin_vsx_xvcvspuxws is redundant as it is covered by
> vec_unsigned, remove.


I prefer to move these removals into sub-patch 2/13 or split them out into
a new patch, since they don't match the subject of this patch.  Moving them
to sub-patch 2/13 looks good as they are all about vec_{un,}signed{,e,o}.

> 
> Add testcases and update documentation.
> 
> gcc/ChangeLog:
>       * config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds_low,
>       __builtin_vsx_xvcvspuxds_low): New built-in definitions.
>       (__builtin_vsx_xvcvspuxds): Fix return type.
>       (XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
>       VEC_VUNSIGNEDE_V4SF respectively.
>       (vsx_xvcvspsxds, vsx_xvcvspuxds): Renamed vsignede_v4sf,
>       vunsignede_v4sf respectively.
>       (__builtin_vsx_xvcvdpsxws, __builtin_vsx_xvcvdpuxws,
>       __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws): Removed.
>       * config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
>       vec_unsignede,vec_unsignedo):  Add new overloaded specifications.
>       * config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
>       vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
>       * doc/extend.texi (vec_signedo, vec_signede): Add documentation.
> 
> gcc/testsuite/ChangeLog:
>       * gcc.target/powerpc/builtins-3-runnable.c: New tests for the added
>       overloaded built-ins.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         | 25 ++----
>  gcc/config/rs6000/rs6000-overload.def         |  8 ++
>  gcc/config/rs6000/vsx.md                      | 88 +++++++++++++++++++
>  gcc/doc/extend.texi                           | 10 +++
>  .../gcc.target/powerpc/builtins-3-runnable.c  | 51 +++++++++--
>  5 files changed, 157 insertions(+), 25 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index bf9a0ae22fc..cea2649b86c 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1688,32 +1688,23 @@
>    const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
>      XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
>  
> -  const vsi __builtin_vsx_xvcvdpsxws (vd);
> -    XVCVDPSXWS vsx_xvcvdpsxws {}
> -
> -  const vsll __builtin_vsx_xvcvdpuxds (vd);
> -    XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
> -
>    const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>      XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>  
> -  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
> -    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
> -
> -  const vsi __builtin_vsx_xvcvdpuxws (vd);
> -    XVCVDPUXWS vsx_xvcvdpuxws {}
> -
>    const vd __builtin_vsx_xvcvspdp (vf);
>      XVCVSPDP vsx_xvcvspdp {}
>  
>    const vsll __builtin_vsx_xvcvspsxds (vf);
> -    XVCVSPSXDS vsx_xvcvspsxds {}
> +    VEC_VSIGNEDE_V4SF vsignede_v4sf {}

We should rename __builtin_vsx_xvcvspsxds to
__builtin_vsx_vsignede_v4sf.  One reason is to align with
the existing built-ins; more importantly, it doesn't
map 1-1 to a single xvcvspsxds instruction, so keeping
that mnemonic in the name can be misleading.

> +
> +  const vsll __builtin_vsx_xvcvspsxds_low (vf);

Ditto.

> +    VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
>  
> -  const vsll __builtin_vsx_xvcvspuxds (vf);
> -    XVCVSPUXDS vsx_xvcvspuxds {}
> +  const vull __builtin_vsx_xvcvspuxds (vf);

Ditto.

> +    VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
>  
> -  const vsi __builtin_vsx_xvcvspuxws (vf);
> -    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
> +  const vull __builtin_vsx_xvcvspuxds_low (vf);

Ditto.

> +    VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
>  
>    const vd __builtin_vsx_xvcvsxddp (vsll);
>      XVCVSXDDP vsx_floatv2div2df2 {}
> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
> index 84bd9ae6554..4d857bb1af3 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -3307,10 +3307,14 @@
>  [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
>    vsi __builtin_vec_vsignede (vd);
>      VEC_VSIGNEDE_V2DF
> +  vsll __builtin_vec_vsignede (vf);
> +    VEC_VSIGNEDE_V4SF
>  
>  [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
>    vsi __builtin_vec_vsignedo (vd);
>      VEC_VSIGNEDO_V2DF
> +  vsll __builtin_vec_vsignedo (vf);
> +    VEC_VSIGNEDO_V4SF
>  
>  [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
>    vsi __builtin_vec_signexti (vsc);
> @@ -4433,10 +4437,14 @@
>  [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
>    vui __builtin_vec_vunsignede (vd);
>      VEC_VUNSIGNEDE_V2DF
> +  vull __builtin_vec_vunsignede (vf);
> +    VEC_VUNSIGNEDE_V4SF
>  
>  [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
>    vui __builtin_vec_vunsignedo (vd);
>      VEC_VUNSIGNEDO_V2DF
> +  vull __builtin_vec_vunsignedo (vf);
> +    VEC_VUNSIGNEDO_V4SF
>  
>  [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
>    vui __builtin_vec_extract_exp (vf);
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index f135fa079bd..a8f3d459232 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -2704,6 +2704,94 @@ (define_expand "vsx_xvcvsp<su>xds"
>    DONE;
>  })
>  
> +;; Convert low vector elements of 32-bit floating point numbers to vector of
> +;; 64-bit signed

Maybe:

;; Convert float vector even elements to {un,}signed long long vector

> +(define_expand "vsignede_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +       /* Shift left one word to put even word in correct location */
> +       rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +       rtx rtx_val = GEN_INT (4);
> +       emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +                                       rtx_val));
> +       emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
> +    }

I think this is wrong: the even elements on BE are words 0 and 2, so BE
doesn't require vector shifting (similar to doublee<mode>2), while LE does.

> +  else
> +    emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
> +
> +  DONE;
> +})
> +
> +;; Convert high vector elements of 32-bit floating point numbers to vector of
> +;; 64-bit signed

Ditto.

> +(define_expand "vsignedo_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));

As above, this is for odd elements, so BE needs vector shifting while LE
doesn't.

The vunsigned* expanders below need the corresponding fixes.
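
To make the intended element selection concrete, here is a minimal scalar
sketch in C.  It is only a reference model of the source-level result,
assuming the element numbering described above (even = elements 0 and 2,
odd = elements 1 and 3); it says nothing about register layout or about
where the vsldoi shift belongs.  The names model_v2di, model_signed_pick,
model_signede and model_signedo are hypothetical and are not part of GCC
or of this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference model, not GCC code.
   Assumption: "even" means elements 0 and 2 of the float vector,
   "odd" means elements 1 and 3.  */

typedef struct { long long v[2]; } model_v2di;

/* Take elements first and first + 2 and convert each one to
   long long, truncating toward zero as a C cast does.  */
static model_v2di
model_signed_pick (const float in[4], size_t first)
{
  model_v2di out;
  out.v[0] = (long long) in[first];
  out.v[1] = (long long) in[first + 2];
  return out;
}

/* Even elements: 0 and 2.  */
static model_v2di
model_signede (const float in[4])
{
  return model_signed_pick (in, 0);
}

/* Odd elements: 1 and 3.  */
static model_v2di
model_signedo (const float in[4])
{
  return model_signed_pick (in, 1);
}
```

With the input {14.930, 834.49, -3.3, -5.4}, this model gives {14, -3}
for the even elements and {834, -5} for the odd ones.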

> +  else
> +    {
> +      /* Shift left one word to put even word in correct location */
> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +                                       rtx_val));
> +      emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
> +    }
> +
> +  DONE;
> +})
> +
> +;; Convert low vector elements of 32-bit floating point numbers to vector of
> +;; 64-bit unsigned integers.
> +(define_expand "vunsignede_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      /* Shift left one word to put even word in correct location */
> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +                                       rtx_val));
> +      emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
> +    }
> +  else
> +    emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
> +
> +  DONE;
> +})
> +
> +;; Convert high vector elements of 32-bit floating point numbers to vector of
> +;; 64-bit unsigned integers.
> +(define_expand "vunsignedo_v4sf"
> +  [(match_operand:V2DI 0 "vsx_register_operand")
> +   (match_operand:V4SF 1 "vsx_register_operand")]
> +  "VECTOR_UNIT_VSX_P (V2DFmode)"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
> +  else
> +    {
> +      /* Shift left one word to put even word in correct location */
> +      rtx rtx_tmp = gen_reg_rtx (V4SFmode);
> +      rtx rtx_val = GEN_INT (4);
> +      emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
> +                                       rtx_val));
> +      emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
> +    }
> +
> +  DONE;
> +})
> +
>  ;; Generate float2 double
>  ;; convert two double to float
>  (define_expand "float2_v2df"
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 267fccd1512..b88e61641a2 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -22577,6 +22577,16 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
>  @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
>  @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>  
> +@smallexample
> +vector signed signed long long vec_signedo (vector float);
> +vector signed signed long long vec_signede (vector float);
> +vector unsigned signed long long vec_signedo (vector float);
> +vector unsigned signed long long vec_signede (vector float);
> +@end smallexample

Nit: s/signed long/long/

BR,
Kewen

> +
> +The overloaded built-ins @code{vec_signedo} and @code{vec_signede} are
> +additional extensions to the built-ins as documented in the PVIPR.
> +
>  @node PowerPC AltiVec Built-in Functions Available on ISA 2.07
>  @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 2.07
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> index 5dcdfbee791..557befc9a4a 100644
> --- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
> @@ -3,7 +3,7 @@
>  /* { dg-options "-maltivec -mvsx" } */
>  
>  #include <altivec.h> // vector
> -
> +#define DEBUG 1
>  #ifdef DEBUG
>  #include <stdio.h>
>  #endif
> @@ -81,14 +81,15 @@ void test_unsigned_int_result(int check, vector unsigned int vec_result,
>  }
>  
>  void test_ll_int_result(vector long long int vec_result,
> -                     vector long long int vec_expected)
> +                     vector long long int vec_expected,
> +                     char *string)
>  {
>       int i;
>  
>       for (i = 0; i < 2; i++)
>               if (vec_result[i] != vec_expected[i]) {
>  #ifdef DEBUG
> -                     printf("Test_ll_int_result: ");
> +                     printf("Test_ll_int_result %s: ", string);
>                       printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>                              i, vec_result[i], i, vec_expected[i]);
>  #else
> @@ -98,14 +99,15 @@ void test_ll_int_result(vector long long int vec_result,
>  }
>  
>  void test_ll_unsigned_int_result(vector long long unsigned int vec_result,
> -                              vector long long unsigned int vec_expected)
> +                              vector long long unsigned int vec_expected,
> +                              char *string)
>  {
>       int i;
>  
>       for (i = 0; i < 2; i++)
>               if (vec_result[i] != vec_expected[i]) {
>  #ifdef DEBUG
> -                     printf("Test_ll_unsigned_int_result: ");
> +                     printf("Test_ll_unsigned_int_result %s: ", string);
>                       printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>                              i, vec_result[i], i, vec_expected[i]);
>  #else
> @@ -292,7 +294,8 @@ int main()
>       vec_dble0 = (vector double){-124.930, 81234.49};
>       vec_ll_int_expected = (vector long long signed int){-124, 81234};
>       vec_ll_int_result = vec_signed (vec_dble0);
> -     test_ll_int_result (vec_ll_int_result, vec_ll_int_expected);
> +     test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
> +                         "vec_signed");
>  
>       /* Convert double precision vector float to vector int, even words */
>       vec_dble0 = (vector double){-124.930, 81234.49};
> @@ -321,12 +324,44 @@ int main()
>       test_unsigned_int_result (ALL, vec_uns_int_result,
>                                 vec_uns_int_expected);
>  
> +     /* Convert single precision vector float, even args, to vector
> +        signed long long int.  */
> +     vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
> +     vec_ll_int_expected = (vector signed long long int){834, -5};
> +     vec_ll_int_result = vec_signede (vec_flt0);
> +     test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
> +                         "vec_signede");
> +
> +     /* Convert single precision vector float, odd args, to vector
> +        signed long long int.  */
> +     vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
> +     vec_ll_int_expected = (vector signed long long int){14, -3};
> +     vec_ll_int_result = vec_signedo (vec_flt0);
> +     test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
> +                         "vec_signedo");
> +
> +     /* Convert single precision vector float, even args, to vector
> +        unsigned long long int.  */
> +     vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
> +     vec_ll_uns_int_expected = (vector unsigned long long int){834, 0};
> +     vec_ll_uns_int_result = vec_unsignede (vec_flt0);
> +     test_ll_unsigned_int_result (vec_ll_uns_int_result,
> +                                  vec_ll_uns_int_expected, "vec_unsignede");
> +
> +     /* Convert single precision vector float, odd args, to vector
> +        unsigned long long int.  */
> +     vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
> +     vec_ll_uns_int_expected = (vector unsigned long long int){14, 0};
> +     vec_ll_uns_int_result = vec_unsignedo (vec_flt0);
> +     test_ll_unsigned_int_result (vec_ll_uns_int_result,
> +                                  vec_ll_uns_int_expected, "vec_unsignedo");
> +
>       /* Convert double precision float to long long unsigned int */
>       vec_dble0 = (vector double){124.930, 8134.49};
>       vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
>       vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>       test_ll_unsigned_int_result (vec_ll_uns_int_result,
> -                                  vec_ll_uns_int_expected);
> +                                  vec_ll_uns_int_expected, "vec_unsigned");
>  
>       /* Convert double precision float to long long unsigned int. Negative
>          arguments.  */
> @@ -334,7 +369,7 @@ int main()
>       vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
>       vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>       test_ll_unsigned_int_result (vec_ll_uns_int_result,
> -                                  vec_ll_uns_int_expected);
> +                                  vec_ll_uns_int_expected, "vec_unsigned");
>  
>       /* Convert double precision vector float to vector unsigned int,
>          even words.  Negative arguments */
