Hi Carl,

on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
> GCC maintainers:
> 
> The following patch fixes errors in the arguments in the
> __builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx builtin
> definitions.  Note, these builtins are used by the overloaded
> __builtin_vec_xst_trunc builtin.
> 
> The patch adds a new overloaded builtin definition for
> __builtin_vec_xst_trunc for the third argument to be unsigned and
> signed long int.
> 
> A new testcase is added for the various overloaded versions of
> __builtin_vec_xst_trunc.
> 
> The patch has been tested on Power 10 with no new regressions.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>                     Carl
> 
> -------------------------------------------
> rs6000: Fix __builtin_vec_xst_trunc definition
> 
> Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx
> and __builtin_altivec_tr_stxvrwx to handle the short and word cases.  The
> arguments for these two builtins are wrong.  This patch fixes the wrong
> arguments for the builtins.
> 
> Additionally, the patch adds a new __builtin_vec_xst_trunc overloaded
> version for the destination being signed or unsigned long int.
> 
> A runnable test case is added to test each of the overloaded definitions
> of __builtin_vec_xst_tru
> 
> gcc/
>       * config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx,
>       __builtin_altivec_tr_stxvrwx): Fix type of second argument.
>       Add, definition for send argument to be signed long.
>       * config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc):
>       add definition with thrird arument signed and unsigned long.
>       * doc/extend.texi (__builtin_vec_xst_trunc): Add documentation for
>       new unsinged long and signed long versions.
> 
> gcc/testsuite/
>       * gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case
>       for __builtin_vec_xst_trunc builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   7 +-
>  gcc/config/rs6000/rs6000-overload.def         |   4 +
>  gcc/doc/extend.texi                           |   2 +
>  .../powerpc/vsx-builtin-vec_xst_trunc.c       | 217 ++++++++++++++++++
>  4 files changed, 228 insertions(+), 2 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..a378491b358 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -3161,12 +3161,15 @@
>    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
>      TR_STXVRBX vsx_stxvrbx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>      TR_STXVRHX vsx_stxvrhx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>      TR_STXVRWX vsx_stxvrwx {stvec}

Good catching!

>  
> +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long *);
> +    TR_STXVRLX vsx_stxvrdx {stvec}
> +

This is mapped to the one used for type long long, it's a hard mapping,
IMHO it's wrong and not consistent with what the users expect, since on Power
the size of type long int is 4 bytes at -m32 while 8 bytes at -m64, this
implementation binding to 8 bytes can cause trouble in 32-bit.  I wonder if
it's a good idea to add one overloaded version for type long int, for now
openxl also emits error message for long int type pointer (see its doc [1]),
users can use casting to make it to the acceptable pointer types (long long
or int as its size).

[1] 
https://www.ibm.com/docs/en/openxl-c-and-cpp-lop/17.1.1?topic=functions-vec-xst-trunc


>    void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
>      TR_STXVRDX vsx_stxvrdx {stvec}
>  
> diff --git a/gcc/config/rs6000/rs6000-overload.def 
> b/gcc/config/rs6000/rs6000-overload.def
> index c582490c084..54b7ae5e51b 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -4872,6 +4872,10 @@
>      TR_STXVRWX  TR_STXVRWX_S
>    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned int *);
>      TR_STXVRWX  TR_STXVRWX_U
> +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long *);
> +    TR_STXVRLX  TR_STXVRLX_S
> +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long *);
> +    TR_STXVRLX  TR_STXVRLX_U
>    void __builtin_vec_xst_trunc (vsq, signed long long, signed long long *);
>      TR_STXVRDX  TR_STXVRDX_S
>    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long long *);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e426a2eb7d8..7e2ae790ab3 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -18570,10 +18570,12 @@ instructions.
>  @defbuiltin{{void} vec_xst_trunc (vector signed __int128, signed long long, 
> signed char *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, 
> signed short *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, 
> signed int *)}
> +@defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, 
> signed long *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed long long, 
> signed long long *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long 
> long, unsigned char *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long 
> long, unsigned short *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long 
> long, unsigned int *)}
> +@defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long 
> long, unsigned long *)}
>  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, signed long 
> long, unsigned long long *)}
>  
>  Truncate and store the rightmost element of a vector, as if implemented by 
> the
> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c 
> b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> new file mode 100644
> index 00000000000..7108109560d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> @@ -0,0 +1,217 @@
> +/* Test of __builtin_vec_xst_trunc  */
> +
> +/* { dg-do run { target power10_hw } } */
> +/* { dg-require-effective-target power10_ok } */

Normally _hw is stricter than _ok, you already check power10_hw,
I think power10_ok is useless here.

> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> +
> +#include <altivec.h>
> +#include <stdio.h>
> +#include <inttypes.h>
> +#include <string.h>
> +#include <stdlib.h>
> +
> +#define DEBUG 0
> +#define TRUE  1
> +#define FALSE 0
> +#define SIZE  4
> +
> +vector signed __int128 zero_vsint128 = {0x0};
> +
> +vector signed __int128 store_data[SIZE] = {
> +{ (__int128) 0x79BD000000000000 << 64 | (__int128) 0x123456789abcdef8ULL},
> +{ (__int128) 0x8ACE000000000000 << 64 | (__int128) 0xfedcba9876543217ULL},
> +{ (__int128) 0x1357000000000000 << 64 | (__int128) 0xccccccccccccccccULL},
> +{ (__int128) 0xf000000000000000 << 64 | (__int128) 0xaaaaaaaaaaaaaaaaULL}
> +};
> +
> +signed char signed_char_expected[SIZE] = {0xF8ULL, 0x17, 0xCC, 0xAA};
> +signed short signed_short_expected[SIZE] = {0xDEF8, 0x3217, 0xcccc, 0xaaaa, 
> };
> +signed int signed_int_expected[SIZE] = {0x9ABCDEF8, 0x76543217, 0xCCCCCCCC,
> +                                     0xAAAAAAAA};
> +signed long int signed_long_expected[SIZE] = {0x123456789ABCDEF8,
> +                                                     0xFEDCBA9876543217ULL,
> +                                                     0xCCCCCCCCCCCCCCCCULL,
> +                                                     0xAAAAAAAAAAAAAAAAULL};
> +signed long long int signed_long_long_expected[SIZE] = 
> {0x123456789ABCDEF8ULL,
> +                                                     0xFEDCBA9876543217ULL,
> +                                                     0xCCCCCCCCCCCCCCCCULL,
> +                                                     0xAAAAAAAAAAAAAAAAULL};
> +
> +union conv_t {
> +  vector signed __int128 vsi128;
> +  unsigned long long ull[2];
> +  signed char schar[16];
> +  signed __int128 s128;
> +} conv;
> +
> +int check_expected_byte (signed char expected,
> +                      signed char actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected half values don't match. \n");
> +      printf ("  Expected 0x%x & 0xFFFF, actual 0x%x & 0xFFFF\n",
> +           expected & 0xFF, actual & 0xFF);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +int check_expected_half (signed short int expected,
> +                      signed short int  actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected short values don't match. \n");
> +      printf ("  Expected 0x%x, actual 0x%x\n",
> +           expected & 0xFFFF, actual & 0xFFFF);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +int check_expected_int (signed int expected,
> +                     signed int  actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected int values don't match. \n");
> +      printf ("  Expected 0x%x, actual 0x%x\n",
> +           expected, actual);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +int check_expected_long (signed long int expected,
> +                      signed long int  actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected long values don't match. \n");
> +      printf ("  Expected 0x%x, actual 0x%x\n",
> +           expected, actual);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +int check_expected_long_long (signed long long int expected,
> +                           signed long long int  actual)
> +{
> +  /* Return TRUE if expected and actual values all match.  */
> +  if (expected != actual)
> +    {
> +#if DEBUG
> +      printf ("ERROR: Expected long long values don't match. \n");
> +      printf ("  Expected 0x%x, actual 0x%x\n",
> +           expected, actual);
> +#endif
> +      return FALSE;
> +    }
> +  return TRUE;
> +}
> +
> +void print_store_data (vector signed __int128 *store_data, int size)
> +{
> +#if DEBUG
> +  union conv_t val;
> +  int i;
> +  
> +  for  (i = 0; i < size; i++)
> +    {
> +      val.vsi128 = store_data[i];
> +      printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], 
> val.ull[0]);
> +    }
> +#endif
> +}
> +
> +
> +void print_raw_buffer (vector signed __int128 *rawbuffer, int size)
> +{
> +#if DEBUG
> +  union conv_t val;
> +  int i;
> +  
> +  for  (i = 0; i < size; i++)
> +    {
> +      val.vsi128 = rawbuffer[i];
> +      printf ("rawbuffer[%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
> +    }
> +#endif
> +}
> +
> +int
> +main () {
> +  int i;
> +  
> +  vector signed __int128 rawbuffer[SIZE];
> +  signed char * vsbuffer_char = (signed char *)rawbuffer;
> +  signed short int * vsbuffer_short = (signed short int *)rawbuffer;
> +  signed int * vsbuffer_int = (signed int *)rawbuffer;
> +  signed long int * vsbuffer_long = (signed long *)rawbuffer;
> +  signed long long int * vsbuffer_long_long = (signed long long *)rawbuffer;
> +
> +  for  (i = 0; i < SIZE; i++)
> +    rawbuffer[i] = zero_vsint128;
> +
> +  print_store_data (store_data, SIZE);
> +
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(char),
> +                            vsbuffer_char);
> +      check_expected_byte (signed_char_expected[i], vsbuffer_char[i]);
> +    }
> +  
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(short int),
> +                            vsbuffer_short);
> +      check_expected_half (signed_short_expected[i], vsbuffer_short[i]);
> +    }
> +
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(int),
> +                            vsbuffer_int);
> +      check_expected_int (signed_int_expected[i], vsbuffer_int[i]);
> +    }
> +
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long int),
> +                                    vsbuffer_long);
> +      check_expected_long (signed_long_long_expected[i],
> +                        vsbuffer_long[i]);
> +    }
> +
> +  for  (i = 0; i < SIZE; i++)
> +    {
> +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long int),
> +                            vsbuffer_long_long);
> +      check_expected_long_long (signed_long_long_expected[i],
> +                             vsbuffer_long_long[i]);
> +    }
> +
> +  print_raw_buffer (rawbuffer, SIZE);
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler {\mstxvrbx\M} } } */
> +/* { dg-final { scan-assembler {\mstxvrhx\M} } } */
> +/* { dg-final { scan-assembler {\mstxvrwx\M} } } */
> +/* { dg-final { scan-assembler {\mstxvrdx\M} } } */

Could you check the exact times instead?

BR,
Kewen

Reply via email to