On Mon, 2023-05-22 at 17:04 +0800, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
> > GCC maintainers:
> > 
> > The following patch fixes errors in the arguments in the
> > __builtin_altivec_tr_stxvrhx,       __builtin_altivec_tr_stxvrwx
> > builtin
> > definitions.  Note, these builtins are used by the overloaded
> > __builtin_vec_xst_trunc builtin.
> > 
> > The patch adds a new overloaded builtin definition for
> > __builtin_vec_xst_trunc for the third argument to be unsigned and
> > signed long int.
> > 
> > A new testcase is added for the various overloaded versions of
> > __builtin_vec_xst_trunc.
> > 
> > The patch has been tested on Power 10 with no new regressions.
> > 
> > Please let me know if the patch is acceptable for
> > mainline.  Thanks.
> > 
> >                     Carl
> > 
> > -------------------------------------------
> > rs6000: Fix __builtin_vec_xst_trunc definition
> > 
> > Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx
> > and __builtin_altivec_tr_stxvrwx to handle the short and word
> > cases.  The
> > arguments for these two builtins are wrong.  This patch fixes the
> > wrong
> > arguments for the builtins.
> > 
> > Additionally, the patch adds a new __builtin_vec_xst_trunc
> > overloaded
> > version for the destination being signed or unsigned long int.
> > 
> > A runnable test case is added to test each of the overloaded
> > definitions
> > of __builtin_vec_xst_tru
> > 
> > gcc/
> >     * config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx,
> >     __builtin_altivec_tr_stxvrwx): Fix type of second argument.
> >     Add, definition for send argument to be signed long.
> >     * config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc):
> >     add definition with thrird arument signed and unsigned long.
> >     * doc/extend.texi (__builtin_vec_xst_trunc): Add documentation
> > for
> >     new unsinged long and signed long versions.
> > 
> > gcc/testsuite/
> >     * gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case
> >     for __builtin_vec_xst_trunc builtin.
> > ---
> >  gcc/config/rs6000/rs6000-builtins.def         |   7 +-
> >  gcc/config/rs6000/rs6000-overload.def         |   4 +
> >  gcc/doc/extend.texi                           |   2 +
> >  .../powerpc/vsx-builtin-vec_xst_trunc.c       | 217
> > ++++++++++++++++++
> >  4 files changed, 228 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-builtin-
> > vec_xst_trunc.c
> > 
> > diff --git a/gcc/config/rs6000/rs6000-builtins.def
> > b/gcc/config/rs6000/rs6000-builtins.def
> > index 638d0bc72ca..a378491b358 100644
> > --- a/gcc/config/rs6000/rs6000-builtins.def
> > +++ b/gcc/config/rs6000/rs6000-builtins.def
> > @@ -3161,12 +3161,15 @@
> >    void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char
> > *);
> >      TR_STXVRBX vsx_stxvrbx {stvec}
> >  
> > -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int
> > *);
> > +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
> > short *);
> >      TR_STXVRHX vsx_stxvrhx {stvec}
> >  
> > -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
> > short *);
> > +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int
> > *);
> >      TR_STXVRWX vsx_stxvrwx {stvec}
> 
> Good catching!
> 
> >  
> > +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long
> > *);
> > +    TR_STXVRLX vsx_stxvrdx {stvec}
> > +
> 
> This is mapped to the one used for type long long, it's a hard
> mapping,
> IMHO it's wrong and not consistent with what the users expect, since
> on Power
> the size of type long int is 4 bytes at -m32 while 8 bytes at -m64,
> this
> implementation binding to 8 bytes can cause trouble in 32-bit.  I
> wonder if
> it's a good idea to add one overloaded version for type long int, for
> now
> openxl also emits error message for long int type pointer (see its
> doc [1]),
> users can use casting to make it to the acceptable pointer types
> (long long
> or int as its size).
> 
> [1] 
> https://www.ibm.com/docs/en/openxl-c-and-cpp-lop/17.1.1?topic=functions-vec-xst-trunc
> 
> 

If I understand this correctly, the "signed long" is mapped to type
"long long int"?  Just curious, where is the mapping done?

So I believe you would like to have an additional overloaded
definition:

 void __builtin_vec_xst_trunc (vuq, signed long long, long *);

Am I understanding this correctly?  I added the above definition.

> >    void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long
> > long *);
> >      TR_STXVRDX vsx_stxvrdx {stvec}
> >  
> > diff --git a/gcc/config/rs6000/rs6000-overload.def
> > b/gcc/config/rs6000/rs6000-overload.def
> > index c582490c084..54b7ae5e51b 100644
> > --- a/gcc/config/rs6000/rs6000-overload.def
> > +++ b/gcc/config/rs6000/rs6000-overload.def
> > @@ -4872,6 +4872,10 @@
> >      TR_STXVRWX  TR_STXVRWX_S
> >    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
> > int *);
> >      TR_STXVRWX  TR_STXVRWX_U
> > +  void __builtin_vec_xst_trunc (vsq, signed long long, signed long
> > *);
> > +    TR_STXVRLX  TR_STXVRLX_S
> > +  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
> > long *);
> > +    TR_STXVRLX  TR_STXVRLX_U
> >    void __builtin_vec_xst_trunc (vsq, signed long long, signed long
> > long *);
> >      TR_STXVRDX  TR_STXVRDX_S
> >    void __builtin_vec_xst_trunc (vuq, signed long long, unsigned
> > long long *);
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index e426a2eb7d8..7e2ae790ab3 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -18570,10 +18570,12 @@ instructions.
> >  @defbuiltin{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed char *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed short *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed int *)}
> > +@defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed long *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed
> > long long, signed long long *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned char *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned short *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned int *)}
> > +@defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned long *)}
> >  @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128,
> > signed long long, unsigned long long *)}
> >  
> >  Truncate and store the rightmost element of a vector, as if
> > implemented by the
> > diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-
> > vec_xst_trunc.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-
> > vec_xst_trunc.c
> > new file mode 100644
> > index 00000000000..7108109560d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c
> > @@ -0,0 +1,217 @@
> > +/* Test of __builtin_vec_xst_trunc  */
> > +
> > +/* { dg-do run { target power10_hw } } */
> > +/* { dg-require-effective-target power10_ok } */
> 
> Normally _hw is stricter than _ok, you already check power10_hw,
> I think power10_ok is useless here.

OK, removed the  dg-require-effective-target power10_ok line.

> 
> > +/* { dg-require-effective-target int128 } */
> > +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> > +
> > +#include <altivec.h>
> > +#include <stdio.h>
> > +#include <inttypes.h>
> > +#include <string.h>
> > +#include <stdlib.h>
> > +
> > +#define DEBUG 0
> > +#define TRUE  1
> > +#define FALSE 0
> > +#define SIZE  4
> > +
> > +vector signed __int128 zero_vsint128 = {0x0};
> > +
> > +vector signed __int128 store_data[SIZE] = {
> > +{ (__int128) 0x79BD000000000000 << 64 | (__int128)
> > 0x123456789abcdef8ULL},
> > +{ (__int128) 0x8ACE000000000000 << 64 | (__int128)
> > 0xfedcba9876543217ULL},
> > +{ (__int128) 0x1357000000000000 << 64 | (__int128)
> > 0xccccccccccccccccULL},
> > +{ (__int128) 0xf000000000000000 << 64 | (__int128)
> > 0xaaaaaaaaaaaaaaaaULL}
> > +};
> > +
> > +signed char signed_char_expected[SIZE] = {0xF8ULL, 0x17, 0xCC,
> > 0xAA};
> > +signed short signed_short_expected[SIZE] = {0xDEF8, 0x3217,
> > 0xcccc, 0xaaaa, };
> > +signed int signed_int_expected[SIZE] = {0x9ABCDEF8, 0x76543217,
> > 0xCCCCCCCC,
> > +                                   0xAAAAAAAA};
> > +signed long int signed_long_expected[SIZE] = {0x123456789ABCDEF8,
> > +                                                   0xFEDCBA9876543
> > 217ULL,
> > +                                                   0xCCCCCCCCCCCCC
> > CCCULL,
> > +                                                   0xAAAAAAAAAAAAA
> > AAAULL};
> > +signed long long int signed_long_long_expected[SIZE] =
> > {0x123456789ABCDEF8ULL,
> > +                                                   0xFEDCBA9876543
> > 217ULL,
> > +                                                   0xCCCCCCCCCCCCC
> > CCCULL,
> > +                                                   0xAAAAAAAAAAAAA
> > AAAULL};
> > +
> > +union conv_t {
> > +  vector signed __int128 vsi128;
> > +  unsigned long long ull[2];
> > +  signed char schar[16];
> > +  signed __int128 s128;
> > +} conv;
> > +
> > +int check_expected_byte (signed char expected,
> > +                    signed char actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected half values don't match. \n");
> > +      printf ("  Expected 0x%x & 0xFFFF, actual 0x%x & 0xFFFF\n",
> > +         expected & 0xFF, actual & 0xFF);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +int check_expected_half (signed short int expected,
> > +                    signed short int  actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected short values don't match. \n");
> > +      printf ("  Expected 0x%x, actual 0x%x\n",
> > +         expected & 0xFFFF, actual & 0xFFFF);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +int check_expected_int (signed int expected,
> > +                   signed int  actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected int values don't match. \n");
> > +      printf ("  Expected 0x%x, actual 0x%x\n",
> > +         expected, actual);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +int check_expected_long (signed long int expected,
> > +                    signed long int  actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected long values don't match. \n");
> > +      printf ("  Expected 0x%x, actual 0x%x\n",
> > +         expected, actual);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +int check_expected_long_long (signed long long int expected,
> > +                         signed long long int  actual)
> > +{
> > +  /* Return TRUE if expected and actual values all match.  */
> > +  if (expected != actual)
> > +    {
> > +#if DEBUG
> > +      printf ("ERROR: Expected long long values don't match. \n");
> > +      printf ("  Expected 0x%x, actual 0x%x\n",
> > +         expected, actual);
> > +#endif
> > +      return FALSE;
> > +    }
> > +  return TRUE;
> > +}
> > +
> > +void print_store_data (vector signed __int128 *store_data, int
> > size)
> > +{
> > +#if DEBUG
> > +  union conv_t val;
> > +  int i;
> > +  
> > +  for  (i = 0; i < size; i++)
> > +    {
> > +      val.vsi128 = store_data[i];
> > +      printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1],
> > val.ull[0]);
> > +    }
> > +#endif
> > +}
> > +
> > +
> > +void print_raw_buffer (vector signed __int128 *rawbuffer, int
> > size)
> > +{
> > +#if DEBUG
> > +  union conv_t val;
> > +  int i;
> > +  
> > +  for  (i = 0; i < size; i++)
> > +    {
> > +      val.vsi128 = rawbuffer[i];
> > +      printf ("rawbuffer[%d] = 0x%llx %llx\n", i, val.ull[1],
> > val.ull[0]);
> > +    }
> > +#endif
> > +}
> > +
> > +int
> > +main () {
> > +  int i;
> > +  
> > +  vector signed __int128 rawbuffer[SIZE];
> > +  signed char * vsbuffer_char = (signed char *)rawbuffer;
> > +  signed short int * vsbuffer_short = (signed short int
> > *)rawbuffer;
> > +  signed int * vsbuffer_int = (signed int *)rawbuffer;
> > +  signed long int * vsbuffer_long = (signed long *)rawbuffer;
> > +  signed long long int * vsbuffer_long_long = (signed long long
> > *)rawbuffer;
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    rawbuffer[i] = zero_vsint128;
> > +
> > +  print_store_data (store_data, SIZE);
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(char),
> > +                          vsbuffer_char);
> > +      check_expected_byte (signed_char_expected[i],
> > vsbuffer_char[i]);
> > +    }
> > +  
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(short int),
> > +                          vsbuffer_short);
> > +      check_expected_half (signed_short_expected[i],
> > vsbuffer_short[i]);
> > +    }
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(int),
> > +                          vsbuffer_int);
> > +      check_expected_int (signed_int_expected[i],
> > vsbuffer_int[i]);
> > +    }
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long int),
> > +                                  vsbuffer_long);
> > +      check_expected_long (signed_long_long_expected[i],
> > +                      vsbuffer_long[i]);
> > +    }
> > +
> > +  for  (i = 0; i < SIZE; i++)
> > +    {
> > +      __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long
> > int),
> > +                          vsbuffer_long_long);
> > +      check_expected_long_long (signed_long_long_expected[i],
> > +                           vsbuffer_long_long[i]);
> > +    }
> > +
> > +  print_raw_buffer (rawbuffer, SIZE);
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-assembler {\mstxvrbx\M} } } */
> > +/* { dg-final { scan-assembler {\mstxvrhx\M} } } */
> > +/* { dg-final { scan-assembler {\mstxvrwx\M} } } */
> > +/* { dg-final { scan-assembler {\mstxvrdx\M} } } */
> 
> Could you check the exact times instead?

OK, added check for the instruction counts.
> 
> BR,
> Kewen
> 

Reply via email to