On Mon, 2023-05-22 at 17:04 +0800, Kewen.Lin wrote: > Hi Carl, > > on 2023/5/11 02:06, Carl Love via Gcc-patches wrote: > > GCC maintainers: > > > > The following patch fixes errors in the arguments in the > > __builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx > > builtin > > definitions. Note, these builtins are used by the overloaded > > __builtin_vec_xst_trunc builtin. > > > > The patch adds a new overloaded builtin definition for > > __builtin_vec_xst_trunc for the third argument to be unsigned and > > signed long int. > > > > A new testcase is added for the various overloaded versions of > > __builtin_vec_xst_trunc. > > > > The patch has been tested on Power 10 with no new regressions. > > > > Please let me know if the patch is acceptable for > > mainline. Thanks. > > > > Carl > > > > ------------------------------------------- > > rs6000: Fix __builtin_vec_xst_trunc definition > > > > Built-in __builtin_vec_xst_trunc calls __builtin_altivec_tr_stxvrhx > > and __builtin_altivec_tr_stxvrwx to handle the short and word > > cases. The > > arguments for these two builtins are wrong. This patch fixes the > > wrong > > arguments for the builtins. > > > > Additionally, the patch adds a new __builtin_vec_xst_trunc > > overloaded > > version for the destination being signed or unsigned long int. > > > > A runnable test case is added to test each of the overloaded > > definitions > > of __builtin_vec_xst_tru > > > > gcc/ > > * config/rs6000/builtins.def (__builtin_altivec_tr_stxvrhx, > > __builtin_altivec_tr_stxvrwx): Fix type of second argument. > > Add, definition for send argument to be signed long. > > * config/rs6000/rs6000-overload.def (__builtin_vec_xst_trunc): > > add definition with thrird arument signed and unsigned long. > > * doc/extend.texi (__builtin_vec_xst_trunc): Add documentation > > for > > new unsinged long and signed long versions. > > > > gcc/testsuite/ > > * gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c: New test case > > for __builtin_vec_xst_trunc builtin. > > --- > > gcc/config/rs6000/rs6000-builtins.def | 7 +- > > gcc/config/rs6000/rs6000-overload.def | 4 + > > gcc/doc/extend.texi | 2 + > > .../powerpc/vsx-builtin-vec_xst_trunc.c | 217 > > ++++++++++++++++++ > > 4 files changed, 228 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-builtin- > > vec_xst_trunc.c > > > > diff --git a/gcc/config/rs6000/rs6000-builtins.def > > b/gcc/config/rs6000/rs6000-builtins.def > > index 638d0bc72ca..a378491b358 100644 > > --- a/gcc/config/rs6000/rs6000-builtins.def > > +++ b/gcc/config/rs6000/rs6000-builtins.def > > @@ -3161,12 +3161,15 @@ > > void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char > > *); > > TR_STXVRBX vsx_stxvrbx {stvec} > > > > - void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int > > *); > > + void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed > > short *); > > TR_STXVRHX vsx_stxvrhx {stvec} > > > > - void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed > > short *); > > + void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int > > *); > > TR_STXVRWX vsx_stxvrwx {stvec} > > Good catching! > > > > > + void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed long > > *); > > + TR_STXVRLX vsx_stxvrdx {stvec} > > + > > This is mapped to the one used for type long long, it's a hard > mapping, > IMHO it's wrong and not consistent with what the users expect, since > on Power > the size of type long int is 4 bytes at -m32 while 8 bytes at -m64, > this > implementation binding to 8 bytes can cause trouble in 32-bit. I > wonder if > it's a good idea to add one overloaded version for type long int, for > now > openxl also emits error message for long int type pointer (see its > doc [1]), > users can use casting to make it to the acceptable pointer types > (long long > or int as its size). > > [1] > https://www.ibm.com/docs/en/openxl-c-and-cpp-lop/17.1.1?topic=functions-vec-xst-trunc > >
If I understand this correctly, the "signed long" is mapped to type "long long int"? Just curious, where is the mapping done? So I believe you would like to have an additional overloaded definition: void __builtin_vec_xst_trunc (vuq, signed long long, long *); Am I understanding this correctly? I added the above definition. > > void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long > > long *); > > TR_STXVRDX vsx_stxvrdx {stvec} > > > > diff --git a/gcc/config/rs6000/rs6000-overload.def > > b/gcc/config/rs6000/rs6000-overload.def > > index c582490c084..54b7ae5e51b 100644 > > --- a/gcc/config/rs6000/rs6000-overload.def > > +++ b/gcc/config/rs6000/rs6000-overload.def > > @@ -4872,6 +4872,10 @@ > > TR_STXVRWX TR_STXVRWX_S > > void __builtin_vec_xst_trunc (vuq, signed long long, unsigned > > int *); > > TR_STXVRWX TR_STXVRWX_U > > + void __builtin_vec_xst_trunc (vsq, signed long long, signed long > > *); > > + TR_STXVRLX TR_STXVRLX_S > > + void __builtin_vec_xst_trunc (vuq, signed long long, unsigned > > long *); > > + TR_STXVRLX TR_STXVRLX_U > > void __builtin_vec_xst_trunc (vsq, signed long long, signed long > > long *); > > TR_STXVRDX TR_STXVRDX_S > > void __builtin_vec_xst_trunc (vuq, signed long long, unsigned > > long long *); > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > > index e426a2eb7d8..7e2ae790ab3 100644 > > --- a/gcc/doc/extend.texi > > +++ b/gcc/doc/extend.texi > > @@ -18570,10 +18570,12 @@ instructions. > > @defbuiltin{{void} vec_xst_trunc (vector signed __int128, signed > > long long, signed char *)} > > @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed > > long long, signed short *)} > > @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed > > long long, signed int *)} > > +@defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed > > long long, signed long *)} > > @defbuiltinx{{void} vec_xst_trunc (vector signed __int128, signed > > long long, signed long long *)} > > @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, > > signed long long, unsigned char *)} > > @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, > > signed long long, unsigned short *)} > > @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, > > signed long long, unsigned int *)} > > +@defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, > > signed long long, unsigned long *)} > > @defbuiltinx{{void} vec_xst_trunc (vector unsigned __int128, > > signed long long, unsigned long long *)} > > > > Truncate and store the rightmost element of a vector, as if > > implemented by the > > diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin- > > vec_xst_trunc.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin- > > vec_xst_trunc.c > > new file mode 100644 > > index 00000000000..7108109560d > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-vec_xst_trunc.c > > @@ -0,0 +1,217 @@ > > +/* Test of __builtin_vec_xst_trunc */ > > + > > +/* { dg-do run { target power10_hw } } */ > > +/* { dg-require-effective-target power10_ok } */ > > Normally _hw is stricter than _ok, you already check power10_hw, > I think power10_ok is useless here. OK, removed the dg-require-effective-target power10_ok line. > > > +/* { dg-require-effective-target int128 } */ > > +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */ > > + > > +#include <altivec.h> > > +#include <stdio.h> > > +#include <inttypes.h> > > +#include <string.h> > > +#include <stdlib.h> > > + > > +#define DEBUG 0 > > +#define TRUE 1 > > +#define FALSE 0 > > +#define SIZE 4 > > + > > +vector signed __int128 zero_vsint128 = {0x0}; > > + > > +vector signed __int128 store_data[SIZE] = { > > +{ (__int128) 0x79BD000000000000 << 64 | (__int128) > > 0x123456789abcdef8ULL}, > > +{ (__int128) 0x8ACE000000000000 << 64 | (__int128) > > 0xfedcba9876543217ULL}, > > +{ (__int128) 0x1357000000000000 << 64 | (__int128) > > 0xccccccccccccccccULL}, > > +{ (__int128) 0xf000000000000000 << 64 | (__int128) > > 0xaaaaaaaaaaaaaaaaULL} > > +}; > > + > > +signed char signed_char_expected[SIZE] = {0xF8ULL, 0x17, 0xCC, > > 0xAA}; > > +signed short signed_short_expected[SIZE] = {0xDEF8, 0x3217, > > 0xcccc, 0xaaaa, }; > > +signed int signed_int_expected[SIZE] = {0x9ABCDEF8, 0x76543217, > > 0xCCCCCCCC, > > + 0xAAAAAAAA}; > > +signed long int signed_long_expected[SIZE] = {0x123456789ABCDEF8, > > + 0xFEDCBA9876543 > > 217ULL, > > + 0xCCCCCCCCCCCCC > > CCCULL, > > + 0xAAAAAAAAAAAAA > > AAAULL}; > > +signed long long int signed_long_long_expected[SIZE] = > > {0x123456789ABCDEF8ULL, > > + 0xFEDCBA9876543 > > 217ULL, > > + 0xCCCCCCCCCCCCC > > CCCULL, > > + 0xAAAAAAAAAAAAA > > AAAULL}; > > + > > +union conv_t { > > + vector signed __int128 vsi128; > > + unsigned long long ull[2]; > > + signed char schar[16]; > > + signed __int128 s128; > > +} conv; > > + > > +int check_expected_byte (signed char expected, > > + signed char actual) > > +{ > > + /* Return TRUE if expected and actual values all match. */ > > + if (expected != actual) > > + { > > +#if DEBUG > > + printf ("ERROR: Expected half values don't match. \n"); > > + printf (" Expected 0x%x & 0xFFFF, actual 0x%x & 0xFFFF\n", > > + expected & 0xFF, actual & 0xFF); > > +#endif > > + return FALSE; > > + } > > + return TRUE; > > +} > > + > > +int check_expected_half (signed short int expected, > > + signed short int actual) > > +{ > > + /* Return TRUE if expected and actual values all match. */ > > + if (expected != actual) > > + { > > +#if DEBUG > > + printf ("ERROR: Expected short values don't match. \n"); > > + printf (" Expected 0x%x, actual 0x%x\n", > > + expected & 0xFFFF, actual & 0xFFFF); > > +#endif > > + return FALSE; > > + } > > + return TRUE; > > +} > > + > > +int check_expected_int (signed int expected, > > + signed int actual) > > +{ > > + /* Return TRUE if expected and actual values all match. */ > > + if (expected != actual) > > + { > > +#if DEBUG > > + printf ("ERROR: Expected int values don't match. \n"); > > + printf (" Expected 0x%x, actual 0x%x\n", > > + expected, actual); > > +#endif > > + return FALSE; > > + } > > + return TRUE; > > +} > > + > > +int check_expected_long (signed long int expected, > > + signed long int actual) > > +{ > > + /* Return TRUE if expected and actual values all match. */ > > + if (expected != actual) > > + { > > +#if DEBUG > > + printf ("ERROR: Expected long values don't match. \n"); > > + printf (" Expected 0x%x, actual 0x%x\n", > > + expected, actual); > > +#endif > > + return FALSE; > > + } > > + return TRUE; > > +} > > + > > +int check_expected_long_long (signed long long int expected, > > + signed long long int actual) > > +{ > > + /* Return TRUE if expected and actual values all match. */ > > + if (expected != actual) > > + { > > +#if DEBUG > > + printf ("ERROR: Expected long long values don't match. \n"); > > + printf (" Expected 0x%x, actual 0x%x\n", > > + expected, actual); > > +#endif > > + return FALSE; > > + } > > + return TRUE; > > +} > > + > > +void print_store_data (vector signed __int128 *store_data, int > > size) > > +{ > > +#if DEBUG > > + union conv_t val; > > + int i; > > + > > + for (i = 0; i < size; i++) > > + { > > + val.vsi128 = store_data[i]; > > + printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], > > val.ull[0]); > > + } > > +#endif > > +} > > + > > + > > +void print_raw_buffer (vector signed __int128 *rawbuffer, int > > size) > > +{ > > +#if DEBUG > > + union conv_t val; > > + int i; > > + > > + for (i = 0; i < size; i++) > > + { > > + val.vsi128 = rawbuffer[i]; > > + printf ("rawbuffer[%d] = 0x%llx %llx\n", i, val.ull[1], > > val.ull[0]); > > + } > > +#endif > > +} > > + > > +int > > +main () { > > + int i; > > + > > + vector signed __int128 rawbuffer[SIZE]; > > + signed char * vsbuffer_char = (signed char *)rawbuffer; > > + signed short int * vsbuffer_short = (signed short int > > *)rawbuffer; > > + signed int * vsbuffer_int = (signed int *)rawbuffer; > > + signed long int * vsbuffer_long = (signed long *)rawbuffer; > > + signed long long int * vsbuffer_long_long = (signed long long > > *)rawbuffer; > > + > > + for (i = 0; i < SIZE; i++) > > + rawbuffer[i] = zero_vsint128; > > + > > + print_store_data (store_data, SIZE); > > + > > + for (i = 0; i < SIZE; i++) > > + { > > + __builtin_vec_xst_trunc (store_data[i], i*sizeof(char), > > + vsbuffer_char); > > + check_expected_byte (signed_char_expected[i], > > vsbuffer_char[i]); > > + } > > + > > + for (i = 0; i < SIZE; i++) > > + { > > + __builtin_vec_xst_trunc (store_data[i], i*sizeof(short int), > > + vsbuffer_short); > > + check_expected_half (signed_short_expected[i], > > vsbuffer_short[i]); > > + } > > + > > + for (i = 0; i < SIZE; i++) > > + { > > + __builtin_vec_xst_trunc (store_data[i], i*sizeof(int), > > + vsbuffer_int); > > + check_expected_int (signed_int_expected[i], > > vsbuffer_int[i]); > > + } > > + > > + for (i = 0; i < SIZE; i++) > > + { > > + __builtin_vec_xst_trunc (store_data[i], i*sizeof(long int), > > + vsbuffer_long); > > + check_expected_long (signed_long_long_expected[i], > > + vsbuffer_long[i]); > > + } > > + > > + for (i = 0; i < SIZE; i++) > > + { > > + __builtin_vec_xst_trunc (store_data[i], i*sizeof(long long > > int), > > + vsbuffer_long_long); > > + check_expected_long_long (signed_long_long_expected[i], > > + vsbuffer_long_long[i]); > > + } > > + > > + print_raw_buffer (rawbuffer, SIZE); > > + return 0; > > +} > > + > > +/* { dg-final { scan-assembler {\mstxvrbx\M} } } */ > > +/* { dg-final { scan-assembler {\mstxvrhx\M} } } */ > > +/* { dg-final { scan-assembler {\mstxvrwx\M} } } */ > > +/* { dg-final { scan-assembler {\mstxvrdx\M} } } */ > > Could you check the exact times instead? OK, added check for the instruction counts. > > BR, > Kewen >