On Fri, 2023-07-21 at 13:04 +0800, Kewen.Lin wrote: > Hi Carl, > > on 2023/7/18 03:20, Carl Love wrote: > > GCC maintainers: > > > > Version 4, changed the new RS6000_OVLD_VEC_REPLACE_UN case > > statement > > rs6000/rs6000-c.cc. The existing REPLACE_ELT iterator name was > > changed > > to REPLACE_ELT_V along with the associated > > define_mode_attr. Renamed > > VEC_RU to REPLACE_ELT for the iterator name and VEC_RU_char to > > REPLACE_ELT_char. Fixed the double test in vec-replace-word- > > runnable_1.c to be consistent with the other tests. Removed the > > "dg-do > > link" from both tests. Put in an explicit cast in test vec- > > replace-word-runnable_2.c to eliminate the need for the -flax- > > vector-conversions dg-option. > > > > Version 3, added code to altivec_resolve_overloaded_builtin so the > > correct instruction is selected for the size of the second > > argument. > > This restores the instruction counts to the original values where > > the > > correct instructions were originally being generated. The naming > > of > > the overloaded builtin instances and builtin definitions were > > changed > > to reflect the type of the second argument since the type of the > > first > > argument is now the same for all overloaded instances. A new > > builtin > > test file was added for the case where the first argument is cast > > to > > the unsigned long long type. This test requires the -flax-vector- > > conversions gcc command line option. Since the other tests do not > > require this option, I felt that the new test needed to be in a > > separate file. Finally some formatting fixes were made in the > > original > > test file. Patch has been retested on Power 10 with no > > regressions. > > > > Version 2, fixed various typos. Updated the change log body to say > > the > > instruction counts were updated. The instruction counts changed as > > a > > result of changing the first argument of the vec_replace_unaligned > > builtin call from vector unsigned long long (vull) to vector > > unsigned > > char (vuc). When the first argument was vull the builtin call > > generated the vinsd instruction for the two test cases. The > > updated > > call with vuc as the first argument generates two vinsw > > instructions > > instead. Patch was retested on Power 10 with no regressions. > > > > The following patch fixes the first argument in the builtin > > definition > > and the corresponding test cases. Initially, the builtin > > specification > > was wrong due to a cut and past error. The documentation was fixed > > in: > > > > commit ed3fea09b18f67e757b5768b42cb6e816626f1db > > Author: Bill Schmidt <wschm...@linux.ibm.com> > > Date: Fri Feb 4 13:07:17 2022 -0600 > > > > rs6000: Correct function prototypes for > > vec_replace_unaligned > > > > Due to a pasto error in the documentation, > > vec_replace_unaligned was > > implemented with the same function prototypes as > > vec_replace_elt. > > It was intended that vec_replace_unaligned always specify > > output > > vectors as having type vector unsigned char, to emphasize > > that > > elements are potentially misaligned by this built-in > > function. > > This patch corrects the misimplementation. > > .... > > > > This patch fixes the arguments in the definitions and updates the > > testcases accordingly. Additionally, a few minor spacing issues > > are > > fixed. > > > > The patch has been tested on Power 10 with no regressions. Please > > let > > me know if the patch is acceptable for mainline. Thanks. > > > > Carl > > > > > > ---------------------------- > > rs6000, fix vec_replace_unaligned built-in arguments > > > > The first argument of the vec_replace_unaligned built-in should > > always be > > of type unsigned char, as specified in gcc/doc/extend.texi. > > Shouldn't be "vector unsigned char" instead of "unsigned char"? > > Or do I miss something?
Nope, I missed saying "vector". Fixed. > > > This patch fixes the builtin definitions and updates the test cases > > to use > > the correct arguments. The original test file is renamed and a > > second test > > file is added for a new test case. > > > > gcc/ChangeLog: > > * config/rs6000/rs6000-builtins.def: Rename > > __builtin_altivec_vreplace_un_uv2di as > > __builtin_altivec_vreplace_un_udi > > __builtin_altivec_vreplace_un_uv4si as > > __builtin_altivec_vreplace_un_usi > > __builtin_altivec_vreplace_un_v2df as > > __builtin_altivec_vreplace_un_df > > __builtin_altivec_vreplace_un_v2di as > > __builtin_altivec_vreplace_un_di > > __builtin_altivec_vreplace_un_v4sf as > > __builtin_altivec_vreplace_un_sf > > __builtin_altivec_vreplace_un_v4si as > > __builtin_altivec_vreplace_un_si. > > Rename VREPLACE_UN_UV2DI as VREPLACE_UN_UDI, VREPLACE_UN_UV4SI > > as > > VREPLACE_UN_USI, VREPLACE_UN_V2DF as VREPLACE_UN_DF, > > VREPLACE_UN_V2DI as VREPLACE_UN_DI, VREPLACE_UN_V4SF as > > VREPLACE_UN_SF, VREPLACE_UN_V4SI as VREPLACE_UN_SI. > > Rename vreplace_un_v2di as vreplace_un_di, vreplace_un_v4si as > > vreplace_un_si, vreplace_un_v2df as vreplace_un_df, > > vreplace_un_v2di as vreplace_un_di, vreplace_un_v4sf as > > vreplace_un_sf, vreplace_un_v4si as vreplace_un_si. > > * config/rs6000/rs6000-c.cc (find_instance): Add case > > RS6000_OVLD_VEC_REPLACE_UN. > > * config/rs6000/rs6000-overload.def (__builtin_vec_replace_un): > > Fix first argument type. Rename VREPLACE_UN_UV4SI as > > VREPLACE_UN_USI, VREPLACE_UN_V4SI as VREPLACE_UN_SI, > > VREPLACE_UN_UV2DI as VREPLACE_UN_UDI, VREPLACE_UN_V2DI as > > VREPLACE_UN_DI, VREPLACE_UN_V4SF as VREPLACE_UN_SF, > > VREPLACE_UN_V2DF as VREPLACE_UN_DF. > > * config/rs6000/vsx.md (REPLACE_ELT): Renamed the mode_iterator > > REPLACE_ELT_V. > > (REPLACE_ELT_char): Renamed REPLACE_ELT_V_char. > > (REPLACE_ELT_sh): Renamed REPLACE_ELT_V_sh. > > (REPLACE_ELT_max): Renamed REPLACE_ELT_V_max. > > (REPLACE_ELT): New mode iterator. > > (REPLACE_ELT_char): New mode attribute. > > (vreplace_un_<mode>): Change iterator and mode attribute. > > > > gcc/testsuite/ChangeLog: > > * gcc.target/powerpc/vec-replace-word-runnable.c: Renamed > > vec-replace-word-runnable_1.c. > > * gcc.target/powerpc/vec-replace-word-runnable_1.c > > (dg-options): add -flax-vector-conversions. > > (vec_replace_unaligned) Fix first argument type. > > (vresult_uchar): Fix expected results. > > (vec_replace_unaligned): Update for loop to check uchar > > results. > > Remove extra spaces in if statements. Insert missing spaces in > > for statements. > > * gcc.target/powerpc/vec-replace-word-runnable_2.c: New test > > file. > > --- > > gcc/config/rs6000/rs6000-builtins.def | 28 +-- > > gcc/config/rs6000/rs6000-c.cc | 23 +++ > > gcc/config/rs6000/rs6000-overload.def | 24 +-- > > gcc/config/rs6000/vsx.md | 52 ++--- > > ...nnable.c => vec-replace-word-runnable_1.c} | 177 ++++++++++-- > > ------ > > .../powerpc/vec-replace-word-runnable_2.c | 50 +++++ > > 6 files changed, 224 insertions(+), 130 deletions(-) > > rename gcc/testsuite/gcc.target/powerpc/{vec-replace-word- > > runnable.c => vec-replace-word-runnable_1.c} (51%) > > create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-replace- > > word-runnable_2.c > > > > diff --git a/gcc/config/rs6000/rs6000-builtins.def > > b/gcc/config/rs6000/rs6000-builtins.def > > index c0ef717161f..58c53cd462e 100644 > > --- a/gcc/config/rs6000/rs6000-builtins.def > > +++ b/gcc/config/rs6000/rs6000-builtins.def > > @@ -3397,26 +3397,26 @@ > > const vull __builtin_altivec_vpextd (vull, vull); > > VPEXTD vpextd {} > > > > - const vuc __builtin_altivec_vreplace_un_uv2di (vull, unsigned > > long long, \ > > - const int<4>); > > - VREPLACE_UN_UV2DI vreplace_un_v2di {} > > + const vuc __builtin_altivec_vreplace_un_udi (vuc, unsigned long > > long, \ > > + const int<4>); > > + VREPLACE_UN_UDI vreplace_un_di {} > > > > - const vuc __builtin_altivec_vreplace_un_uv4si (vui, unsigned > > int, \ > > - const int<4>); > > - VREPLACE_UN_UV4SI vreplace_un_v4si {} > > + const vuc __builtin_altivec_vreplace_un_usi (vuc, unsigned int, > > \ > > + const int<4>); > > + VREPLACE_UN_USI vreplace_un_si {} > > > > - const vuc __builtin_altivec_vreplace_un_v2df (vd, double, const > > int<4>); > > - VREPLACE_UN_V2DF vreplace_un_v2df {} > > + const vuc __builtin_altivec_vreplace_un_df (vuc, double, const > > int<4>); > > + VREPLACE_UN_DF vreplace_un_df {} > > > > - const vuc __builtin_altivec_vreplace_un_v2di (vsll, signed long > > long, \ > > + const vuc __builtin_altivec_vreplace_un_di (vuc, signed long > > long, \ > > const int<4>); > > - VREPLACE_UN_V2DI vreplace_un_v2di {} > > + VREPLACE_UN_DI vreplace_un_di {} > > > > - const vuc __builtin_altivec_vreplace_un_v4sf (vf, float, const > > int<4>); > > - VREPLACE_UN_V4SF vreplace_un_v4sf {} > > + const vuc __builtin_altivec_vreplace_un_sf (vuc, float, const > > int<4>); > > + VREPLACE_UN_SF vreplace_un_sf {} > > > > - const vuc __builtin_altivec_vreplace_un_v4si (vsi, signed int, > > const int<4>); > > - VREPLACE_UN_V4SI vreplace_un_v4si {} > > + const vuc __builtin_altivec_vreplace_un_si (vuc, signed int, > > const int<4>); > > + VREPLACE_UN_SI vreplace_un_si {} > > > > const vull __builtin_altivec_vreplace_uv2di (vull, unsigned long > > long, \ > > const int<1>); > > diff --git a/gcc/config/rs6000/rs6000-c.cc > > b/gcc/config/rs6000/rs6000-c.cc > > index 350987b851b..63ab360d352 100644 > > --- a/gcc/config/rs6000/rs6000-c.cc > > +++ b/gcc/config/rs6000/rs6000-c.cc > > @@ -1968,6 +1968,29 @@ altivec_resolve_overloaded_builtin > > (location_t loc, tree fndecl, > > instance_code = RS6000_BIF_VSIEDP; > > } > > > > + tree call = find_instance (&unsupported_builtin, &instance, > > + instance_code, fcode, types, args, > > nargs); > > + if (call != error_mark_node) > > + return call; > > + break; > > + } > > + case RS6000_OVLD_VEC_REPLACE_UN: > > + { > > + machine_mode arg2_mode = TYPE_MODE (types[1]); > > + > > + if (arg2_mode == SImode) > > + /* Signed and unsigned are handled the same. */ > > + instance_code = RS6000_BIF_VREPLACE_UN_USI; > > + else if (arg2_mode == SFmode) > > + instance_code = RS6000_BIF_VREPLACE_UN_SF; > > + else if (arg2_mode == DImode) > > + /* Signed and unsigned are handled the same. */ > > + instance_code = RS6000_BIF_VREPLACE_UN_UDI; > > + else if (arg2_mode == DFmode) > > + instance_code = RS6000_BIF_VREPLACE_UN_DF; > > + else > > + break; > > + > > tree call = find_instance (&unsupported_builtin, &instance, > > instance_code, fcode, types, args, > > nargs); > > if (call != error_mark_node) > > diff --git a/gcc/config/rs6000/rs6000-overload.def > > b/gcc/config/rs6000/rs6000-overload.def > > index 470d718efde..b83946f5ad8 100644 > > --- a/gcc/config/rs6000/rs6000-overload.def > > +++ b/gcc/config/rs6000/rs6000-overload.def > > @@ -3059,18 +3059,18 @@ > > VREPLACE_ELT_V2DF > > > > [VEC_REPLACE_UN, vec_replace_unaligned, __builtin_vec_replace_un] > > - vuc __builtin_vec_replace_un (vui, unsigned int, const int); > > - VREPLACE_UN_UV4SI > > - vuc __builtin_vec_replace_un (vsi, signed int, const int); > > - VREPLACE_UN_V4SI > > - vuc __builtin_vec_replace_un (vull, unsigned long long, const > > int); > > - VREPLACE_UN_UV2DI > > - vuc __builtin_vec_replace_un (vsll, signed long long, const > > int); > > - VREPLACE_UN_V2DI > > - vuc __builtin_vec_replace_un (vf, float, const int); > > - VREPLACE_UN_V4SF > > - vuc __builtin_vec_replace_un (vd, double, const int); > > - VREPLACE_UN_V2DF > > + vuc __builtin_vec_replace_un (vuc, unsigned int, const int); > > + VREPLACE_UN_USI > > + vuc __builtin_vec_replace_un (vuc, signed int, const int); > > + VREPLACE_UN_SI > > + vuc __builtin_vec_replace_un (vuc, unsigned long long, const > > int); > > + VREPLACE_UN_UDI > > + vuc __builtin_vec_replace_un (vuc, signed long long, const int); > > + VREPLACE_UN_DI > > + vuc __builtin_vec_replace_un (vuc, float, const int); > > + VREPLACE_UN_SF > > + vuc __builtin_vec_replace_un (vuc, double, const int); > > + VREPLACE_UN_DF > > > > [VEC_REVB, vec_revb, __builtin_vec_revb] > > vss __builtin_vec_revb (vss); > > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md > > index 0c269e4e8d9..d76f9f4864e 100644 > > --- a/gcc/config/rs6000/vsx.md > > +++ b/gcc/config/rs6000/vsx.md > > @@ -381,13 +381,20 @@ (define_int_attr > > xvcvbf16 [(UNSPEC_VSX_XVCVSPBF16 "xvcvspbf16") > > (define_mode_iterator VI2 [V4SI V8HI V16QI V2DI]) > > > > ;; Vector extract_elt iterator/attr for 32-bit and 64-bit elements Missed this earlier, I think the comment needs fixing as well since we now have both vector and scalar now. Changed to: ;; Vector and scalar extract_elt iterator/attr for 32-bit and 64-bit elements > > -(define_mode_iterator REPLACE_ELT [V4SI V4SF V2DI V2DF]) > > -(define_mode_attr REPLACE_ELT_char [(V4SI "w") (V4SF "w") > > - (V2DI "d") (V2DF "d")]) > > -(define_mode_attr REPLACE_ELT_sh [(V4SI "2") (V4SF "2") > > - (V2DI "3") (V2DF "3")]) > > -(define_mode_attr REPLACE_ELT_max [(V4SI "12") (V4SF "12") > > - (V2DI "8") (V2DF "8")]) > > +(define_mode_iterator REPLACE_ELT_V [V4SI V4SF V2DI V2DF]) > > +(define_mode_attr REPLACE_ELT_V_char [(V4SI "w") (V4SF "w") > > + (V2DI "d") (V2DF "d")]) > > +(define_mode_attr REPLACE_ELT_V_sh [(V4SI "2") (V4SF "2") > > + (V2DI "3") (V2DF "3")]) > > +(define_mode_attr REPLACE_ELT_V_max [(V4SI "12") (V4SF "12") > > + (V2DI "8") (V2DF "8")]) > > + > > +;; vector replace unaligned modes > > +(define_mode_iterator REPLACE_ELT [SI SF DI DF]) > > +(define_mode_attr REPLACE_ELT_char [(SI "w") > > + (SF "w") > > + (DI "d") > > + (DF "d")]) > > Sorry if I didn't describe it clearly, I actually adviced that: I think maybe it got lost in the first attempt to change this which didn't work. Fixed it. > > (define_mode_iterator REPLACE_ELT_V [V4SI V4SF V2DI V2DF]) > (define_mode_iterator REPLACE_ELT [SI SF DI DF]) > (define_mode_attr REPLACE_ELT_char [(V4SI "w") (V4SF "w") > (V2DI "d") (V2DF "d") > (SI "w") (SF "w") > (DI "d") (DF "d")]) > Having a single mode_attr for scalar and vector is a bit cleaner. > It leaves REPLACE_ELT_sh and REPLACE_ELT_max unchanged, otherwise > we have to update the preparation statements like what you did in > this patch, IMHO both aligned and unaligned versions are replacing > things with the element size, it's not quite meaningful to further > distinguish the mode attributes. OK, so I reverted REPLACE_ELT_V_sh back to REPLACE_ELT_sh and REPLACE_ELT_V_max back to REPLACE_ELT_max. Updated ChangeLog. > > > > > ;; Like VM2 in altivec.md, just do char, short, int, long, float > > and double > > (define_mode_iterator VM3 [V4SI > > @@ -4183,21 +4190,22 @@ (define_insn "vinsertgr_internal_<mode>" > > [(set_attr "type" "vecsimple")]) > > > > (define_expand "vreplace_elt_<mode>" > > - [(set (match_operand:REPLACE_ELT 0 "register_operand") > > - (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1 > > "register_operand") > > - (match_operand:<VEC_base> 2 "register_operand") > > - (match_operand:QI 3 "const_0_to_3_operand")] > > - UNSPEC_REPLACE_ELT))] > > + [(set (match_operand:REPLACE_ELT_V 0 "register_operand") > > + (unspec:REPLACE_ELT_V [(match_operand:REPLACE_ELT_V 1 > > "register_operand") > > + (match_operand:<VEC_base> 2 > > "register_operand") > > + (match_operand:QI 3 "const_0_to_3_operand")] > > + UNSPEC_REPLACE_ELT))] > > "TARGET_POWER10" > > { > > int index; > > /* Immediate value is the word index, convert to byte index and > > adjust for > > Endianness if needed. */ > > if (BYTES_BIG_ENDIAN) > > - index = INTVAL (operands[3]) << <REPLACE_ELT_sh>; > > + index = INTVAL (operands[3]) << <REPLACE_ELT_V_sh>; > > > > else > > - index = <REPLACE_ELT_max> - (INTVAL (operands[3]) << > > <REPLACE_ELT_sh>); > > + index = <REPLACE_ELT_V_max> > > + - (INTVAL (operands[3]) << <REPLACE_ELT_V_sh>); > > > > emit_insn (gen_vreplace_elt_<mode>_inst (operands[0], > > operands[1], > > operands[2], > > @@ -4207,19 +4215,19 @@ (define_expand "vreplace_elt_<mode>" > > [(set_attr "type" "vecsimple")]) > > > > (define_insn "vreplace_elt_<mode>_inst" > > - [(set (match_operand:REPLACE_ELT 0 "register_operand" "=v") > > - (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1 > > "register_operand" "0") > > - (match_operand:<VEC_base> 2 "register_operand" > > "r") > > - (match_operand:QI 3 "const_0_to_12_operand" > > "n")] > > - UNSPEC_REPLACE_ELT))] > > + [(set (match_operand:REPLACE_ELT_V 0 "register_operand" "=v") > > + (unspec:REPLACE_ELT_V [(match_operand:REPLACE_ELT_V 1 > > "register_operand" "0") > > + (match_operand:<VEC_base> 2 "register_operand" > > "r") > > + (match_operand:QI 3 "const_0_to_12_operand" > > "n")] > > + UNSPEC_REPLACE_ELT))] > > "TARGET_POWER10" > > - "vins<REPLACE_ELT_char> %0,%2,%3" > > + "vins<REPLACE_ELT_V_char> %0,%2,%3" > > [(set_attr "type" "vecsimple")]) > > > > (define_insn "vreplace_un_<mode>" > > [(set (match_operand:V16QI 0 "register_operand" "=v") > > - (unspec:V16QI [(match_operand:REPLACE_ELT 1 "register_operand" > > "0") > > - (match_operand:<VEC_base> 2 "register_operand" > > "r") > > + (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "0") > > + (match_operand:REPLACE_ELT 2 "register_operand" "r") > > (match_operand:QI 3 "const_0_to_12_operand" "n")] > > UNSPEC_REPLACE_UN))] > > "TARGET_POWER10" > > diff --git a/gcc/testsuite/gcc.target/powerpc/vec-replace-word- > > runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-replace-word- > > runnable_1.c > > similarity index 51% > > rename from gcc/testsuite/gcc.target/powerpc/vec-replace-word- > > runnable.c > > rename to gcc/testsuite/gcc.target/powerpc/vec-replace-word- > > runnable_1.c > > index 27318822871..7e7485ca371 100644 > > --- a/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c > > +++ b/gcc/testsuite/gcc.target/powerpc/vec-replace-word- > > runnable_1.c > > @@ -1,5 +1,4 @@ > > /* { dg-do run { target { power10_hw } } } */ > > -/* { dg-do link { target { ! power10_hw } } } */ > > As your explanation that this "dg-do link" has existed for a long > time, > I'd a check and found it's from r11-4397-g8d8fef197114a9 "[RS6000] > Link > power10 testcases", I think it wanted to test assembler and linker > when > there are few actual P10 hw (imagining at the early stage of P10 > support). > > Since there are a few test cases like this, I'm inclined to just > leave it OK, put it back in. Carl