On Fri, 2023-07-21 at 13:04 +0800, Kewen.Lin wrote:
> Hi Carl,
> 
> on 2023/7/18 03:20, Carl Love wrote:
> > GCC maintainers:
> > 
> > Version 4, changed the new RS6000_OVLD_VEC_REPLACE_UN case
> > statement
> > rs6000/rs6000-c.cc.  The existing REPLACE_ELT iterator name was
> > changed
> > to REPLACE_ELT_V along with the associated
> > define_mode_attr.  Renamed
> > VEC_RU to REPLACE_ELT for the iterator name and VEC_RU_char to
> > REPLACE_ELT_char.  Fixed the double test in vec-replace-word-
> > runnable_1.c to be consistent with the other tests.  Removed the
> > "dg-do 
> > link" from both tests.  Put in an explicit cast in test vec-
> > replace-word-runnable_2.c to eliminate the need for the -flax-
> > vector-conversions dg-option.
> > 
> > Version 3, added code to altivec_resolve_overloaded_builtin so the
> > correct instruction is selected for the size of the second
> > argument. 
> > This restores the instruction counts to the original values where
> > the
> > correct instructions were originally being generated.  The naming
> > of
> > the overloaded builtin instances and builtin definitions were
> > changed
> > to reflect the type of the second argument since the type of the
> > first
> > argument is now the same for all overloaded instances.  A new
> > builtin
> > test file was added for the case where the first argument is cast
> > to
> > the unsigned long long type.  This test requires the -flax-vector-
> > conversions gcc command line option.  Since the other tests do not
> > require this option, I felt that the new test needed to be in a
> > separate file.  Finally some formatting fixes were made in the
> > original
> > test file.  Patch has been retested on Power 10 with no
> > regressions.
> > 
> > Version 2, fixed various typos.  Updated the change log body to say
> > the
> > instruction counts were updated.  The instruction counts changed as
> > a
> > result of changing the first argument of the vec_replace_unaligned
> > builtin call from vector unsigned long long (vull) to vector
> > unsigned
> > char (vuc).  When the first argument was vull the builtin call
> > generated the vinsd instruction for the two test cases.  The
> > updated
> > call with vuc as the first argument generates two vinsw
> > instructions
> > instead.  Patch was retested on Power 10 with no regressions.
> > 
> > The following patch fixes the first argument in the builtin
> > definition
> > and the corresponding test cases.  Initially, the builtin
> > specification
> > was wrong due to a cut and past error.  The documentation was fixed
> > in:
> > 
> >    commit ed3fea09b18f67e757b5768b42cb6e816626f1db
> >    Author: Bill Schmidt <wschm...@linux.ibm.com>
> >    Date:   Fri Feb 4 13:07:17 2022 -0600
> > 
> >        rs6000: Correct function prototypes for
> > vec_replace_unaligned
> > 
> >        Due to a pasto error in the documentation,
> > vec_replace_unaligned was
> >        implemented with the same function prototypes as
> > vec_replace_elt.  
> >        It was intended that vec_replace_unaligned always specify
> > output
> >        vectors as having type vector unsigned char, to emphasize
> > that 
> >        elements are potentially misaligned by this built-in
> > function.  
> >        This patch corrects the misimplementation.
> >     ....
> > 
> > This patch fixes the arguments in the definitions and updates the
> > testcases accordingly.  Additionally, a few minor spacing issues
> > are
> > fixed.
> > 
> > The patch has been tested on Power 10 with no regressions.  Please
> > let
> > me know if the patch is acceptable for mainline.  Thanks.
> > 
> >                  Carl 
> > 
> > 
> > ----------------------------
> > rs6000, fix vec_replace_unaligned built-in arguments
> > 
> > The first argument of the vec_replace_unaligned built-in should
> > always be
> > of type unsigned char, as specified in gcc/doc/extend.texi.
> 
> Shouldn't be "vector unsigned char" instead of "unsigned char"?
> 
> Or do I miss something?

Nope, I missed saying "vector".  Fixed.

> 
> > This patch fixes the builtin definitions and updates the test cases
> > to use
> > the correct arguments.  The original test file is renamed and a
> > second test
> > file is added for a new test case.
> > 
> > gcc/ChangeLog:
> >     * config/rs6000/rs6000-builtins.def: Rename
> >     __builtin_altivec_vreplace_un_uv2di as
> > __builtin_altivec_vreplace_un_udi
> >     __builtin_altivec_vreplace_un_uv4si as
> > __builtin_altivec_vreplace_un_usi
> >     __builtin_altivec_vreplace_un_v2df as
> > __builtin_altivec_vreplace_un_df
> >     __builtin_altivec_vreplace_un_v2di as
> > __builtin_altivec_vreplace_un_di
> >     __builtin_altivec_vreplace_un_v4sf as
> > __builtin_altivec_vreplace_un_sf
> >     __builtin_altivec_vreplace_un_v4si as
> > __builtin_altivec_vreplace_un_si.
> >     Rename VREPLACE_UN_UV2DI as VREPLACE_UN_UDI, VREPLACE_UN_UV4SI
> > as
> >     VREPLACE_UN_USI, VREPLACE_UN_V2DF as VREPLACE_UN_DF,
> >     VREPLACE_UN_V2DI as VREPLACE_UN_DI, VREPLACE_UN_V4SF as
> >     VREPLACE_UN_SF, VREPLACE_UN_V4SI as VREPLACE_UN_SI.
> >     Rename vreplace_un_v2di as vreplace_un_di, vreplace_un_v4si as
> >     vreplace_un_si, vreplace_un_v2df as vreplace_un_df,
> >     vreplace_un_v2di as vreplace_un_di, vreplace_un_v4sf as
> >     vreplace_un_sf, vreplace_un_v4si as vreplace_un_si.
> >     * config/rs6000/rs6000-c.cc (find_instance): Add case
> >     RS6000_OVLD_VEC_REPLACE_UN.
> >     * config/rs6000/rs6000-overload.def (__builtin_vec_replace_un):
> >     Fix first argument type.  Rename VREPLACE_UN_UV4SI as
> >     VREPLACE_UN_USI, VREPLACE_UN_V4SI as VREPLACE_UN_SI,
> >     VREPLACE_UN_UV2DI as VREPLACE_UN_UDI, VREPLACE_UN_V2DI as
> >     VREPLACE_UN_DI, VREPLACE_UN_V4SF as VREPLACE_UN_SF,
> >     VREPLACE_UN_V2DF as VREPLACE_UN_DF.
> >     * config/rs6000/vsx.md (REPLACE_ELT): Renamed the mode_iterator
> >     REPLACE_ELT_V.
> >     (REPLACE_ELT_char): Renamed REPLACE_ELT_V_char.
> >     (REPLACE_ELT_sh): Renamed REPLACE_ELT_V_sh.
> >     (REPLACE_ELT_max): Renamed REPLACE_ELT_V_max.
> >     (REPLACE_ELT): New mode iterator.
> >     (REPLACE_ELT_char): New mode attribute.
> >     (vreplace_un_<mode>): Change iterator and mode attribute.
> > 
> > gcc/testsuite/ChangeLog:
> >     * gcc.target/powerpc/vec-replace-word-runnable.c: Renamed
> >     vec-replace-word-runnable_1.c.
> >     * gcc.target/powerpc/vec-replace-word-runnable_1.c
> >     (dg-options): add -flax-vector-conversions.
> >     (vec_replace_unaligned) Fix first argument type.
> >     (vresult_uchar): Fix expected results.
> >     (vec_replace_unaligned): Update for loop to check uchar
> > results.
> >     Remove extra spaces in if statements. Insert missing spaces in
> >     for statements.
> >     * gcc.target/powerpc/vec-replace-word-runnable_2.c: New test
> > file.
> > ---
> >  gcc/config/rs6000/rs6000-builtins.def         |  28 +--
> >  gcc/config/rs6000/rs6000-c.cc                 |  23 +++
> >  gcc/config/rs6000/rs6000-overload.def         |  24 +--
> >  gcc/config/rs6000/vsx.md                      |  52 ++---
> >  ...nnable.c => vec-replace-word-runnable_1.c} | 177 ++++++++++--
> > ------
> >  .../powerpc/vec-replace-word-runnable_2.c     |  50 +++++
> >  6 files changed, 224 insertions(+), 130 deletions(-)
> >  rename gcc/testsuite/gcc.target/powerpc/{vec-replace-word-
> > runnable.c => vec-replace-word-runnable_1.c} (51%)
> >  create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-replace-
> > word-runnable_2.c
> > 
> > diff --git a/gcc/config/rs6000/rs6000-builtins.def
> > b/gcc/config/rs6000/rs6000-builtins.def
> > index c0ef717161f..58c53cd462e 100644
> > --- a/gcc/config/rs6000/rs6000-builtins.def
> > +++ b/gcc/config/rs6000/rs6000-builtins.def
> > @@ -3397,26 +3397,26 @@
> >    const vull __builtin_altivec_vpextd (vull, vull);
> >      VPEXTD vpextd {}
> >  
> > -  const vuc __builtin_altivec_vreplace_un_uv2di (vull, unsigned
> > long long, \
> > -                                                 const int<4>);
> > -    VREPLACE_UN_UV2DI vreplace_un_v2di {}
> > +  const vuc __builtin_altivec_vreplace_un_udi (vuc, unsigned long
> > long, \
> > +                                               const int<4>);
> > +    VREPLACE_UN_UDI vreplace_un_di {}
> >  
> > -  const vuc __builtin_altivec_vreplace_un_uv4si (vui, unsigned
> > int, \
> > -                                                 const int<4>);
> > -    VREPLACE_UN_UV4SI vreplace_un_v4si {}
> > +  const vuc __builtin_altivec_vreplace_un_usi (vuc, unsigned int,
> > \
> > +                                               const int<4>);
> > +    VREPLACE_UN_USI vreplace_un_si {}
> >  
> > -  const vuc __builtin_altivec_vreplace_un_v2df (vd, double, const
> > int<4>);
> > -    VREPLACE_UN_V2DF vreplace_un_v2df {}
> > +  const vuc __builtin_altivec_vreplace_un_df (vuc, double, const
> > int<4>);
> > +    VREPLACE_UN_DF vreplace_un_df {}
> >  
> > -  const vuc __builtin_altivec_vreplace_un_v2di (vsll, signed long
> > long, \
> > +  const vuc __builtin_altivec_vreplace_un_di (vuc, signed long
> > long, \
> >                                                  const int<4>);
> > -    VREPLACE_UN_V2DI vreplace_un_v2di {}
> > +    VREPLACE_UN_DI vreplace_un_di {}
> >  
> > -  const vuc __builtin_altivec_vreplace_un_v4sf (vf, float, const
> > int<4>);
> > -    VREPLACE_UN_V4SF vreplace_un_v4sf {}
> > +  const vuc __builtin_altivec_vreplace_un_sf (vuc, float, const
> > int<4>);
> > +    VREPLACE_UN_SF vreplace_un_sf {}
> >  
> > -  const vuc __builtin_altivec_vreplace_un_v4si (vsi, signed int,
> > const int<4>);
> > -    VREPLACE_UN_V4SI vreplace_un_v4si {}
> > +  const vuc __builtin_altivec_vreplace_un_si (vuc, signed int,
> > const int<4>);
> > +    VREPLACE_UN_SI vreplace_un_si {}
> >  
> >    const vull __builtin_altivec_vreplace_uv2di (vull, unsigned long
> > long, \
> >                                                 const int<1>);
> > diff --git a/gcc/config/rs6000/rs6000-c.cc
> > b/gcc/config/rs6000/rs6000-c.cc
> > index 350987b851b..63ab360d352 100644
> > --- a/gcc/config/rs6000/rs6000-c.cc
> > +++ b/gcc/config/rs6000/rs6000-c.cc
> > @@ -1968,6 +1968,29 @@ altivec_resolve_overloaded_builtin
> > (location_t loc, tree fndecl,
> >           instance_code = RS6000_BIF_VSIEDP;
> >       }
> >  
> > +   tree call = find_instance (&unsupported_builtin, &instance,
> > +                              instance_code, fcode, types, args,
> > nargs);
> > +   if (call != error_mark_node)
> > +     return call;
> > +   break;
> > +      }
> > +    case RS6000_OVLD_VEC_REPLACE_UN:
> > +      {
> > +   machine_mode arg2_mode = TYPE_MODE (types[1]);
> > +
> > +   if (arg2_mode == SImode)
> > +     /* Signed and unsigned are handled the same.  */
> > +     instance_code = RS6000_BIF_VREPLACE_UN_USI;
> > +   else if (arg2_mode == SFmode)
> > +     instance_code = RS6000_BIF_VREPLACE_UN_SF;
> > +   else if (arg2_mode == DImode)
> > +     /* Signed and unsigned are handled the same.  */
> > +     instance_code = RS6000_BIF_VREPLACE_UN_UDI;
> > +   else if (arg2_mode == DFmode)
> > +     instance_code = RS6000_BIF_VREPLACE_UN_DF;
> > +   else
> > +     break;
> > +
> >     tree call = find_instance (&unsupported_builtin, &instance,
> >                                instance_code, fcode, types, args,
> > nargs);
> >     if (call != error_mark_node)
> > diff --git a/gcc/config/rs6000/rs6000-overload.def
> > b/gcc/config/rs6000/rs6000-overload.def
> > index 470d718efde..b83946f5ad8 100644
> > --- a/gcc/config/rs6000/rs6000-overload.def
> > +++ b/gcc/config/rs6000/rs6000-overload.def
> > @@ -3059,18 +3059,18 @@
> >      VREPLACE_ELT_V2DF
> >  
> >  [VEC_REPLACE_UN, vec_replace_unaligned, __builtin_vec_replace_un]
> > -  vuc __builtin_vec_replace_un (vui, unsigned int, const int);
> > -    VREPLACE_UN_UV4SI
> > -  vuc __builtin_vec_replace_un (vsi, signed int, const int);
> > -    VREPLACE_UN_V4SI
> > -  vuc __builtin_vec_replace_un (vull, unsigned long long, const
> > int);
> > -    VREPLACE_UN_UV2DI
> > -  vuc __builtin_vec_replace_un (vsll, signed long long, const
> > int);
> > -    VREPLACE_UN_V2DI
> > -  vuc __builtin_vec_replace_un (vf, float, const int);
> > -    VREPLACE_UN_V4SF
> > -  vuc __builtin_vec_replace_un (vd, double, const int);
> > -    VREPLACE_UN_V2DF
> > +  vuc __builtin_vec_replace_un (vuc, unsigned int, const int);
> > +    VREPLACE_UN_USI
> > +  vuc __builtin_vec_replace_un (vuc, signed int, const int);
> > +    VREPLACE_UN_SI
> > +  vuc __builtin_vec_replace_un (vuc, unsigned long long, const
> > int);
> > +    VREPLACE_UN_UDI
> > +  vuc __builtin_vec_replace_un (vuc, signed long long, const int);
> > +    VREPLACE_UN_DI
> > +  vuc __builtin_vec_replace_un (vuc, float, const int);
> > +    VREPLACE_UN_SF
> > +  vuc __builtin_vec_replace_un (vuc, double, const int);
> > +    VREPLACE_UN_DF
> >  
> >  [VEC_REVB, vec_revb, __builtin_vec_revb]
> >    vss __builtin_vec_revb (vss);
> > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> > index 0c269e4e8d9..d76f9f4864e 100644
> > --- a/gcc/config/rs6000/vsx.md
> > +++ b/gcc/config/rs6000/vsx.md
> > @@ -381,13 +381,20 @@ (define_int_attr
> > xvcvbf16       [(UNSPEC_VSX_XVCVSPBF16 "xvcvspbf16")
> >  (define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])
> >  
> >  ;; Vector extract_elt iterator/attr for 32-bit and 64-bit elements

Missed this earlier, I think the comment needs fixing as well since we
now have both vector and scalar now.    Changed to:

;; Vector and scalar extract_elt iterator/attr for 32-bit and 64-bit
elements

> > -(define_mode_iterator REPLACE_ELT [V4SI V4SF V2DI V2DF])
> > -(define_mode_attr REPLACE_ELT_char [(V4SI "w") (V4SF "w")
> > -                               (V2DI  "d") (V2DF "d")])
> > -(define_mode_attr REPLACE_ELT_sh [(V4SI "2") (V4SF "2")
> > -                             (V2DI  "3") (V2DF "3")])
> > -(define_mode_attr REPLACE_ELT_max [(V4SI "12") (V4SF "12")
> > -                              (V2DI  "8") (V2DF "8")])
> > +(define_mode_iterator REPLACE_ELT_V [V4SI V4SF V2DI V2DF])
> > +(define_mode_attr REPLACE_ELT_V_char [(V4SI "w") (V4SF "w")
> > +                                 (V2DI  "d") (V2DF "d")])
> > +(define_mode_attr REPLACE_ELT_V_sh [(V4SI "2") (V4SF "2")
> > +                               (V2DI  "3") (V2DF "3")])
> > +(define_mode_attr REPLACE_ELT_V_max [(V4SI "12") (V4SF "12")
> > +                                (V2DI  "8") (V2DF "8")])
> > +
> > +;; vector replace unaligned modes
> > +(define_mode_iterator REPLACE_ELT [SI SF DI DF])
> > +(define_mode_attr REPLACE_ELT_char [(SI "w")
> > +                               (SF "w")
> > +                               (DI "d")
> > +                               (DF "d")])
> 
> Sorry if I didn't describe it clearly, I actually adviced that:

I think maybe it got lost in the first attempt to change this which
didn't work.  Fixed it.
> 
> (define_mode_iterator REPLACE_ELT_V [V4SI V4SF V2DI V2DF])
> (define_mode_iterator REPLACE_ELT [SI SF DI DF])
> (define_mode_attr REPLACE_ELT_char [(V4SI "w") (V4SF "w")
>                                   (V2DI  "d") (V2DF "d")
>                                   (SI "w") (SF "w")
>                                   (DI  "d") (DF "d")])
> 
Having a single mode_attr for scalar and vector is a bit cleaner.


> It leaves REPLACE_ELT_sh and REPLACE_ELT_max unchanged, otherwise
> we have to update the preparation statements like what you did in
> this patch, IMHO both aligned and unaligned versions are replacing
> things with the element size, it's not quite meaningful to further
> distinguish the mode attributes.

OK, so I reverted REPLACE_ELT_V_sh back to REPLACE_ELT_sh and
REPLACE_ELT_V_max back to REPLACE_ELT_max.  Updated ChangeLog.

> 
> >  
> >  ;; Like VM2 in altivec.md, just do char, short, int, long, float
> > and double
> >  (define_mode_iterator VM3 [V4SI
> > @@ -4183,21 +4190,22 @@ (define_insn "vinsertgr_internal_<mode>"
> >   [(set_attr "type" "vecsimple")])
> >  
> >  (define_expand "vreplace_elt_<mode>"
> > -  [(set (match_operand:REPLACE_ELT 0 "register_operand")
> > -  (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1
> > "register_operand")
> > -                  (match_operand:<VEC_base> 2 "register_operand")
> > -                  (match_operand:QI 3 "const_0_to_3_operand")]
> > -                 UNSPEC_REPLACE_ELT))]
> > +  [(set (match_operand:REPLACE_ELT_V 0 "register_operand")
> > +  (unspec:REPLACE_ELT_V [(match_operand:REPLACE_ELT_V 1
> > "register_operand")
> > +                    (match_operand:<VEC_base> 2
> > "register_operand")
> > +                    (match_operand:QI 3 "const_0_to_3_operand")]
> > +                   UNSPEC_REPLACE_ELT))]
> >   "TARGET_POWER10"
> >  {
> >     int index;
> >     /* Immediate value is the word index, convert to byte index and
> > adjust for
> >        Endianness if needed.  */
> >     if (BYTES_BIG_ENDIAN)
> > -     index = INTVAL (operands[3]) << <REPLACE_ELT_sh>;
> > +     index = INTVAL (operands[3]) << <REPLACE_ELT_V_sh>;
> >  
> >     else
> > -     index = <REPLACE_ELT_max> - (INTVAL (operands[3]) <<
> > <REPLACE_ELT_sh>);
> > +     index = <REPLACE_ELT_V_max>
> > +        - (INTVAL (operands[3]) << <REPLACE_ELT_V_sh>);
> >  
> >     emit_insn (gen_vreplace_elt_<mode>_inst (operands[0],
> > operands[1],
> >                                         operands[2],
> > @@ -4207,19 +4215,19 @@ (define_expand "vreplace_elt_<mode>"
> >  [(set_attr "type" "vecsimple")])
> >  
> >  (define_insn "vreplace_elt_<mode>_inst"
> > - [(set (match_operand:REPLACE_ELT 0 "register_operand" "=v")
> > -  (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1
> > "register_operand" "0")
> > -                  (match_operand:<VEC_base> 2 "register_operand"
> > "r")
> > -                  (match_operand:QI 3 "const_0_to_12_operand"
> > "n")]
> > -                 UNSPEC_REPLACE_ELT))]
> > + [(set (match_operand:REPLACE_ELT_V 0 "register_operand" "=v")
> > +  (unspec:REPLACE_ELT_V [(match_operand:REPLACE_ELT_V 1
> > "register_operand" "0")
> > +                    (match_operand:<VEC_base> 2 "register_operand"
> > "r")
> > +                    (match_operand:QI 3 "const_0_to_12_operand"
> > "n")]
> > +                   UNSPEC_REPLACE_ELT))]
> >   "TARGET_POWER10"
> > - "vins<REPLACE_ELT_char> %0,%2,%3"
> > + "vins<REPLACE_ELT_V_char> %0,%2,%3"
> >   [(set_attr "type" "vecsimple")])
> >  
> >  (define_insn "vreplace_un_<mode>"
> >   [(set (match_operand:V16QI 0 "register_operand" "=v")
> > -  (unspec:V16QI [(match_operand:REPLACE_ELT 1 "register_operand"
> > "0")
> > -                 (match_operand:<VEC_base> 2 "register_operand"
> > "r")
> > +  (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "0")
> > +            (match_operand:REPLACE_ELT 2 "register_operand" "r")
> >              (match_operand:QI 3 "const_0_to_12_operand" "n")]
> >             UNSPEC_REPLACE_UN))]
> >   "TARGET_POWER10"
> > diff --git a/gcc/testsuite/gcc.target/powerpc/vec-replace-word-
> > runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-
> > runnable_1.c
> > similarity index 51%
> > rename from gcc/testsuite/gcc.target/powerpc/vec-replace-word-
> > runnable.c
> > rename to gcc/testsuite/gcc.target/powerpc/vec-replace-word-
> > runnable_1.c
> > index 27318822871..7e7485ca371 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-
> > runnable_1.c
> > @@ -1,5 +1,4 @@
> >  /* { dg-do run { target { power10_hw } } } */
> > -/* { dg-do link { target { ! power10_hw } } } */
> 
> As your explanation that this "dg-do link" has existed for a long
> time,
> I'd a check and found it's from r11-4397-g8d8fef197114a9 "[RS6000]
> Link
> power10 testcases", I think it wanted to test assembler and linker
> when
> there are few actual P10 hw (imagining at the early stage of P10
> support).
> 
> Since there are a few test cases like this, I'm inclined to just
> leave it

OK, put it back in.

                       Carl 


Reply via email to