On Fri, Sep 17, 2021 at 08:35:57AM +0200, Uros Bizjak via Gcc-patches wrote:
> > > On Wed, Sep 15, 2021 at 10:10 AM <lili....@intel.com> wrote:
> > > >
> > > > From: "H.J. Lu" <hjl.to...@gmail.com>
> > > >
> > > > Check TARGET_USE_VECTOR_FP_CONVERTS or
> > > TARGET_USE_VECTOR_CONVERTS when
> > > > handling avx_partial_xmm_update attribute.  Don't convert AVX partial
> > > > XMM register update if vector packed SSE conversion should be used.
> > > >
> > > > gcc/
> > > >
> > > >         PR target/101900
> > > >         * config/i386/i386-features.c (remove_partial_avx_dependency):
> > > >         Check TARGET_USE_VECTOR_FP_CONVERTS and
> > > TARGET_USE_VECTOR_CONVERTS
> > > >         before generating vxorps.
> > > >
> > > > gcc/
> > > >
> > > >         PR target/101900
> > > >         * testsuite/gcc.target/i386/pr101900-1.c: New test.
> > > >         * testsuite/gcc.target/i386/pr101900-2.c: Likewise.
> > > >         * testsuite/gcc.target/i386/pr101900-3.c: Likewise.
> > > > ---
> > > >  gcc/config/i386/i386-features.c            | 21 ++++++++++++++++++---
> > > >  gcc/testsuite/gcc.target/i386/pr101900-1.c | 18 ++++++++++++++++++
> > > > gcc/testsuite/gcc.target/i386/pr101900-2.c | 18 ++++++++++++++++++
> > > > gcc/testsuite/gcc.target/i386/pr101900-3.c | 19 +++++++++++++++++++
> > > >  4 files changed, 73 insertions(+), 3 deletions(-)  create mode 100644
> > > > gcc/testsuite/gcc.target/i386/pr101900-1.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101900-2.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101900-3.c
> > > >
> > > > diff --git a/gcc/config/i386/i386-features.c
> > > > b/gcc/config/i386/i386-features.c index 5a99ea7c046..ae5ea02a002
> > > > 100644
> > > > --- a/gcc/config/i386/i386-features.c
> > > > +++ b/gcc/config/i386/i386-features.c
> > > > @@ -2210,15 +2210,30 @@ remove_partial_avx_dependency (void)
> > > >               != AVX_PARTIAL_XMM_UPDATE_TRUE)
> > > >             continue;
> > > >
> > > > -         if (!v4sf_const0)
> > > > -           v4sf_const0 = gen_reg_rtx (V4SFmode);
> > > > -
> > > >           /* Convert PARTIAL_XMM_UPDATE_TRUE insns, DF -> SF, SF -> DF,
> > > >              SI -> SF, SI -> DF, DI -> SF, DI -> DF, to vec_dup and
> > > >              vec_merge with subreg.  */
> > > >           rtx src = SET_SRC (set);
> > > >           rtx dest = SET_DEST (set);
> > > >           machine_mode dest_mode = GET_MODE (dest);
> > > > +         machine_mode src_mode;
> > > > +
> > > > +         if (TARGET_USE_VECTOR_FP_CONVERTS)
> > > > +           {
> > > > +             src_mode = GET_MODE (XEXP (src, 0));
> > > > +             if (src_mode == E_SFmode || src_mode == E_DFmode)
> > > > +               continue;
> > > > +           }
> > > > +
> > > > +         if (TARGET_USE_VECTOR_CONVERTS)
> > > > +           {
> > > > +             src_mode = GET_MODE (XEXP (src, 0));
> > > > +             if (src_mode == E_SImode || src_mode == E_DImode)
> > > > +               continue;
> > > > +           }
> > > > +
> > > > +         if (!v4sf_const0)
> > > > +           v4sf_const0 = gen_reg_rtx (V4SFmode);
> > >
> > > Please better move initialization of src_mode to the top of the new hunk, 
> > > like:
> > >
> > > machine_mode src_mode = GET_MODE (XEXP (src, 0)); switch (src_mode) {
> > >   case E_SFmode:
> > >   case E_DFmode:
> > >     if (TARGET_USE_VECTOR_FP_CONVERTS)
> > >       continue;
> > >     break;
> > >   case E_SImode:
> > >   case E_DImode:
> > >     if (TARGET_USE_VECTOR_CONVERTS)
> > >       continue;
> > >     break;
> > >   default:
> > >     break;
> > > }
> > >
> > > or something like the above.
> >
> > Done, thanks for your good advice, I also rebased patch 4/4, since it is 
> > based on patch 3/4.

The above change broke
+FAIL: gcc.target/i386/avx512f-vscalefpd-2.c (internal compiler error)
+FAIL: gcc.target/i386/avx512f-vscalefpd-2.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx512f-vscalefpd-2.c compilation failed to 
produce executable
+FAIL: gcc.target/i386/avx512f-vscalefps-2.c (internal compiler error)
+FAIL: gcc.target/i386/avx512f-vscalefps-2.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx512f-vscalefps-2.c compilation failed to 
produce executable
+FAIL: gcc.target/i386/avx512f-vscalefss-2.c (internal compiler error)
+FAIL: gcc.target/i386/avx512f-vscalefss-2.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx512f-vscalefss-2.c compilation failed to 
produce executable
+FAIL: gcc.target/i386/avx512vl-vscalefpd-2.c (internal compiler error)
+FAIL: gcc.target/i386/avx512vl-vscalefpd-2.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx512vl-vscalefpd-2.c compilation failed to 
produce executable
+FAIL: gcc.target/i386/avx512vl-vscalefps-2.c (internal compiler error)
+FAIL: gcc.target/i386/avx512vl-vscalefps-2.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx512vl-vscalefps-2.c compilation failed to 
produce executable
when configured with --enable-checking=yes,rtl,extra, the error is:
during RTL pass: rpad
/home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/avx512f-vscalefpd-2.c:57:1: 
internal compiler error: RTL check: expected elt 0 type 'e' or 'u', have 'E' 
(rtx unspec) in remove_partial_avx_dependency, at 
config/i386/i386-features.c:2219
0x77541d rtl_check_failed_type2(rtx_def const*, int, int, int, char const*, 
int, char const*)
        ../../gcc/rtl.c:898
0x84e731 remove_partial_avx_dependency
        ../../gcc/config/i386/i386-features.c:2219
0x84e731 execute
        ../../gcc/config/i386/i386-features.c:2389
This is on
2219              machine_mode src_mode = GET_MODE (XEXP (src, 0));
and src is:
(unspec:DF [
        (mem:DF (plus:DI (reg/f:DI 149)
                (reg:DI 103 [ ivtmp.65 ])) [3 MEM[(double *)&src2 + ivtmp.65_65 
* 1]+0 S8 A64])
        (const_int 9 [0x9])
    ] UNSPEC_ROUND)
- whole insn
(insn 55 54 56 5 (set (reg:DF 99 [ _50 ])
        (unspec:DF [
                (mem:DF (plus:DI (reg/f:DI 149)
                        (reg:DI 103 [ ivtmp.65 ])) [3 MEM[(double *)&src2 + 
ivtmp.65_65 * 1]+0 S8 A64])
                (const_int 9 [0x9])
            ] UNSPEC_ROUND)) 
"/home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/avx512f-vscalefpd-2.c":19:28 
1076 {sse4_1_rounddf2}
     (nil))
so XEXP (src, 0) can't be used in that case.
Looking at insns with avx_partial_xmm_update attribute, it seems
src is either FLOAT_EXTEND/FLOAT_TRUNCATE/FLOAT/UNSIGNED_FLOAT and
in that case it looks like a conversion and has different modes,
or it is UNSPEC (UNSPEC_{RCP,RSQRT,ROUND}) or SQRT and in that case it doesn't.

        Jakub

Reply via email to