On Fri, May 22, 2020 at 2:41 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Fri, May 22, 2020 at 6:55 AM Hongtao Liu <crazy...@gmail.com> wrote:
> >
> > On Thu, May 21, 2020 at 7:18 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > >
> > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu <crazy...@gmail.com> wrote:
> > > >
> > > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > > >
> > > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu <crazy...@gmail.com> 
> > > > > wrote:
> > > > > >
> > > > > > Hi:
> > > > > >   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >         PR target/92658
> > > > > >         * config/i386/sse.md
> > > > > >         (trunc<pmov_src_lower><mode>2, truncv32hiv32qi2,
> > > > > >         trunc<ssedoublemodelower><mode>2): New expander.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >         * gcc.target/i386/pr92658-avx512f.c: New test.
> > > > > >         * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > > > >         * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> > > > >
> > > > > There are more conversions to be added. There are:
> > > > >
> > > > > V2DImode to V2QImode, V2HImode, V2SImode
> > > > > V4DImode to V4QImode, V4HImode, V4SImode
> > > > > V8DImode to V8QImode, V8HImode, V8SImode
> > > > >
> > > > > V4SImode to V4QImode, V4HImode
> > > > > V8SImode to V8QImode, V8HImode
> > > > > V16SImode to V16QImode, V16HImode
> > > > >
> > > > > V8HImode to V8QImode
> > > > > V16HImode to V16QImode
> > > > > V32HImode to V32QImode
> > > > >
> > > > All of them are added
> > > >
> > > > Vectorization failure: (Add xfail in testcase for them since they need
> > > > generic part)
> > > > V2DImode to V2QImode, V2HImode
> > > > V4DImode to V4QImode, V4HImode
> > > > V8DImode to V8QImode
> > > >
> > > > V4SImode to V4QImode, V4HImode
> > > > V8SImode to V8QImode
> > > >
> > > > V8HImode to V8QImode
> > > >
> > > > Vectorization success:
> > > > V2DImode to V2SImode (under TARGET_MMX_WITH_SSE)
> > > > V4DImode to V4SImode
> > > > V8DImode to V8HImode, V8SImode
> > > >
> > > > V8SImode to V8HImode
> > > > V16SImode to V16QImode, V16HImode
> > > >
> > > > V32HImode to V32QImode
> > > > V16HImode to V16HImode.
> > > >
> > > >
> > > > > Uros.
> > > >
> > > > Update patch.
> > > > Regression test on i386/x86-64 backend is ok, bootstrap is ok.
> > > >
> > > > gcc/ChangeLog:
> > > >         PR target/92658
> > > >         * config/i386/sse.md
> > > >         (trunc<pmov_src_lower><mode>2): New expander
> > > >         (truncv32hiv32qi2): Ditto.
> > > >         (trunc<ssedoublemodelower><mode>2): Ditto.
> > > >         (trunc<mode><pmov_dst_3>2): Ditto.
> > > >         (trunc<mode><pmov_dst_mode_4>2): Ditto.
> > > >         (truncv2div2si2): Ditto.
> > > >         (truncv8div8qi2): Ditto.
> > > >         (avx512f_<code>v8div16qi2): Renaming
> > > >         from *avx512f_<code>v8div16qi2.
> > > >         (avx512vl_<code>v2div2si): Renaming
> > > >         from *avx512vl_<code>v2div2si2.
> > > >         (avx512vl_<code><mode>v2<ssecakarnum>qi2): Renaming
> > > >         from *avx512vl_<code><mode>v<ssescalarnum>qi2.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >         * gcc.target/i386/pr92658-avx512f.c: New test.
> > > >         * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > >         * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> > >
> > >
> > > +  rtx op = simplify_subreg (V16QImode, operands[0], <pmov_dst_3>mode, 0);
> > > +  operands[0] = op ? op : gen_rtx_SUBREG (V16QImode, operands[0], 0);
> > >
> > > You should use simplify_gen_subreg, without null op fixup:
> > >
> > > operands[0] = simplify_gen_subreg (V16QImode, operands[0], 
> > > <pmov_dst_3>mode, 0);
> > >
> > Changed.
> > > +  "TARGET_MMX_WITH_SSE && TARGET_AVX512VL"
> > >
> > > Do you really need TARGET_MMX_WITH_SSE?  Narrow modes are active even
> > > without this flag.
> > >
> > Changed.
>
> +(define_expand "truncv8div8qi2"
> +  [(set (match_operand:V8QI 0 "register_operand")
> +    (truncate:V8QI
> +        (match_operand:V8DI 1 "register_operand")))]
> +  "TARGET_AVX512F && TARGET_MMX_WITH_SSE"
> +{
> +  operands[0] = simplify_gen_subreg (V16QImode, operands[0], V8QImode, 0);
> +  emit_insn (gen_avx512f_truncatev8div16qi2 (operands[0], operands[1]));
> +  DONE;
> +})
>
> You left one here.
>
> +/* { dg-final { scan-assembler-times "vpmovqd" 2 { target { ! ia32 } } } } */
>
> Target selector shouldn't be needed here.
>
> The patch is OK with the above changes.
>
Changed.
> On a related note, it looks that pmov stores are modelled in a wrong
> way. For example, this pattern;
>
> (define_insn "*avx512f_<code>v8div16qi2_store"
>   [(set (match_operand:V16QI 0 "memory_operand" "=m")
>     (vec_concat:V16QI
>       (any_truncate:V8QI
>         (match_operand:V8DI 1 "register_operand" "v"))
>       (vec_select:V8QI
>         (match_dup 0)
>         (parallel [(const_int 8) (const_int 9)
>                (const_int 10) (const_int 11)
>                (const_int 12) (const_int 13)
>                (const_int 14) (const_int 15)]))))]
>
> models the store in 128bit mode, but according to ISA, it stores in 16bit 
> mode.
>
according to ISA, it stores in 64bit mode
vpmovqb xmm1/m64 {k1}{z}, zmm2.

memory_operand is 128bit but upper 64bit is not changed which means it
store only lower 64bits, just same meaning to ISA.

> Uros.



-- 
BR,
Hongtao

Reply via email to