On Fri, May 22, 2020 at 2:41 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > On Fri, May 22, 2020 at 6:55 AM Hongtao Liu <crazy...@gmail.com> wrote: > > > > On Thu, May 21, 2020 at 7:18 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > > > > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu <crazy...@gmail.com> wrote: > > > > > > > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > > > > > > > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu <crazy...@gmail.com> > > > > > wrote: > > > > > > > > > > > > Hi: > > > > > > Bootstrap is ok, regression test on i386/x86-64 backend is ok. > > > > > > > > > > > > gcc/ChangeLog: > > > > > > PR target/92658 > > > > > > * config/i386/sse.md > > > > > > (trunc<pmov_src_lower><mode>2, truncv32hiv32qi2, > > > > > > trunc<ssedoublemodelower><mode>2): New expander. > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.target/i386/pr92658-avx512f.c: New test. > > > > > > * gcc.target/i386/pr92658-avx512vl.c: Ditto. > > > > > > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto. > > > > > > > > > > There are more conversions to be added. There are: > > > > > > > > > > V2DImode to V2QImode, V2HImode, V2SImode > > > > > V4DImode to V4QImode, V4HImode, V4SImode > > > > > V8DImode to V8QImode, V8HImode, V8SImode > > > > > > > > > > V4SImode to V4QImode, V4HImode > > > > > V8SImode to V8QImode, V8HImode > > > > > V16SImode to V16QImode, V16HImode > > > > > > > > > > V8HImode to V8QImode > > > > > V16HImode to V16QImode > > > > > V32HImode to V32QImode > > > > > > > > > All of them are added > > > > > > > > Vectorization failure: (Add xfail in testcase for them since they need > > > > generic part) > > > > V2DImode to V2QImode, V2HImode > > > > V4DImode to V4QImode, V4HImode > > > > V8DImode to V8QImode > > > > > > > > V4SImode to V4QImode, V4HImode > > > > V8SImode to V8QImode > > > > > > > > V8HImode to V8QImode > > > > > > > > Vectorization success: > > > > V2DImode to V2SImode (under TARGET_MMX_WITH_SSE) > > > > V4DImode to V4SImode > > > > V8DImode to V8HImode, V8SImode > > > > > > > > V8SImode to V8HImode > > > > V16SImode to V16QImode, V16HImode > > > > > > > > V32HImode to V32QImode > > > > V16HImode to V16HImode. > > > > > > > > > > > > > Uros. > > > > > > > > Update patch. > > > > Regression test on i386/x86-64 backend is ok, bootstrap is ok. > > > > > > > > gcc/ChangeLog: > > > > PR target/92658 > > > > * config/i386/sse.md > > > > (trunc<pmov_src_lower><mode>2): New expander > > > > (truncv32hiv32qi2): Ditto. > > > > (trunc<ssedoublemodelower><mode>2): Ditto. > > > > (trunc<mode><pmov_dst_3>2): Ditto. > > > > (trunc<mode><pmov_dst_mode_4>2): Ditto. > > > > (truncv2div2si2): Ditto. > > > > (truncv8div8qi2): Ditto. > > > > (avx512f_<code>v8div16qi2): Renaming > > > > from *avx512f_<code>v8div16qi2. > > > > (avx512vl_<code>v2div2si): Renaming > > > > from *avx512vl_<code>v2div2si2. > > > > (avx512vl_<code><mode>v2<ssecakarnum>qi2): Renaming > > > > from *avx512vl_<code><mode>v<ssescalarnum>qi2. > > > > > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/pr92658-avx512f.c: New test. > > > > * gcc.target/i386/pr92658-avx512vl.c: Ditto. > > > > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto. > > > > > > > > > + rtx op = simplify_subreg (V16QImode, operands[0], <pmov_dst_3>mode, 0); > > > + operands[0] = op ? op : gen_rtx_SUBREG (V16QImode, operands[0], 0); > > > > > > You should use simplify_gen_subreg, without null op fixup: > > > > > > operands[0] = simplify_gen_subreg (V16QImode, operands[0], > > > <pmov_dst_3>mode, 0); > > > > > Changed. > > > + "TARGET_MMX_WITH_SSE && TARGET_AVX512VL" > > > > > > Do you really need TARGET_MMX_WITH_SSE? Narrow modes are active even > > > without this flag. > > > > > Changed. > > +(define_expand "truncv8div8qi2" > + [(set (match_operand:V8QI 0 "register_operand") > + (truncate:V8QI > + (match_operand:V8DI 1 "register_operand")))] > + "TARGET_AVX512F && TARGET_MMX_WITH_SSE" > +{ > + operands[0] = simplify_gen_subreg (V16QImode, operands[0], V8QImode, 0); > + emit_insn (gen_avx512f_truncatev8div16qi2 (operands[0], operands[1])); > + DONE; > +}) > > You left one here. > > +/* { dg-final { scan-assembler-times "vpmovqd" 2 { target { ! ia32 } } } } */ > > Target selector shouldn't be needed here. > > The patch is OK with the above changes. > Changed. > On a related note, it looks that pmov stores are modelled in a wrong > way. For example, this pattern; > > (define_insn "*avx512f_<code>v8div16qi2_store" > [(set (match_operand:V16QI 0 "memory_operand" "=m") > (vec_concat:V16QI > (any_truncate:V8QI > (match_operand:V8DI 1 "register_operand" "v")) > (vec_select:V8QI > (match_dup 0) > (parallel [(const_int 8) (const_int 9) > (const_int 10) (const_int 11) > (const_int 12) (const_int 13) > (const_int 14) (const_int 15)]))))] > > models the store in 128bit mode, but according to ISA, it stores in 16bit > mode. > according to ISA, it stores in 64bit mode vpmovqb xmm1/m64 {k1}{z}, zmm2.
memory_operand is 128bit but upper 64bit is not changed which means it store only lower 64bits, just same meaning to ISA. > Uros. -- BR, Hongtao