On 10/21/18, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Sat, Oct 20, 2018 at 11:47 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>>
>> On 10/20/18, Uros Bizjak <ubiz...@gmail.com> wrote:
>> > On Fri, Oct 19, 2018 at 11:08 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>> >>
>> >> Many AVX512 vector operations can broadcast from a scalar memory source.
>> >> This patch enables memory broadcast for FP mul operations.
>> >>
>> >> gcc/
>> >>
>> >>     PR target/72782
>> >>     * config/i386/sse.md (*mul<mode>3<mask_name>_bcst_1): New.
>> >>     (*mul<mode>3<mask_name>_bcst_2): Likewise.
>> >>
>> >> gcc/testsuite/
>> >>
>> >>     PR target/72782
>> >>     * gcc.target/i386/avx512f-mul-df-zmm-1.c: New test.
>> >>     * gcc.target/i386/avx512f-mul-sf-zmm-1.c: Likewise.
>> >>     * gcc.target/i386/avx512f-mul-sf-zmm-2.c: Likewise.
>> >>     * gcc.target/i386/avx512f-mul-sf-zmm-3.c: Likewise.
>> >>     * gcc.target/i386/avx512f-mul-sf-zmm-4.c: Likewise.
>> >>     * gcc.target/i386/avx512f-mul-sf-zmm-5.c: Likewise.
>> >>     * gcc.target/i386/avx512f-mul-sf-zmm-6.c: Likewise.
>> >>     * gcc.target/i386/avx512vl-mul-sf-xmm-1.c: Likewise.
>> >>     * gcc.target/i386/avx512vl-mul-sf-ymm-1.c: Likewise.
>> >> ---
>> >>  gcc/config/i386/sse.md                        | 24 +++++++++++++++++++
>> >>  .../gcc.target/i386/avx512f-mul-df-zmm-1.c    | 12 ++++++++++
>> >>  .../gcc.target/i386/avx512f-mul-sf-zmm-1.c    | 12 ++++++++++
>> >>  .../gcc.target/i386/avx512f-mul-sf-zmm-2.c    | 12 ++++++++++
>> >>  .../gcc.target/i386/avx512f-mul-sf-zmm-3.c    | 12 ++++++++++
>> >>  .../gcc.target/i386/avx512f-mul-sf-zmm-4.c    | 12 ++++++++++
>> >>  .../gcc.target/i386/avx512f-mul-sf-zmm-5.c    | 12 ++++++++++
>> >>  .../gcc.target/i386/avx512f-mul-sf-zmm-6.c    | 12 ++++++++++
>> >>  .../gcc.target/i386/avx512vl-mul-sf-xmm-1.c   | 12 ++++++++++
>> >>  .../gcc.target/i386/avx512vl-mul-sf-ymm-1.c   | 12 ++++++++++
>> >>  10 files changed, 132 insertions(+)
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-mul-df-zmm-1.c
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-mul-sf-zmm-1.c
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-mul-sf-zmm-2.c
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-mul-sf-zmm-3.c
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-mul-sf-zmm-4.c
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-mul-sf-zmm-5.c
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-mul-sf-zmm-6.c
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-mul-sf-xmm-1.c
>> >>  create mode 100644 gcc/testsuite/gcc.target/i386/avx512vl-mul-sf-ymm-1.c
>> >>
>> >> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
>> >> index 411c78ae8d3..a73659e6bd2 100644
>> >> --- a/gcc/config/i386/sse.md
>> >> +++ b/gcc/config/i386/sse.md
>> >> @@ -1754,6 +1754,30 @@
>> >>     (set_attr "btver2_decode" "direct,double")
>> >>     (set_attr "mode" "<MODE>")])
>> >>
>> >> +(define_insn "*mul<mode>3<mask_name>_bcst_1"
>> >> +  [(set (match_operand:VF_AVX512 0 "register_operand" "=v")
>> >> +        (mult:VF_AVX512
>> >> +          (match_operand:VF_AVX512 1 "register_operand" "v")
>> >> +          (vec_duplicate:VF_AVX512
>> >> +             (match_operand:<ssescalarmode> 2 "memory_operand" "m"))))]
>> >> +  "TARGET_AVX512F && <mask_mode512bit_condition>"
>> >> +  "vmul<ssemodesuffix>\t{%2<avx512bcst>, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2<<avx512bcst>>}"
>> >> +  [(set_attr "prefix" "evex")
>> >> +   (set_attr "type" "ssemul")
>> >> +   (set_attr "mode" "<MODE>")])
>> >> +
>> >> +(define_insn "*mul<mode>3<mask_name>_bcst_2"
>> >> +  [(set (match_operand:VF_AVX512 0 "register_operand" "=v")
>> >> +        (mult:VF_AVX512
>> >> +          (vec_duplicate:VF_AVX512
>> >> +             (match_operand:<ssescalarmode> 1 "memory_operand" "m"))
>> >> +          (match_operand:VF_AVX512 2 "register_operand" "v")))]
>> >> +  "TARGET_AVX512F && <mask_mode512bit_condition>"
>> >> +  "vmul<ssemodesuffix>\t{%1<avx512bcst>, %2, %0<mask_operand3>|%0<mask_operand3>, %2, %1<<avx512bcst>>}"
>> >> +  [(set_attr "prefix" "evex")
>> >> +   (set_attr "type" "ssemul")
>> >> +   (set_attr "mode" "<MODE>")])
>> >
>> > Do we really need two patterns here? IIRC, the compiler canonicalizes
>> > commutative binops so that they have memory operand in the second
>> > place. We have vec_duplicate here, so this may not be the case, but
>> > please investigate if we really need two patterns for commutative
>> > binops.
>> >
>>
>> Only one pattern is needed.  For
>>
>> (set (reg:V16SF 89) (vec_duplicate:V16SF (reg:SF 91)))
>> (set (reg:V16SF 95) (mult:V16SF (reg:V16SF 87) (reg:V16SF 89)))
>>
>> combiner prefers
>>
>> (set (reg:V16SF 95)
>>      (mult:V16SF
>>        (vec_duplicate:V16SF (reg:SF 91))
>>        (reg:V16SF 87)))
>>
>> instead of
>>
>> (set (reg:V16SF 95)
>>      (mult:V16SF
>>        (reg:V16SF 87)
>>        (vec_duplicate:V16SF (reg:SF 91))))
>>
>> commutation is performed at
>>
>> (set (reg:V16SF 95) (mult:V16SF (reg:V16SF 87) (reg:V16SF 89)))
>>
>> Here is the updated patch.  OK for trunk?
>
> No need for a big comment, this is due to RTX operator precedence in
> commutative operators..
>
> OK with the above change.

Checked in.

> Please also remove plus part from
>
> *<plusminus_insn><mode>3<mask_name>_bcst_1
>
> and rename it together with
>
> *add<mode>3<mask_name>_bcst_2
>
> to ..._bcst, without suffix.
>

This is the patch I am checking in.

-- 
H.J.
From 3951f97a3cbcd27ae268eb9d76ca261f39d0fc74 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.to...@gmail.com>
Date: Sun, 21 Oct 2018 03:40:35 -0700
Subject: [PATCH] i386: Update FP add/sub with AVX512 memory broadcast

    * config/i386/sse.md (*<plusminus_insn><mode>3<mask_name>_bcst_1):
    Remove plus.  Renamed to ...
    (*sub<mode>3<mask_name>_bcst): This.
    (*add<mode>3<mask_name>_bcst_2): Renamed to ...
    (*add<mode>3<mask_name>_bcst): This.
---
 gcc/config/i386/sse.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f29ee9df94d..520afc56272 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1684,21 +1684,21 @@
    (set_attr "prefix" "<mask_prefix3>")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "*<plusminus_insn><mode>3<mask_name>_bcst_1"
+(define_insn "*sub<mode>3<mask_name>_bcst"
   [(set (match_operand:VF_AVX512 0 "register_operand" "=v")
-        (plusminus:VF_AVX512
+        (minus:VF_AVX512
           (match_operand:VF_AVX512 1 "register_operand" "v")
           (vec_duplicate:VF_AVX512
             (match_operand:<ssescalarmode> 2 "memory_operand" "m"))))]
   "TARGET_AVX512F
-   && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)
+   && ix86_binary_operator_ok (MINUS, <MODE>mode, operands)
    && <mask_mode512bit_condition>"
-  "v<plusminus_mnemonic><ssemodesuffix>\t{%2<avx512bcst>, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2<avx512bcst>}"
+  "vsub<ssemodesuffix>\t{%2<avx512bcst>, %1, %0<mask_operand3>|%0<mask_operand3>, %1, %2<avx512bcst>}"
   [(set_attr "prefix" "evex")
    (set_attr "type" "sseadd")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "*add<mode>3<mask_name>_bcst_2"
+(define_insn "*add<mode>3<mask_name>_bcst"
   [(set (match_operand:VF_AVX512 0 "register_operand" "=v")
         (plus:VF_AVX512
           (vec_duplicate:VF_AVX512
-- 
2.17.2
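
For readers who want to see what kind of source code the broadcast patterns discussed in this thread are aimed at, here is a minimal C sketch. It is not one of the testcases added by the patches (their contents are not shown in this thread); the function name `scale`, the macro `N`, and the choice of 16 floats (one 512-bit vector) are assumptions made for illustration. When built with vectorization enabled and `-mavx512f`, GCC may fold the scalar load into the multiply as an embedded-broadcast memory operand instead of emitting a separate broadcast instruction; whether it actually does so depends on the compiler version and options.

```c
#include <stddef.h>

/* Hypothetical illustration, not taken from the GCC testsuite.
   N is chosen so that the float array fills one 512-bit vector.  */
#define N 16

/* Multiply each element of SRC by the scalar stored at *FACTOR and
   write the result to DST.  The scalar is read from memory and used
   for every lane, which is the vec_duplicate-of-memory shape that the
   *_bcst patterns in the thread are written to match.  */
void
scale (float *restrict dst, const float *restrict src,
       const float *restrict factor)
{
  for (size_t i = 0; i < N; i++)
    dst[i] = src[i] * *factor;
}
```

Because multiplication is commutative, writing `*factor * src[i]` instead should end up in the same canonical RTL form, which is the point made above about needing only one pattern for commutative operators. Subtraction is not commutative, so only the form with the broadcast as the second operand is covered by the `*sub<mode>3<mask_name>_bcst` pattern.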