On Thu, Jul 9, 2020 at 6:35 AM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Thu, Jul 9, 2020 at 5:04 AM Kirill Yukhin <kirill.yuk...@gmail.com> wrote:
> >
> > On 07 Jul 09:06, H.J. Lu wrote:
> > > On Tue, Jul 7, 2020 at 8:56 AM Kirill Yukhin <kirill.yuk...@gmail.com> wrote:
> > > >
> > > > Hello HJ,
> > > >
> > > > On 28 Jun 07:19, H.J. Lu via Gcc-patches wrote:
> > > > > Enable FMA in rsqrt<mode>2 expander and fold rsqrtv16sf2 expander into
> > > > > rsqrt<mode>2 expander which expands to UNSPEC_RSQRT28 for
> > > > > TARGET_AVX512ER.
> > > > > Although it doesn't show performance change in our workloads, FMA can
> > > > > improve other workloads.
> > > > >
> > > > > gcc/
> > > > >
> > > > > PR target/88713
> > > > > * config/i386/i386-expand.c (ix86_emit_swsqrtsf): Enable FMA.
> > > > > * config/i386/sse.md (VF_AVX512VL_VF1_128_256): New.
> > > > > (rsqrt<mode>2): Replace VF1_128_256 with
> > > > > VF_AVX512VL_VF1_128_256.
> > > > > (rsqrtv16sf2): Removed.
> > > > >
> > > > > gcc/testsuite/
> > > > >
> > > > > PR target/88713
> > > > > * gcc.target/i386/pr88713-1.c: New test.
> > > > > * gcc.target/i386/pr88713-2.c: Likewise.
> > > >
> > > > So, you've introduced new rsqrt expanders for DF vectors and relaxed
> > > > condition for V16SF. What I didn't get is why did you change unspec
> > > > type from RSQRT to RSQRT28 for V16SF expander?
> > > >
> > >
> > > UNSPEC in define_expand is meaningless when the pattern is fully
> > > expanded by ix86_emit_swsqrtsf. I believe that UNSPEC in rsqrt<mode>2
> > > expander can be removed.
> >
> > Agree.
>
> I will leave UNSPEC alone here.
>
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr88713-1.c
> > @@ -0,0 +1,13 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -Ofast -mno-avx512f -mfma" } */
> >
> > I guess -O2 is useless here (and in -2.c test).
>
> Fixed.
>
> > Otherwise LGTM.
> >
>
> This is the patch I am checking in.
>

Since ix86_emit_swsqrtsf shouldn't be called with DF vector modes, rename
VF_AVX512VL_VF1_128_256 to VF1_AVX512ER_128_256 and drop DF vector modes.

--
H.J.
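
For readers following the thread, here is a minimal sketch (not GCC code) of
the kind of Newton-Raphson refinement that, as far as I understand,
ix86_emit_swsqrtsf wraps around the hardware reciprocal square root estimate.
The name refine_rsqrt and the caller-supplied estimate x0 are hypothetical,
introduced only for illustration. The hardware estimate (rsqrtps) exists only
for single precision and is good to roughly 12 bits, and one step of this kind
brings it to about float precision. That is presumably why only SF vector
modes belong in the iterator.

```c
/* Illustrative sketch only, not the GCC implementation.
   refine_rsqrt is a hypothetical name; x0 stands in for the low-precision
   estimate that an instruction such as rsqrtps would return.  */
float refine_rsqrt(float a, float x0)
{
    /* One Newton-Raphson step for 1/sqrt(a):
         x1 = -0.5 * x0 * (a * x0 * x0 - 3.0)
       The inner multiply-subtract is the part a compiler can map to a
       fused multiply-add when one is available (for example with -mfma).  */
    float e = (a * x0) * x0 - 3.0f;
    return (-0.5f * x0) * e;
}
```

For example, refine_rsqrt(4.0f, 0.51f) moves the estimate 0.51 to within about
3e-4 of the exact value 0.5, and an already exact estimate such as
refine_rsqrt(1.0f, 1.0f) is returned unchanged.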
From 9a984a4ecc96cfa53b203be065f365ca7b1f3bf0 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Mon, 13 Jul 2020 09:07:00 -0700
Subject: [PATCH] x86: Rename VF_AVX512VL_VF1_128_256 to VF1_AVX512ER_128_256

Since ix86_emit_swsqrtsf shouldn't be called with DF vector modes, rename
VF_AVX512VL_VF1_128_256 to VF1_AVX512ER_128_256 and drop DF vector modes.

gcc/

	PR target/96186
	PR target/88713
	* config/i386/sse.md (VF_AVX512VL_VF1_128_256): Renamed to ...
	(VF1_AVX512ER_128_256): This.  Drop DF vector modes.
	(rsqrt<mode>2): Replace VF_AVX512VL_VF1_128_256 with
	VF1_AVX512ER_128_256.

gcc/testsuite/

	PR target/96186
	PR target/88713
	* gcc.target/i386/pr88713-3.c: New test.
---
 gcc/config/i386/sse.md                    | 14 ++++++--------
 gcc/testsuite/gcc.target/i386/pr88713-3.c | 17 +++++++++++++++++
 2 files changed, 23 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88713-3.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index d3ad5833e1f..b6348de67cb 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -326,11 +326,9 @@ (define_mode_iterator VF_AVX512VL
   [V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")
    V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")])
 
-;; AVX512VL SF/DF plus 128- and 256-bit SF vector modes
-(define_mode_iterator VF_AVX512VL_VF1_128_256
-  [(V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
-   (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX512VL")
-   (V2DF "TARGET_AVX512VL")])
+;; AVX512ER SF plus 128- and 256-bit SF vector modes
+(define_mode_iterator VF1_AVX512ER_128_256
+  [(V16SF "TARGET_AVX512ER") (V8SF "TARGET_AVX") V4SF])
 
 (define_mode_iterator VF2_AVX512VL
   [V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")])
@@ -2076,9 +2074,9 @@ (define_insn "*<sse>_vmsqrt<mode>2<mask_scalar_name><round_scalar_name>"
    (set_attr "mode" "<ssescalarmode>")])
 
 (define_expand "rsqrt<mode>2"
-  [(set (match_operand:VF_AVX512VL_VF1_128_256 0 "register_operand")
-	(unspec:VF_AVX512VL_VF1_128_256
-	  [(match_operand:VF_AVX512VL_VF1_128_256 1 "vector_operand")]
+  [(set (match_operand:VF1_AVX512ER_128_256 0 "register_operand")
+	(unspec:VF1_AVX512ER_128_256
+	  [(match_operand:VF1_AVX512ER_128_256 1 "vector_operand")]
 	  UNSPEC_RSQRT))]
   "TARGET_SSE && TARGET_SSE_MATH"
 {
diff --git a/gcc/testsuite/gcc.target/i386/pr88713-3.c b/gcc/testsuite/gcc.target/i386/pr88713-3.c
new file mode 100644
index 00000000000..85b6cf87a02
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88713-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -mno-avx512er -march=skylake-avx512" } */
+
+#include <math.h>
+
+double square(double d[3], double rad)
+{
+  double res[3];
+
+  for (int i = 0; i < 3; i++)
+    {
+      res[i] = d[i] * d[i];
+      res[i] *= rad/sqrt(res[i]);
+    }
+
+  return res[0];
+}
--
2.26.2