On Thu, Jul 9, 2020 at 6:35 AM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Thu, Jul 9, 2020 at 5:04 AM Kirill Yukhin <kirill.yuk...@gmail.com> wrote:
> >
> > On 07 Jul 09:06, H.J. Lu wrote:
> > > On Tue, Jul 7, 2020 at 8:56 AM Kirill Yukhin <kirill.yuk...@gmail.com> wrote:
> > > >
> > > > Hello HJ,
> > > >
> > > > On 28 Jun 07:19, H.J. Lu via Gcc-patches wrote:
> > > > > Enable FMA in rsqrt<mode>2 expander and fold rsqrtv16sf2 expander into
> > > > > rsqrt<mode>2 expander which expands to UNSPEC_RSQRT28 for
> > > > > TARGET_AVX512ER.
> > > > > Although it doesn't show performance change in our workloads, FMA can
> > > > > improve other workloads.
> > > > >
> > > > > gcc/
> > > > >
> > > > >       PR target/88713
> > > > >       * config/i386/i386-expand.c (ix86_emit_swsqrtsf): Enable FMA.
> > > > >       * config/i386/sse.md (VF_AVX512VL_VF1_128_256): New.
> > > > >       (rsqrt<mode>2): Replace VF1_128_256 with
> > > > >       VF_AVX512VL_VF1_128_256.
> > > > >       (rsqrtv16sf2): Removed.
> > > > >
> > > > > gcc/testsuite/
> > > > >
> > > > >       PR target/88713
> > > > >       * gcc.target/i386/pr88713-1.c: New test.
> > > > >       * gcc.target/i386/pr88713-2.c: Likewise.
> > > >
> > > > So, you've introduced new rsqrt expanders for DF vectors and relaxed
> > > > the condition for V16SF. What I don't get is why you changed the
> > > > unspec type from RSQRT to RSQRT28 for the V16SF expander.
> > > >
> > >
> > > UNSPEC in define_expand is meaningless when the pattern is fully
> > > expanded by ix86_emit_swsqrtsf.  I believe that UNSPEC in rsqrt<mode>2
> > > expander can be removed.
> >
> > Agree.
>
> I will leave UNSPEC alone here.
>
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr88713-1.c
> > @@ -0,0 +1,13 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -Ofast -mno-avx512f -mfma" } */
> >
> > I guess -O2 is useless here (and in the -2.c test).
>
> Fixed.
>
> > Otherwise LGTM.
> >
>
> This is the patch I am checking in.
>

Since ix86_emit_swsqrtsf shouldn't be called with DF vector modes, rename
VF_AVX512VL_VF1_128_256 to VF1_AVX512ER_128_256 and drop DF vector modes.

-- 
H.J.
From 9a984a4ecc96cfa53b203be065f365ca7b1f3bf0 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Mon, 13 Jul 2020 09:07:00 -0700
Subject: [PATCH] x86: Rename VF_AVX512VL_VF1_128_256 to VF1_AVX512ER_128_256

Since ix86_emit_swsqrtsf shouldn't be called with DF vector modes, rename
VF_AVX512VL_VF1_128_256 to VF1_AVX512ER_128_256 and drop DF vector modes.

gcc/

	PR target/96186
	PR target/88713
	* config/i386/sse.md (VF_AVX512VL_VF1_128_256): Renamed to ...
	(VF1_AVX512ER_128_256): This.  Drop DF vector modes.
	(rsqrt<mode>2): Replace VF_AVX512VL_VF1_128_256 with
	VF1_AVX512ER_128_256.

gcc/testsuite/

	PR target/96186
	PR target/88713
	* gcc.target/i386/pr88713-3.c: New test.
---
 gcc/config/i386/sse.md                    | 14 ++++++--------
 gcc/testsuite/gcc.target/i386/pr88713-3.c | 17 +++++++++++++++++
 2 files changed, 23 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88713-3.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index d3ad5833e1f..b6348de67cb 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -326,11 +326,9 @@ (define_mode_iterator VF_AVX512VL
   [V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")
    V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")])
 
-;; AVX512VL SF/DF plus 128- and 256-bit SF vector modes
-(define_mode_iterator VF_AVX512VL_VF1_128_256
-  [(V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
-   (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX512VL")
-   (V2DF "TARGET_AVX512VL")])
+;; AVX512ER SF plus 128- and 256-bit SF vector modes
+(define_mode_iterator VF1_AVX512ER_128_256
+  [(V16SF "TARGET_AVX512ER") (V8SF "TARGET_AVX") V4SF])
 
 (define_mode_iterator VF2_AVX512VL
   [V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")])
@@ -2076,9 +2074,9 @@ (define_insn "*<sse>_vmsqrt<mode>2<mask_scalar_name><round_scalar_name>"
    (set_attr "mode" "<ssescalarmode>")])
 
 (define_expand "rsqrt<mode>2"
-  [(set (match_operand:VF_AVX512VL_VF1_128_256 0 "register_operand")
-	(unspec:VF_AVX512VL_VF1_128_256
-	  [(match_operand:VF_AVX512VL_VF1_128_256 1 "vector_operand")]
+  [(set (match_operand:VF1_AVX512ER_128_256 0 "register_operand")
+	(unspec:VF1_AVX512ER_128_256
+	  [(match_operand:VF1_AVX512ER_128_256 1 "vector_operand")]
 	  UNSPEC_RSQRT))]
   "TARGET_SSE && TARGET_SSE_MATH"
 {
diff --git a/gcc/testsuite/gcc.target/i386/pr88713-3.c b/gcc/testsuite/gcc.target/i386/pr88713-3.c
new file mode 100644
index 00000000000..85b6cf87a02
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88713-3.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -mno-avx512er -march=skylake-avx512" } */
+
+#include <math.h>
+
+double square(double d[3], double rad)
+{
+  double res[3];
+
+  for (int i = 0; i < 3; i++)
+    {
+      res[i] = d[i] * d[i];
+      res[i] *= rad/sqrt(res[i]);
+    }
+
+  return res[0];
+}
-- 
2.26.2
