Hello!

Attached patch fixes PR 59021, where new vzeroupper instructions are
generated for -mavx after Vlad's useless insn removal patch.

The problem was, that we depent on (useless) moves to AVX256 function
argument registers in front of the function call to switch the mode to
DIRTY mode. This is not true anymore, so call_insn has to switch to
DIRTY mode by itself.

While looking at this area - always set the mode after the call_insn
in MODE_AFTER function.

2013-11-06  Uros Bizjak  <ubiz...@gmail.com>

    PR target/59021
    * config/i386/i386.c (ix86_avx_u128_mode_needed): Require
    AVX_U128_DIRTY mode for call_insn RTXes that use AVX256 registers.
    (ix86_avx_u128_mode_needed): Return AVX_U128_DIRTY mode for call_insn
    RTXes that return in AVX256 register.

testsuite/ChangeLog:

2013-11-06  Uros Bizjak  <ubiz...@gmail.com>

    PR target/59021
    * gcc.target/i386/pr59021.c: New test.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}, configured with "--with-arch=core-avx-i --with-cpu=core-avx-i
--with-fpmath=avx".

The patch will be backported to 4.7 and 4.8 branch in a couple of days.

Uros.
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c  (revision 204459)
+++ config/i386/i386.c  (working copy)
@@ -15708,7 +15708,7 @@ ix86_avx_u128_mode_needed (rtx insn)
              rtx arg = XEXP (XEXP (link, 0), 0);
 
              if (ix86_check_avx256_register (&arg, NULL))
-               return AVX_U128_ANY;
+               return AVX_U128_DIRTY;
            }
        }
 
@@ -15828,8 +15828,8 @@ ix86_avx_u128_mode_after (int mode, rtx insn)
     {
       bool avx_reg256_found = false;
       note_stores (pat, ix86_check_avx256_stores, &avx_reg256_found);
-      if (!avx_reg256_found)
-       return AVX_U128_CLEAN;
+
+      return avx_reg256_found ? AVX_U128_DIRTY : AVX_U128_CLEAN;
     }
 
   /* Otherwise, return current mode.  Remember that if insn
Index: testsuite/gcc.target/i386/pr59021.c
===================================================================
--- testsuite/gcc.target/i386/pr59021.c (revision 0)
+++ testsuite/gcc.target/i386/pr59021.c (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx -mvzeroupper" } */
+
+extern void abort (void);
+
+struct S {
+  int i1;
+  int i2;
+  int i3;
+};
+
+typedef double v4df  __attribute__ ((vector_size (32)));
+
+extern int foo (v4df, int i1, int i2, int i3, int i4, int i5, struct S s);
+
+void bar (v4df v, struct S s)
+{
+  int r = foo (v, 1, 2, 3, 4, 5, s);
+  if (r)
+    abort ();
+}
+
+/* { dg-final { scan-assembler-not "vzeroupper" } } */

Reply via email to