I think that fix for avx2 part should be backported to 4.8/4.9
What do you think?

On 14 Jan 14:18, Ilya Tocar wrote:
> Hi,
> 
> This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387
> Which was caused by different in predicates between vec_unpacks_hi
> and vec_extract_hi.
> Ok for trunk?
> ChangeLog:
> 
> gcc/
>       PR target/64387
>       * config/i386/sse.md (vec_unpacks_hi_v8sf): Fix predicate.
>       (vec_unpacks_hi_v16sf): Ditto.
> 
> testsuite/
>       PR target/64387
>       * gcc.target/i386/pr64387.c: New test.
> ---
>  gcc/config/i386/sse.md                  |  4 ++--
>  gcc/testsuite/gcc.target/i386/pr64387.c | 15 +++++++++++++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr64387.c
> 
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 3b9108c..cd4af4e 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -5078,7 +5078,7 @@
>  (define_expand "vec_unpacks_hi_v8sf"
>    [(set (match_dup 2)
>       (vec_select:V4SF
> -       (match_operand:V8SF 1 "nonimmediate_operand")
> +       (match_operand:V8SF 1 "register_operand")
>         (parallel [(const_int 4) (const_int 5)
>                    (const_int 6) (const_int 7)])))
>     (set (match_operand:V4DF 0 "register_operand")
> @@ -5090,7 +5090,7 @@
>  (define_expand "vec_unpacks_hi_v16sf"
>    [(set (match_dup 2)
>       (vec_select:V8SF
> -       (match_operand:V16SF 1 "nonimmediate_operand")
> +       (match_operand:V16SF 1 "register_operand")
>         (parallel [(const_int 8) (const_int 9)
>                    (const_int 10) (const_int 11)
>                    (const_int 12) (const_int 13)
> diff --git a/gcc/testsuite/gcc.target/i386/pr64387.c 
> b/gcc/testsuite/gcc.target/i386/pr64387.c
> new file mode 100644
> index 0000000..dd38142
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr64387.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize -ffloat-store -mavx512er" } */
> +
> +float x[256];
> +
> +double *
> +foo (void)
> +{
> +  double *z = __builtin_malloc (sizeof (double) * 256);
> +  int i;
> +  for (i = 0; i < 256; ++i)
> +    z[i] = x[i] + 1.0f;
> +  foo ();
> +  return 0;
> +}
> -- 
> 1.8.3.1
> 

Reply via email to