On Mon, Feb 10, 2020 at 3:33 PM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> As mentioned in the PR, for -mavx -mno-avx2 the backend does support
> vcondv4div4df and vcondv8siv8sf optabs (while generally 32-byte vectors
> aren't much supported in that case, it is performed using
> vandps/vandnps/vorps).  The problem is that after the last generic vector
> lowering (where the VEC_COND_EXPR still compares two V4DF vectors and
> has two V4DI last operands and V4DI result and so is considered ok) fre4
> folds the condition into constant, at which point the middle-end during
> expansion will try vcond_mask_optab and fall back to trying to expand it
> as the constant vector < 0 vcondv4div4di, but neither of them is supported
> for -mavx -mno-avx2 and thus we ICE.
>
> So, the options I see is either what the following patch does, also support
> vcond_mask_v4div4di and vcond_mask_v4siv4si already for TARGET_AVX, or
> require for vcondv4div4df and vcondv8siv8sf TARGET_AVX2 rather than current
> TARGET_AVX.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> Or do you prefer the disabling of the vcond patterns instead?
>
> 2020-02-10  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/93637
>         * config/i386/sse.md (VI_256_AVX2): New mode iterator.
>         (vcond_mask_<mode><sseintvecmodelower>): Use it instead of VI_256.
>         Change condition from TARGET_AVX2 to TARGET_AVX.
>
>         * gcc.target/i386/avx-pr93637.c: New test.

LGTM as an ICE fix for stage4. We can try Richard's suggestion "later".

Thanks,
Uros.

>
> --- gcc/config/i386/sse.md.jj   2020-02-10 13:14:02.970131692 +0100
> +++ gcc/config/i386/sse.md      2020-02-10 13:15:54.343473253 +0100
> @@ -3430,13 +3430,19 @@ (define_expand "vcond_mask_<mode><avx512
>           (match_operand:<avx512fmaskmode> 3 "register_operand")))]
>    "TARGET_AVX512BW")
>
> +;; As vcondv4div4df and vcondv8siv8sf are enabled already with TARGET_AVX,
> +;; and their condition can be folded late into a constant, we need to
> +;; support vcond_mask_v4div4di and vcond_mask_v8siv8si for TARGET_AVX.
> +(define_mode_iterator VI_256_AVX2 [(V32QI "TARGET_AVX2") (V16HI 
> "TARGET_AVX2")
> +                                  V8SI V4DI])
> +
>  (define_expand "vcond_mask_<mode><sseintvecmodelower>"
> -  [(set (match_operand:VI_256 0 "register_operand")
> -       (vec_merge:VI_256
> -         (match_operand:VI_256 1 "nonimmediate_operand")
> -         (match_operand:VI_256 2 "nonimm_or_0_operand")
> +  [(set (match_operand:VI_256_AVX2 0 "register_operand")
> +       (vec_merge:VI_256_AVX2
> +         (match_operand:VI_256_AVX2 1 "nonimmediate_operand")
> +         (match_operand:VI_256_AVX2 2 "nonimm_or_0_operand")
>           (match_operand:<sseintvecmode> 3 "register_operand")))]
> -  "TARGET_AVX2"
> +  "TARGET_AVX"
>  {
>    ix86_expand_sse_movcc (operands[0], operands[3],
>                          operands[1], operands[2]);
> --- gcc/testsuite/gcc.target/i386/avx-pr93637.c.jj      2020-02-10 
> 13:19:18.212437488 +0100
> +++ gcc/testsuite/gcc.target/i386/avx-pr93637.c 2020-02-10 13:18:25.651220171 
> +0100
> @@ -0,0 +1,17 @@
> +/* PR target/93637 */
> +/* { dg-do compile } */
> +/* { dg-options "-mavx -mno-avx2 -O3 --param 
> sccvn-max-alias-queries-per-access=3" } */
> +
> +double
> +foo (void)
> +{
> +  int i;
> +  double r = 7.0;
> +  double a[] = { 0.0, 0.0, -0.0, 0.0, 0.0, -0.0, 1.0, 0.0, 0.0, -0.0, 1.0, 
> 0.0, 1.0, 1.0 };
> +
> +  for (i = 0; i < sizeof (a) / sizeof (a[0]); ++i)
> +    if (a[i] == 0.0)
> +      r = a[i];
> +
> +  return r;
> +}
>
>         Jakub
>

Reply via email to