On Thu, Mar 17, 2016 at 1:02 PM, Ilya Enkovich <enkovich....@gmail.com> wrote:
> Hi,
>
> Current widening and narrowing vectorization functions may work
> incorrectly for scalar masks because we may have different boolean
> vector types having the same mode.  E.g. vec<bool>(4) and vec<bool>(8)
> both have QImode.  That means if we need to convert vec<bool>(4) into
> vec<bool>(16) we may actually find the QImode->HImode conversion optab
> and try to use it, which is incorrect because this optab entry is
> meant for the vec<bool>(8) to vec<bool>(16) conversion.
>
> I suppose the best fix for GCC 6 is to just catch and disable such
> conversions by checking the number of vector elements.  This doesn't
> disable any vectorization because we don't have any conversion patterns
> for vec<bool>(4) anyway.
>
> It's not clear what to do for GCC 7 though to enable such conversions.
> It looks like for AVX-512 we have three boolean vectors sharing the
> same QImode: vec<bool>(2), vec<bool>(4) and vec<bool>(8).  It means
> we can't use optabs to check operations on these vectors even
> using conversion optabs instead of direct ones.  Can we use half/quarter
> byte modes for such masks or something like that?  Another option is to
> handle their conversion separately with no optabs usage at all (target hooks?).
>
> The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.
> OK for trunk?

Ok.

Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2016-03-17  Ilya Enkovich  <enkovich....@gmail.com>
>
>         PR tree-optimization/70252
>         * tree-vect-stmts.c (supportable_widening_operation): Check resulting
>         boolean vector has a proper number of elements.
>         (supportable_narrowing_operation): Likewise.
>
> gcc/testsuite/
>
> 2016-03-17  Ilya Enkovich  <enkovich....@gmail.com>
>
>         PR tree-optimization/70252
>         * gcc.dg/pr70252.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.dg/pr70252.c b/gcc/testsuite/gcc.dg/pr70252.c
> new file mode 100644
> index 0000000..209e691
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr70252.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +/* { dg-additional-options "-march=skylake-avx512" { target { i?86-*-* x86_64-*-* } } } */
> +
> +extern unsigned char a [150];
> +extern unsigned char b [150];
> +extern unsigned char c [150];
> +extern unsigned char d [150];
> +extern unsigned char e [150];
> +
> +void foo () {
> +  for (int i = 92; i <= 141; i += 2) {
> +    int tmp = (d [i] && b [i]) <= (a [i] > c [i]);
> +    e [i] = tmp >> b [i];
> +  }
> +}
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 06b1ab7..d12c062 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -8940,7 +8940,12 @@ supportable_widening_operation (enum tree_code code, gimple *stmt,
>
>    if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
>        && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
> -    return true;
> +      /* For scalar masks we may have different boolean
> +        vector types having the same QImode.  Thus we
> +        add additional check for elements number.  */
> +    return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +           || (TYPE_VECTOR_SUBPARTS (vectype) / 2
> +               == TYPE_VECTOR_SUBPARTS (wide_vectype)));
>
>    /* Check if it's a multi-step conversion that can be done using intermediate
>       types.  */
> @@ -8991,7 +8996,9 @@ supportable_widening_operation (enum tree_code code, gimple *stmt,
>
>        if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
>           && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
> -       return true;
> +       return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +               || (TYPE_VECTOR_SUBPARTS (intermediate_type) / 2
> +                   == TYPE_VECTOR_SUBPARTS (wide_vectype)));
>
>        prev_type = intermediate_type;
>        prev_mode = intermediate_mode;
> @@ -9075,7 +9082,12 @@ supportable_narrowing_operation (enum tree_code code,
>    *code1 = c1;
>
>    if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
> -    return true;
> +    /* For scalar masks we may have different boolean
> +       vector types having the same QImode.  Thus we
> +       add additional check for elements number.  */
> +    return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +           || (TYPE_VECTOR_SUBPARTS (vectype) * 2
> +               == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
>
>    /* Check if it's a multi-step conversion that can be done using intermediate
>       types.  */
> @@ -9140,7 +9152,9 @@ supportable_narrowing_operation (enum tree_code code,
>        (*multi_step_cvt)++;
>
>        if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
> -       return true;
> +       return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +               || (TYPE_VECTOR_SUBPARTS (intermediate_type) * 2
> +                   == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
>
>        prev_mode = intermediate_mode;
>        prev_type = intermediate_type;
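
For readers following the thread, the mode collision described in the quoted message can be sketched outside of GCC.  This is only a toy model under stated assumptions: the toy_* names, the enum and the single QI->HI pack pattern are hypothetical and are not GCC's real machine_mode, optab or tree APIs.  It only illustrates why a lookup keyed on modes alone accepts a 4-element mask where an 8-element one is expected, and how an element-count check in the spirit of the patch rejects it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for integer machine modes; NOT GCC's enum machine_mode.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI };

/* Hypothetical scalar-mask boolean vector: one bit per element.  */
struct toy_bool_vec { unsigned nelts; };

/* Smallest toy integer mode holding one bit per element.  Masks with
   2, 4 and 8 elements all land in TOY_QI, which is the collision.  */
static enum toy_mode
toy_mask_mode (struct toy_bool_vec v)
{
  assert (v.nelts > 0 && v.nelts <= 64);
  if (v.nelts <= 8)
    return TOY_QI;
  if (v.nelts <= 16)
    return TOY_HI;
  if (v.nelts <= 32)
    return TOY_SI;
  return TOY_DI;
}

/* Mode-keyed query, standing in for an optab lookup.  Assumption: the
   target has exactly one pack pattern, QI -> HI, intended to combine
   two 8-element masks into one 16-element mask.  */
static bool
toy_pack_pattern_exists (enum toy_mode from, enum toy_mode to)
{
  return from == TOY_QI && to == TOY_HI;
}

/* Mode match only: cannot tell a 4-element mask from an 8-element one.  */
static bool
toy_narrowing_ok_unguarded (struct toy_bool_vec in, struct toy_bool_vec out)
{
  return toy_pack_pattern_exists (toy_mask_mode (in), toy_mask_mode (out));
}

/* Mode match plus an element-count check: a pack step must exactly
   double the number of elements.  Every vector in this model is a
   boolean mask, so the check is applied unconditionally here.  */
static bool
toy_narrowing_ok_guarded (struct toy_bool_vec in, struct toy_bool_vec out)
{
  return toy_narrowing_ok_unguarded (in, out)
         && in.nelts * 2 == out.nelts;
}
```

In this model a 4-element to 16-element request passes the unguarded query and fails the guarded one, while 8 to 16 passes both.  The real patch applies the extra comparison only when VECTOR_BOOLEAN_TYPE_P holds, so non-mask vectors keep the old behaviour.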
