On Thu, Mar 17, 2016 at 1:02 PM, Ilya Enkovich <enkovich....@gmail.com> wrote: > Hi, > > Current widening and narrowing vectorization functions may work > incorrectly for scalar masks because we may have different boolean > vector types having the same mode. E.g. vec<bool>(4) and vec<bool>(8) > both have QImode. That means if we need to convert vec<bool>(4) into > vec<bool>(16) we may actually find QImode->HImode conversion optab > and try to use it which is incorrect because this optab entry is > used for vec<bool>(8) to vec<bool>(16) conversion. > > I suppose the best fix for GCC 6 is to just catch and disable such > conversion by checking number of vetor elements. It doesn't disable > any vectorization because we don't have any conversion patterns > for vec<bool>(4) anyway. > > It's not clear what to do for GCC 7 though to enable such conversions. > It looks like for AVX-512 we have three boolean vectors sharing the > same QImode: vec<bool>(2), vec<bool>(4) and vec<bool>(8). It means > we can't use optabs to check operations on these vectors even > using conversion optabs instead of direct ones. Can we use half/quarter > byte modes for such masks or something like that? Another option is to > handle their conversion separately with no optabs usage at all (target > hooks?). > > The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu. > OK for trunk?
Ok. Richard. > Thanks, > Ilya > -- > gcc/ > > 2016-03-17 Ilya Enkovich <enkovich....@gmail.com> > > PR tree-optimization/70252 > * tree-vect-stmts.c (supportable_widening_operation): Check resulting > boolean vector has a proper number of elements. > (supportable_narrowing_operation): Likewise. > > gcc/testsuite/ > > 2016-03-17 Ilya Enkovich <enkovich....@gmail.com> > > PR tree-optimization/70252 > * gcc.dg/pr70252.c: New test. > > > diff --git a/gcc/testsuite/gcc.dg/pr70252.c b/gcc/testsuite/gcc.dg/pr70252.c > new file mode 100644 > index 0000000..209e691 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr70252.c > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O3" } */ > +/* { dg-additional-options "-march=skylake-avx512" { target { i?86-*-* > x86_64-*-* } } } */ > + > +extern unsigned char a [150]; > +extern unsigned char b [150]; > +extern unsigned char c [150]; > +extern unsigned char d [150]; > +extern unsigned char e [150]; > + > +void foo () { > + for (int i = 92; i <= 141; i += 2) { > + int tmp = (d [i] && b [i]) <= (a [i] > c [i]); > + e [i] = tmp >> b [i]; > + } > +} > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > index 06b1ab7..d12c062 100644 > --- a/gcc/tree-vect-stmts.c > +++ b/gcc/tree-vect-stmts.c > @@ -8940,7 +8940,12 @@ supportable_widening_operation (enum tree_code code, > gimple *stmt, > > if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) > && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) > - return true; > + /* For scalar masks we may have different boolean > + vector types having the same QImode. Thus we > + add additional check for elements number. */ > + return (!VECTOR_BOOLEAN_TYPE_P (vectype) > + || (TYPE_VECTOR_SUBPARTS (vectype) / 2 > + == TYPE_VECTOR_SUBPARTS (wide_vectype))); > > /* Check if it's a multi-step conversion that can be done using > intermediate > types. */ > @@ -8991,7 +8996,9 @@ supportable_widening_operation (enum tree_code code, > gimple *stmt, > > if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype) > && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype)) > - return true; > + return (!VECTOR_BOOLEAN_TYPE_P (vectype) > + || (TYPE_VECTOR_SUBPARTS (intermediate_type) / 2 > + == TYPE_VECTOR_SUBPARTS (wide_vectype))); > > prev_type = intermediate_type; > prev_mode = intermediate_mode; > @@ -9075,7 +9082,12 @@ supportable_narrowing_operation (enum tree_code code, > *code1 = c1; > > if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype)) > - return true; > + /* For scalar masks we may have different boolean > + vector types having the same QImode. Thus we > + add additional check for elements number. */ > + return (!VECTOR_BOOLEAN_TYPE_P (vectype) > + || (TYPE_VECTOR_SUBPARTS (vectype) * 2 > + == TYPE_VECTOR_SUBPARTS (narrow_vectype))); > > /* Check if it's a multi-step conversion that can be done using > intermediate > types. */ > @@ -9140,7 +9152,9 @@ supportable_narrowing_operation (enum tree_code code, > (*multi_step_cvt)++; > > if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype)) > - return true; > + return (!VECTOR_BOOLEAN_TYPE_P (vectype) > + || (TYPE_VECTOR_SUBPARTS (intermediate_type) * 2 > + == TYPE_VECTOR_SUBPARTS (narrow_vectype))); > > prev_mode = intermediate_mode; > prev_type = intermediate_type;