https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70251
--- Comment #10 from rguenther at suse dot de <rguenther at suse dot de> --- On Sat, 19 Mar 2016, glisse at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70251 > > --- Comment #9 from Marc Glisse <glisse at gcc dot gnu.org> --- > I don't find the current situation very consistent. We seem to consider that > in > gimple, the output of a vector comparison can only be used directly in a > vec_cond_expr (possibly also {and,ior,xor,bit_not}_expr?), except in the very > specific case where we apply this transformation a+(b?1:0) -> a-VCE(b), where > we only have a view_convert_expr. True. OTOH VIEW_CONVERT_EXPR is quite a special beast. > Unless we start transforming x?-1:0 into > view_convert_expr(x) under the same condition (VECTOR_MODE_P(...)), it doesn't > make sense to me. I think that would be a valid transform. > I am also not convinced that VECTOR_MODE_P is the right test. > If you are using a long vector (say of 2048 bits), the mode will never be a > vector mode, so at best the transformation is delayed until after vector > lowering. True, I was also thinking that a TYPE_SIZE check would be better to verify the validity of the V_C_E. > By the way, nothing seems to combine the following into a single > statement: > c_3 = VEC_COND_EXPR <x_1(D) < y_2(D), { -1, -1, -1, -1, -1, -1, -1, -1, -1, > -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, > -1, > -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>; > _4 = VEC_COND_EXPR <c_3 != { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, { 1, 1, 1, 1, 1, 1, 1, 1, 1, > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }, { 0, 0, > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0 }>; > (it does get combined in forwprop4 after the vector operation is lowered to > scalar operations (yes, we are missing the code to use sub-vectors here)) This means we are missing a simplifier for (for cnd (cond vec_cond) (for cmp (ne eq) (cmp (cnd @0 CONSTANT_CLASS_P@1 CONSTANT_CLASS_P@2) ... possibly also for the scalar COND_EXPR case. > From a code generation point of view: > > typedef int vec __attribute__((vector_size(32))); > vec f(vec x,vec y,vec z){ > vec zero={}; > vec one=zero+1; > vec c=x<y; > return z+(c?one:zero); > } > > -mavx2 generates > > vpcmpgtd %ymm0, %ymm1, %ymm1 > vpsubd %ymm1, %ymm2, %ymm0 > > while -march=skylake-avx512 gives the inferior > > vpcmpgtd %ymm0, %ymm1, %ymm1 > vpandd .LC0(%rip), %ymm1, %ymm0 > vpaddd %ymm2, %ymm0, %ymm0 > > > > (In reply to rguent...@suse.de from comment #5) > > On Wed, 16 Mar 2016, glisse at gcc dot gnu.org wrote: > > > A + (B vcmp C ? 1 : 0) -> A - (B vcmp C ? -1 : 0) > > I think that would be an odd transform. > > As far as I understand, since the bool vector changes, this is currently the > proper gimple syntax for this transformation, with the added bonus that it > doesn't need to be disabled for avx512. Sure, but nowhere else we transform a + into a - operation(?) This is what I mean with "odd transform". We'd get this motivated only by (B vcmp C ? -1 : 0) being "canonical" for target expansion, right? > (I got "incorrect sharing of tree nodes" on the first argument of > vec_cond_expr > when I wanted to try it out, the match-simplify machinery should probably do > the unshare_expr automatically, especially since gimplify now wants to keep > the > comparison inside the vec_cond_expr) Ah, yes, that's sth we need to fix. Care to share the pattern/testcase that triggers this? The place to fix seems a bit non-obvious. Richard.