[Bug tree-optimization/68714] [6 Regression] less folding of vector comparison

glisse at gcc dot gnu.org Wed, 02 Mar 2016 07:26:37 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68714


--- Comment #7 from Marc Glisse <glisse at gcc dot gnu.org> ---
I find it strange that we do all operations on masks and not on "booleans" for
vectors.

typedef int T;
T f(T a,T b,T c,T d){
  return (a<b)&(c<d);
}

we generate:

  _Bool _3;
  _Bool _6;
  _Bool _7;
  T _8;

  <bb 2>:
  _3 = a_1(D) < b_2(D);
  _6 = c_4(D) < d_5(D);
  _7 = _3 & _6;
  _8 = (T) _7;
  return _8;

that is, we are happy to do the bit_and on booleans. However, with

typedef int T __attribute__((vector_size(64)));

we now generate (-mavx512f):

  _3 = VEC_COND_EXPR <a_1(D) < b_2(D),
         { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 },
         { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _6 = VEC_COND_EXPR <c_4(D) < d_5(D),
         { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 },
         { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _7 = _3 & _6;
  return _7;

yielding this code:

        vpcmpgtd        %zmm0, %zmm1, %k1
        vpternlogd      $0xFF, %zmm4, %zmm4, %zmm4
        vmovdqa32       %zmm4, %zmm0{%k1}{z}
        vpcmpgtd        %zmm2, %zmm3, %k1
        vmovdqa32       %zmm4, %zmm2{%k1}{z}
        vpandd  %zmm2, %zmm0, %zmm0
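
As a reminder of what the -1/0 constants in the VEC_COND_EXPR stand for: with
the GNU C vector extensions, a vector comparison already produces an integer
vector holding -1 in each true lane and 0 in each false lane. A small sketch,
using 16-byte vectors so that it runs without AVX512; the names v4si, less_mask
and less_select are only illustrative and do not come from the bug report:

```c
/* GNU C vector extensions (GCC or Clang); 4 x int, no AVX512 required. */
typedef int v4si __attribute__((vector_size(16)));

/* The comparison itself: -1 in lanes where a < b, 0 elsewhere. */
v4si less_mask(v4si a, v4si b) {
  return a < b;
}

/* The same result written lane by lane, mirroring
   VEC_COND_EXPR <a < b, { -1, ... }, { 0, ... }>. */
v4si less_select(v4si a, v4si b) {
  v4si r = {0, 0, 0, 0};
  for (int i = 0; i < 4; i++)
    r[i] = (a[i] < b[i]) ? -1 : 0;
  return r;
}
```

Both functions are expected to return the same vector for any input, which is
why the VEC_COND_EXPR is a NOP wherever comparisons natively produce such masks.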

We perform the bit_and on the mask type, whereas it would be better to do it on
the boolean type and use 'kandw'. For most platforms, (vec_cond x -1 0) should
be a NOP, so it doesn't really matter. For the few remaining ones (AVX512 and
Sparc IIRC), we want to use "booleans" as much as possible and only convert to a
mask late. I think that implies two things: we should pull operations on masks
into operations on booleans (as in the original patch in comment #1 maybe, plus
canonicalizing (vec_cond x 0 -1)), and forwarding conditions into the first
argument of vec_cond should probably only be done late (around expand).
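
To make the two orderings concrete, here is a source-level sketch, not the
actual GIMPLE transformation. It assumes 16-byte vectors so it runs anywhere,
and the function names are hypothetical. The first function ANDs the -1/0
masks, as the current output does; the second combines per-lane booleans and
converts to a mask only at the end; the third shows the inverted select
(vec_cond x 0 -1) together with one possible canonical form, the negated
condition:

```c
/* GNU C vector extensions (GCC or Clang); 4 x int, no AVX512 required. */
typedef int v4si __attribute__((vector_size(16)));

/* AND performed on the -1/0 integer masks. */
v4si and_on_masks(v4si a, v4si b, v4si c, v4si d) {
  return (a < b) & (c < d);
}

/* AND performed on per-lane booleans, converted to a mask last. */
v4si and_on_booleans(v4si a, v4si b, v4si c, v4si d) {
  v4si r = {0, 0, 0, 0};
  for (int i = 0; i < 4; i++) {
    _Bool t = (a[i] < b[i]) & (c[i] < d[i]);
    r[i] = t ? -1 : 0;
  }
  return r;
}

/* Inverted select: 0 where a < b, -1 elsewhere. */
v4si inverted_select(v4si a, v4si b) {
  v4si r = {0, 0, 0, 0};
  for (int i = 0; i < 4; i++)
    r[i] = (a[i] < b[i]) ? 0 : -1;
  return r;
}

/* The same value expressed with the negated condition and the usual -1/0. */
v4si negated_condition(v4si a, v4si b) {
  return a >= b;
}
```

The first two functions agree lane by lane, which is what would make moving
the bit_and from masks to booleans a safe rewrite; the last two agree for
integer elements, where !(a < b) is the same as a >= b.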

But it is quite possible that my intuition is completely bogus here.
