https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948

            Bug ID: 103948
           Summary: Vectorizer does not use vec_cmpMN without vcondMN
                    pattern
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

I was trying to add a V2QI vec_cmpv2qiv2qi pattern to x86:

(define_expand "vec_cmpv2qiv2qi"
  [(set (match_operand:V2QI 0 "register_operand")
        (match_operator:V2QI 1 ""
          [(match_operand:V2QI 2 "register_operand")
           (match_operand:V2QI 3 "register_operand")]))]
  "TARGET_SSE2"
{
  bool ok = ix86_expand_int_vec_cmp (operands);
  gcc_assert (ok);
  DONE;
})

but the vectorizer does not consider the above pattern *unless* vcondv2qiv2qi
is also present:

(define_expand "vcondv2qiv2qi"
  [(set (match_operand:V2QI 0 "register_operand")
        (if_then_else:V2QI
          (match_operator 3 ""
            [(match_operand:V2QI 4 "register_operand")
             (match_operand:V2QI 5 "register_operand")])
          (match_operand:V2QI 1)
          (match_operand:V2QI 2)))]
  "TARGET_SSE4_1")

As shown above, the vcond pattern does not need to expand to anything; it just
needs to be present.

So the following testcase:

--cut here--
typedef signed char vec __attribute__((vector_size(2)));

vec lt (vec a, vec b) { return a < b; }
--cut here--

vectorizes with -msse4 and fails to vectorize with -msse2.

Looking a bit into tree-vect-generic.c, in expand_vector_comparison we do:

/* Try to expand vector comparison expression OP0 CODE OP1 by
   querying optab if the following expression:
        VEC_COND_EXPR< OP0 CODE OP1, {-1,...}, {0,...}>
   can be expanded.  */

but apparently only via the vcondMN optab.

According to the documentation, vec_cmpMN does exactly the above:

'vec_cmpMN'
     Output a vector comparison.  Operand 0 of mode N is the destination
     for predicate in operand 1 which is a signed vector comparison with
     operands of mode M in operands 2 and 3.  Predicate is computed by
     element-wise evaluation of the vector comparison with a truth value
     of all-ones and a false value of all-zeros.

so this code should query the vec_cmpMN optab (and vec_cmpeqMN) in addition to
the vcondMN optab.
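
To make the equivalence concrete, here is a minimal sketch in plain C using the
GCC vector extensions. It compares the testcase's lane-wise "a < b" against a
scalar reference that selects -1 or 0 per element, which is what
VEC_COND_EXPR< OP0 CODE OP1, {-1,...}, {0,...}> describes. The names lt_ref and
lt_matches_ref are made up for this illustration and are not part of GCC or of
the attached patch. The sketch only checks the result values; it says nothing
about which optab the compiler ends up using.

```c
#include <string.h>

typedef signed char vec __attribute__((vector_size(2)));

/* The testcase from this report: lane-wise a < b. */
vec lt (vec a, vec b) { return a < b; }

/* Scalar reference for VEC_COND_EXPR< a < b, {-1,...}, {0,...}>
   (hypothetical helper, for illustration only).  */
vec lt_ref (vec a, vec b)
{
  vec r;
  for (int i = 0; i < 2; i++)
    r[i] = (a[i] < b[i]) ? -1 : 0;   /* all-ones for true, all-zeros for false */
  return r;
}

/* Hypothetical helper: returns 1 when lt and lt_ref agree on every lane.  */
int lt_matches_ref (signed char a0, signed char a1,
                    signed char b0, signed char b1)
{
  vec a = { a0, a1 };
  vec b = { b0, b1 };
  vec x = lt (a, b);
  vec y = lt_ref (a, b);
  return memcmp (&x, &y, sizeof x) == 0;
}
```

Both forms should agree for any inputs, so a pattern that implements one can in
principle serve the other.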

I'll attach the complete patch to illustrate the issue on x86_64.
