https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103948
            Bug ID: 103948
           Summary: Vectorizer does not use vec_cmpMN without vcondMN pattern
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

I was trying to add a v2qi vec_cmpv2qiv2qi pattern to x86:

(define_expand "vec_cmpv2qiv2qi"
  [(set (match_operand:V2QI 0 "register_operand")
        (match_operator:V2QI 1 ""
          [(match_operand:V2QI 2 "register_operand")
           (match_operand:V2QI 3 "register_operand")]))]
  "TARGET_SSE2"
{
  bool ok = ix86_expand_int_vec_cmp (operands);
  gcc_assert (ok);
  DONE;
})

but the vectorizer does not consider the above pattern *unless*
vcondv2qiv2qi is also present:

(define_expand "vcondv2qiv2qi"
  [(set (match_operand:V2QI 0 "register_operand")
        (if_then_else:V2QI
          (match_operator 3 ""
            [(match_operand:V2QI 4 "register_operand")
             (match_operand:V2QI 5 "register_operand")])
          (match_operand:V2QI 1)
          (match_operand:V2QI 2)))]
  "TARGET_SSE4_1")

As shown above, the vcond pattern does not need to expand to anything; it
just needs to be present.

So the following testcase:

--cut here--
typedef signed char vec __attribute__((vector_size(2)));

vec lt (vec a, vec b) { return a < b; }
--cut here--

vectorizes with -msse4 and fails to vectorize with -msse2.

Looking a bit into tree-vect-generic.c, in expand_vector_comparison we do:

/* Try to expand vector comparison expression OP0 CODE OP1 by
   querying optab if the following expression:
	VEC_COND_EXPR< OP0 CODE OP1, {-1,...}, {0,...}>
   can be expanded.  */

but apparently only via the vcondMN optab. According to the documentation,
vec_cmpMN does exactly the above:

'vec_cmpMN'
     Output a vector comparison.  Operand 0 of mode N is the destination
     for predicate in operand 1 which is a signed vector comparison with
     operands of mode M in operands 2 and 3.  Predicate is computed by
     element-wise evaluation of the vector comparison with a truth value
     of all-ones and a false value of all-zeros.
so the support check should query the vec_cmpMN optab (and vec_cmpeqMN) in
addition to the vcondMN optab.

I'll attach the complete patch to illustrate the issue on x86_64.
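
For reference, the equivalence that the report relies on can be checked at the source level. The following is only a sketch, not part of the attached patch: lt is the function from the testcase, while lt_ref and check_lt are hypothetical helpers added here for illustration. lt_ref spells out VEC_COND_EXPR <a < b, {-1,...}, {0,...}> lane by lane, which is the same all-ones/all-zeros result that the vec_cmpMN documentation describes.

```c
#include <assert.h>

/* Same 2-byte vector type as in the testcase (GNU C vector extension).  */
typedef signed char vec __attribute__((vector_size(2)));

/* Function from the testcase: each result lane is -1 (all-ones) where
   the comparison holds and 0 (all-zeros) where it does not.  */
vec
lt (vec a, vec b)
{
  return a < b;
}

/* Hypothetical scalar reference: the lane-by-lane meaning of
   VEC_COND_EXPR <a < b, {-1,...}, {0,...}>.  */
vec
lt_ref (vec a, vec b)
{
  vec r;
  for (int i = 0; i < 2; i++)
    r[i] = a[i] < b[i] ? -1 : 0;
  return r;
}

/* Hypothetical helper: assert that both forms agree in every lane.  */
void
check_lt (vec a, vec b)
{
  vec x = lt (a, b);
  vec y = lt_ref (a, b);
  for (int i = 0; i < 2; i++)
    assert (x[i] == y[i]);
}
```

Both forms compute the same value regardless of which optab is used, so the difference described in this report shows up only in the generated code for lt when comparing -msse2 against -msse4, not in the result.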