https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110452
            Bug ID: 110452
           Summary: Bad vectorization of invariant masks
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

When we have a loop like

double a[1024], b[1024], c[1024];

void foo (int flag, int n)
{
  _Bool x = flag == 3;
  for (int i = 0; i < n; ++i)
    a[i] = (x ? b[i] : c[i]) * 42.;
}

and build it with -O2 -ftree-vectorize -march=znver4 (to avoid unswitching)
we get

  _55 = _2 ? -1 : 0;
  vect_cst__56 = {_55, _55, _55, _55, _55, _55, _55, _55};

  <bb 3> [local count: 567644343]:
  # i_14 = PHI <i_11(9), 0(21)>
  # vectp_b.10_49 = PHI <vectp_b.10_50(9), &b(21)>
  # vectp_c.13_52 = PHI <vectp_c.13_53(9), &c(21)>
  # vectp_a.18_62 = PHI <vectp_a.18_63(9), &a(21)>
  # ivtmp_65 = PHI <ivtmp_66(9), 0(21)>
  vect_iftmp.12_51 = MEM <vector(8) double> [(double *)vectp_b.10_49];
  iftmp.0_9 = b[i_14];
  vect_iftmp.15_54 = MEM <vector(8) double> [(double *)vectp_c.13_52];
  iftmp.0_8 = c[i_14];
  vect_patt_13.16_59 = VEC_COND_EXPR <vect_cst__56, vect_iftmp.12_51, vect_iftmp.15_54>;
  iftmp.0_3 = _2 ? iftmp.0_9 : iftmp.0_8;

so the invariant but not constant condition _2 on the COND_EXPR is
vectorized as

  _55 = _2 ? -1 : 0;
  vect_cst__56 = {_55, _55, _55, _55, _55, _55, _55, _55};

Unfortunately, that leads to very bad generated code:

        cmpl    $3, %edi
        sete    %cl
        movl    %ecx, %esi
        leal    (%rsi,%rsi), %eax
        leal    0(,%rsi,4), %r9d
        leal    0(,%rsi,8), %r8d
        orl     %esi, %eax
        orl     %r9d, %eax
        movl    %ecx, %r9d
        orl     %r8d, %eax
        movl    %ecx, %r8d
        sall    $4, %r9d
        sall    $5, %r8d
        sall    $6, %esi
        orl     %r9d, %eax
        orl     %r8d, %eax
        movl    %ecx, %r8d
        orl     %esi, %eax
        sall    $7, %r8d
        orl     %r8d, %eax
        kmovb   %eax, %k1
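
For illustration only: the scalar sequence above appears to assemble the
8-bit lane mask by OR-ing together copies of the boolean shifted into bit
positions 0 through 7. The sketch below restates that arithmetic in C and
compares it with a plain negation of the boolean, which yields the same
all-ones or all-zeros byte. The helper names mask_by_shifts and
mask_by_negation are hypothetical and do not come from GCC; this is not a
statement about how the vectorizer does or should build the mask.

```c
#include <stdint.h>

/* Hypothetical helper: mirrors the shift-and-OR pattern seen in the
   assembly, setting one mask bit per vector lane (8 lanes of double). */
uint8_t mask_by_shifts(_Bool x)
{
    uint8_t m = 0;
    for (int lane = 0; lane < 8; ++lane)
        m |= (uint8_t)((unsigned)x << lane);
    return m;
}

/* Hypothetical helper: the same mask from a single negation.
   -(int)1 is all ones, so truncating to 8 bits gives 0xff;
   -(int)0 is 0, giving 0x00. */
uint8_t mask_by_negation(_Bool x)
{
    return (uint8_t)(-(int)x);
}
```

With x = (flag == 3), both helpers return 0xff when flag is 3 and 0x00
otherwise, so the long shift-and-OR chain carries no more information than
the one-instruction form.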