https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117000

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
             Status|UNCONFIRMED                 |ASSIGNED
            Version|unknown                     |13.3.0
   Last reconfirmed|                            |2024-10-08
     Ever confirmed|0                           |1

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
In particular we miss the fact that

  _29 = .REDUC_IOR (vect_folded_10.32_23);
  _12 = _29 == 0;

could be optimized to

  _12 = vect_folded_10.32_23 == {0, 0, ... };

it's probably too late for RTL to realize this.  Some pattern in match.pd
could handle this, like

(for cmp (eq ne)
 (simplify
  (cmp (IFN_REDUC_IOR @0) integer_zerop)
  (cmp @0 { build_zero_cst (TREE_TYPE (@0)); } )))

results in

_Z5test1RK4U256:
.LFB5:
        .cfi_startproc
        movdqu  (%rdi), %xmm0
        movdqu  16(%rdi), %xmm1
        por     %xmm1, %xmm0
        ptest   %xmm0, %xmm0
        sete    %al
        ret

_Z5test2RK4U256:
.LFB6:
        .cfi_startproc
        movdqu  16(%rdi), %xmm0
        movdqu  (%rdi), %xmm1
        por     %xmm1, %xmm0
        ptest   %xmm0, %xmm0
        sete    %al
        ret
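
For readers less familiar with the GIMPLE above, the identity the proposed
pattern relies on can be sketched in plain C++: the bitwise OR of all
elements is zero exactly when every element is zero. This is only an
illustration, not the testcase from the report. The U256 layout (four
64-bit limbs) and the function names is_zero_via_reduction and
is_zero_elementwise are assumptions made for this sketch.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for the report's U256; the real definition is not
// shown in this comment, so four 64-bit limbs are assumed here.
struct U256 {
    std::array<std::uint64_t, 4> limbs;
};

// OR-reduce all limbs, then compare the scalar result with zero.
// This mirrors the shape ".REDUC_IOR (vec) == 0".
bool is_zero_via_reduction(const U256 &v) {
    std::uint64_t acc = 0;
    for (std::uint64_t limb : v.limbs)
        acc |= limb;
    return acc == 0;
}

// Compare every limb with zero directly.
// This mirrors the shape "vec == {0, 0, ...}" read as an all-lanes test.
bool is_zero_elementwise(const U256 &v) {
    for (std::uint64_t limb : v.limbs)
        if (limb != 0)
            return false;
    return true;
}
```

Both functions return the same value for every input, which is why
replacing the reduction with a whole-vector comparison against zero
preserves behavior. The ne case is the negation of the same identity.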