https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117000

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
             Status|UNCONFIRMED                 |ASSIGNED
            Version|unknown                     |13.3.0
   Last reconfirmed|                            |2024-10-08
     Ever confirmed|0                           |1

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
In particular we miss the fact that

  _29 = .REDUC_IOR (vect_folded_10.32_23);
  _12 = _29 == 0;

could be optimized to

  _12 = vect_folded_10.32_23 == {0, 0, ... };

It's probably too late for RTL to realize this.  A pattern in match.pd
could handle it, for example

(for cmp (eq ne)
 (simplify
  (cmp (IFN_REDUC_IOR @0) integer_zerop)
  (cmp @0 { build_zero_cst (TREE_TYPE (@0)); } )))

With this pattern the generated code is

_Z5test1RK4U256:
.LFB5:
        .cfi_startproc
        movdqu  (%rdi), %xmm0
        movdqu  16(%rdi), %xmm1
        por     %xmm1, %xmm0
        ptest   %xmm0, %xmm0
        sete    %al
        ret

_Z5test2RK4U256:
.LFB6:
        .cfi_startproc
        movdqu  16(%rdi), %xmm0
        movdqu  (%rdi), %xmm1
        por     %xmm1, %xmm0
        ptest   %xmm0, %xmm0
        sete    %al
        ret
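
The pattern relies on .REDUC_IOR (v) == 0 holding exactly when every
element of v is zero.  A minimal scalar sketch of that equivalence follows.
The U256 layout and both function names are assumptions made for
illustration; the actual test case source is not part of this comment.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 256-bit value type (four 64-bit limbs). The real U256
// definition from the bug report is not shown in this comment.
struct U256 {
    std::array<std::uint64_t, 4> limbs;
};

// Scalar analogue of `.REDUC_IOR (v) == 0`:
// OR all limbs together, then compare the single result with zero.
bool is_zero_via_or_reduction(const U256 &x) {
    std::uint64_t acc = 0;
    for (std::uint64_t limb : x.limbs)
        acc |= limb;
    return acc == 0;
}

// Scalar analogue of `v == {0, 0, ... }`:
// every limb must individually compare equal to zero.
bool is_zero_via_elementwise_compare(const U256 &x) {
    for (std::uint64_t limb : x.limbs)
        if (limb != 0)
            return false;
    return true;
}
```

The two functions agree for every input because an OR of several values
is zero only if each value is zero, which is why the reduction can be
dropped when its result is only compared against zero.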
