https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104610

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #6)
> (In reply to Hongtao.liu from comment #5)
> > (In reply to Hongtao.liu from comment #4)
> > > (In reply to Hongtao.liu from comment #3)
> > > > (In reply to Hongtao.liu from comment #2)
> > > > > in Gimple, there're
> > > > > 
> > > > >   _1 = __builtin_memcmp_eq (a_5(D), &t[0], 32);
> > > > >   _2 = _1 == 0;
> > > > >   _6 = (int) _2;
> > > > > 
> > > > > 
> > > > > So it's related to codegen optimization with vectorized codes for
> > > > > __builtin_memcmp_eq, guess we can start with size multiple of 16 
> > > > > bytes?
> > > > > 
> > > > There's no optab or target_hook for backend to participate in 
> > > > optimization
> > But there's cbranch_optab check in can_compare_p, and i386 supports
> > V8SI/V4DI/V4SI/V2DI, but not for OI/TI, adding support for them?
> > 
> > (define_expand "cbranch<mode>4"
> >   [(set (reg:CC FLAGS_REG)
> >         (compare:CC (match_operand:VI48_AVX 1 "register_operand")
> >                     (match_operand:VI48_AVX 2 "nonimmediate_operand")))
> >    (set (pc) (if_then_else
> >                (match_operator 0 "bt_comparison_operator"
> >                 [(reg:CC FLAGS_REG) (const_int 0)])
> >                (label_ref (match_operand 3))
> 
> After supporting cbranchoi4, gcc generates
> 
> _Z1fPc:
> .LFB0:
>         .cfi_startproc
>         vmovdqa .LC1(%rip), %ymm0
>         vpxor   (%rdi), %ymm0, %ymm0
>         vptest  %ymm0, %ymm0
>         sete    %al
>         vzeroupper
> 
> which is optimal as clang/llvm does.

Also extend cbranchti4 to use ptest when TARGET_SSE4_1 is enabled and the
comparison code is EQ or NE, so that gcc generates

        movdqu  (%rdi), %xmm0
        movdqa  .LC1(%rip), %xmm1
        pxor    %xmm1, %xmm0
        ptest   %xmm0, %xmm0
        sete    %al

for 

bool f128(char *a)
{
  char t[] = "012345678901234";
  return __builtin_memcmp(a, &t[0], sizeof(t)) == 0;
}

The original codegen for f128 is

        movabsq $14692989455579448, %rax
        xorq    8(%rdi), %rax
        movabsq $3978425819141910832, %rdx
        xorq    (%rdi), %rdx
        orq     %rdx, %rax
        sete    %al
        ret
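
Both sequences evaluate the same predicate: XOR the 16 bytes at a with the
16 bytes of t and test whether the result is all zero. The scalar sequence
does this with two 64-bit XORs merged by an OR; the pxor/ptest sequence does
it on one 128-bit register, since ptest sets ZF when the AND of its operands
is zero. Below is a minimal C sketch of the scalar form, for illustration
only. The names eq16_scalar and f128_model are hypothetical and not part of
GCC. The constants are loaded from the string instead of being hardcoded; on
little-endian x86-64 they are the two movabsq immediates shown above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a GCC function): scalar model of a 16-byte
   equality compare, mirroring the xorq/xorq/orq/sete sequence. */
static bool eq16_scalar(const char *a, const char *b)
{
    uint64_t a_lo, a_hi, b_lo, b_hi;

    /* memcpy is used for the unaligned 8-byte loads. */
    memcpy(&a_lo, a, 8);
    memcpy(&a_hi, a + 8, 8);
    memcpy(&b_lo, b, 8);
    memcpy(&b_hi, b + 8, 8);

    /* Each XOR is zero iff that half matches; the OR is zero iff both do. */
    return ((a_lo ^ b_lo) | (a_hi ^ b_hi)) == 0;
}

/* Same contract as f128 above: compares 16 bytes, including the NUL. */
bool f128_model(const char *a)
{
    static const char t[] = "012345678901234";
    return eq16_scalar(a, t);
}
```

As with f128, the caller must pass a buffer with at least 16 readable bytes.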
