On Fri, May 25, 2018 at 12:07:33PM +0800, wei.guo.si...@gmail.com wrote:
>  _GLOBAL(memcmp)
>       cmpdi   cr1,r5,0
>  
> -     /* Use the short loop if both strings are not 8B aligned */
> -     or      r6,r3,r4
> +     /* Use the short loop if the src/dst addresses are not
> +      * with the same offset of 8 bytes align boundary.
> +      */
> +     xor     r6,r3,r4
>       andi.   r6,r6,7
>  
> -     /* Use the short loop if length is less than 32B */
> -     cmpdi   cr6,r5,31
> +     /* Fall back to short loop if compare at aligned addrs
> +      * with less than 8 bytes.
> +      */
> +     cmpdi   cr6,r5,7
>  
>       beq     cr1,.Lzero
> -     bne     .Lshort
> -     bgt     cr6,.Llong
> +     bgt     cr6,.Lno_short

If this doesn't use cr0 anymore, you can do  rlwinm r6,r6,0,7  instead of
andi. r6,r6,7 .
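
For reference, the or -> xor change in the hunk above switches the test
from "both pointers are 8B aligned" to "both pointers have the same offset
within an 8B block".  A rough C sketch of the two predicates (the function
names are made up for illustration, they are not from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Old test: or r6,r3,r4 ; andi. r6,r6,7
 * Zero only when both addresses are 8-byte aligned. */
static int both_aligned8(uintptr_t a, uintptr_t b)
{
	return ((a | b) & 7) == 0;
}

/* New test: xor r6,r3,r4 ; andi. r6,r6,7
 * Zero when both addresses share the same offset within an 8-byte block,
 * even if neither of them is aligned. */
static int same_offset8(uintptr_t a, uintptr_t b)
{
	return ((a ^ b) & 7) == 0;
}
```

So 0x1003 and 0x200b (both at offset 3) pass the new test but not the old
one, and anything that passed the old test still passes the new one.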

> +.Lsameoffset_8bytes_make_align_start:
> +     /* attempt to compare bytes not aligned with 8 bytes so that
> +      * rest comparison can run based on 8 bytes alignment.
> +      */
> +     andi.   r6,r3,7
> +
> +     /* Try to compare the first double word which is not 8 bytes aligned:
> +      * load the first double word at (src & ~7UL) and shift left appropriate
> +      * bits before comparision.
> +      */
> +     clrlwi  r6,r3,29
> +     rlwinm  r6,r6,3,0,28

Those last two lines are together just
  rlwinm r6,r3,3,0x38
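
A small C model of rlwinm can be used to check this: the clrlwi/rlwinm
pair computes (r3 & 7) << 3, so the mask that keeps the three shifted
offset bits is 0x38 (bits 26..28 in IBM numbering).  This is only a sketch;
the helper names are hypothetical and it models the low 32 bits only.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n <= 31). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
	n &= 31;
	return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Model of the 4-operand form "rlwinm ra,rs,sh,mask": rotate, then AND. */
static uint32_t rlwinm_mask(uint32_t rs, unsigned sh, uint32_t mask)
{
	return rotl32(rs, sh) & mask;
}

/* Mask with bits mb..me set, IBM numbering (bit 0 is the MSB), mb <= me. */
static uint32_t mask_mb_me(unsigned mb, unsigned me)
{
	uint32_t from_mb = 0xffffffffu >> mb;		/* bits mb..31 */
	uint32_t to_me = 0xffffffffu << (31 - me);	/* bits 0..me */
	return from_mb & to_me;
}

/* The quoted pair: clrlwi r6,r3,29 ; rlwinm r6,r6,3,0,28 */
static uint32_t two_insns(uint32_t r3)
{
	uint32_t r6 = r3 & mask_mb_me(29, 31);		/* clrlwi r6,r3,29 */
	return rlwinm_mask(r6, 3, mask_mb_me(0, 28));	/* rlwinm r6,r6,3,0,28 */
}

/* The single-instruction form: rlwinm r6,r3,3,0x38 */
static uint32_t one_insn(uint32_t r3)
{
	return rlwinm_mask(r3, 3, 0x38);
}

/* Returns 1 if both forms agree for every low-byte pattern of r3. */
static int sequences_agree(void)
{
	for (uint32_t lo = 0; lo < 256; lo++) {
		uint32_t r3 = 0xdeadbe00u | lo;
		if (two_insns(r3) != one_insn(r3))
			return 0;
	}
	return 1;
}
```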

> +     subfc.  r5,r6,r5

Why subfc?  You don't use the carry.

> +     rlwinm  r6,r6,3,0,28

That's
  slwi r6,r6,3

> +     bgt     cr0,8f
> +     li      r3,-1
> +8:
> +     blr

  bgtlr
  li r3,-1
  blr
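
In C terms, the quoted branch-over-li sequence and a conditional return
taken on the same condition (cr0 gt) behave identically.  A minimal sketch,
with cr0_gt standing in for the gt bit of cr0 (an assumption made for the
model, as are the function names):

```c
#include <assert.h>

/* Quoted sequence: bgt cr0,8f ; li r3,-1 ; 8: blr */
static long branch_over(long r3, int cr0_gt)
{
	if (cr0_gt)
		goto out;	/* bgt cr0,8f */
	r3 = -1;		/* li r3,-1 */
out:
	return r3;		/* 8: blr */
}

/* Conditional-return form: bgtlr ; li r3,-1 ; blr */
static long cond_return(long r3, int cr0_gt)
{
	if (cr0_gt)
		return r3;	/* bgtlr: return with r3 untouched */
	r3 = -1;		/* li r3,-1 */
	return r3;		/* blr */
}
```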

(and more of the same things elsewhere).


Segher
