On Sat, Mar 25, 2017 at 07:41:17PM +0300, Alexey Dobriyan wrote:
> The current addr4_match() code has a special test for /0 prefixes
> because the C standard makes shifting a 32-bit value by 32 bits
> undefined behaviour. However, the test can be omitted on 64-bit
> because the shift can be done within a 64-bit register and then
> truncated to the expected value (which is a mask of 0).
> 
> The implicit truncation by htonl() fits nicely into the
> R32-within-R64 model on x86-64.
> 
> Space savings: none (coincidence)
> Branch savings: 1
> 
> Before:
> 
>       movzx  eax,BYTE PTR [rdi+0x2a]          # ->prefixlen_d
>       test   al,al
>       jne    xfrm_selector_match + 0x23f
>               ...
>       movzx  eax,BYTE PTR [rbx+0x2b]          # ->prefixlen_s
>       test   al,al
>       je     xfrm_selector_match + 0x1c7
> 
> After (no branches):
> 
>       mov    r8d,0x20
>       mov    rdx,0xffffffffffffffff
>       mov    esi,DWORD PTR [rsi+0x2c]
>       mov    ecx,r8d
>       sub    cl,BYTE PTR [rdi+0x2a]
>       xor    esi,DWORD PTR [rbx]
>       mov    rdi,rdx
>       xor    eax,eax
>       shl    rdi,cl
>       bswap  edi
> 
> Signed-off-by: Alexey Dobriyan <adobri...@gmail.com>

Also applied to ipsec-next, thanks for the patches Alexey!
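
For readers following along, here is a minimal user-space sketch of the
idea described in the quoted changelog. It is not the patch itself: the
name addr4_match_sketch() is hypothetical, the fixed-width types stand in
for the kernel's, and it assumes prefixlen is never larger than 32.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htonl() */

/*
 * Hypothetical helper, not the kernel function. Both addresses are
 * expected in network byte order. Assumes prefixlen <= 32.
 */
static bool addr4_match_sketch(uint32_t a1, uint32_t a2, uint8_t prefixlen)
{
	/*
	 * Shift in 64 bits: the count (32 - prefixlen) is at most 32,
	 * which is well defined for a 64-bit operand.
	 */
	uint64_t wide = ~UINT64_C(0) << (32 - prefixlen);

	/*
	 * Truncate to 32 bits. For prefixlen == 0 the low half is all
	 * zeroes, so the mask is 0 and every pair of addresses matches.
	 */
	uint32_t mask = htonl((uint32_t)wide);

	return ((a1 ^ a2) & mask) == 0;
}
```

With this shape, a /0 prefix needs no separate test: the mask comes out
as 0 and the comparison is trivially true, which is what the removed
branch used to short-circuit.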
