Hi,

On Mon, Jun 24, 2019 at 01:44:31PM -0700, Stephen Hemminger wrote:
> Using bit operations like or and xor is faster than a loop
> on all architectures. Really just explicit unrolling.
>
> Similar cast to uint16 unaligned is already done in
> other functions here.
>
> Signed-off-by: Stephen Hemminger <step...@networkplumber.org>
> Reviewed-by: Andrew Rybchenko <arybche...@solarflare.com>
> ---
>  lib/librte_net/rte_ether.h | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/lib/librte_net/rte_ether.h b/lib/librte_net/rte_ether.h
> index 8edc7e217b25..feb35a33c94b 100644
> --- a/lib/librte_net/rte_ether.h
> +++ b/lib/librte_net/rte_ether.h
> @@ -81,11 +81,10 @@ struct rte_ether_addr {
>  static inline int rte_is_same_ether_addr(const struct rte_ether_addr *ea1,
>  		const struct rte_ether_addr *ea2)
>  {
> -	int i;
> -	for (i = 0; i < RTE_ETHER_ADDR_LEN; i++)
> -		if (ea1->addr_bytes[i] != ea2->addr_bytes[i])
> -			return 0;
> -	return 1;
> +	const unaligned_uint16_t *w1 = (const uint16_t *)ea1;
> +	const unaligned_uint16_t *w2 = (const uint16_t *)ea2;
> +
> +	return ((w1[0] ^ w2[0]) | (w1[1] ^ w2[1]) | (w1[2] ^ w2[2])) == 0;
>  }
>
>  /**
> @@ -100,11 +99,9 @@ static inline int rte_is_same_ether_addr(const struct rte_ether_addr *ea1,
>   */
>  static inline int rte_is_zero_ether_addr(const struct rte_ether_addr *ea)
>  {
> -	int i;
> -	for (i = 0; i < RTE_ETHER_ADDR_LEN; i++)
> -		if (ea->addr_bytes[i] != 0x00)
> -			return 0;
> -	return 1;
> +	const unaligned_uint16_t *w = (const uint16_t *)ea;
> +
> +	return (w[0] | w[1] | w[2]) == 0;
>  }
>
>  /**
I wonder if using memcmp() isn't just as fast with recent compilers (gcc >= 7). I tried it quickly, and the generated code looks good (no call to memcmp()): https://godbolt.org/z/9MOL7g

It would also avoid the use of unaligned_uint16_t, and the need for the next patch that adds the alignment constraint. A rough sketch of what I mean is below.
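For illustration only, an untested sketch of the memcmp() variants (zero_addr is just an example name I introduced here; <string.h> would need to be included):

	#include <string.h>

	/* Compare the two addresses byte by byte; per the godbolt link above,
	 * recent compilers expand this inline with no library call. */
	static inline int rte_is_same_ether_addr(const struct rte_ether_addr *ea1,
			const struct rte_ether_addr *ea2)
	{
		return memcmp(ea1, ea2, RTE_ETHER_ADDR_LEN) == 0;
	}

	/* Check for the all-zero address by comparing against a zeroed constant. */
	static inline int rte_is_zero_ether_addr(const struct rte_ether_addr *ea)
	{
		static const struct rte_ether_addr zero_addr = { {0, 0, 0, 0, 0, 0} };

		return memcmp(ea, &zero_addr, RTE_ETHER_ADDR_LEN) == 0;
	}

If the generated code really is equivalent, the hand-unrolled xor/or version wouldn't be needed, and the casts go away entirely.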