Re: [PATCH] reduce the spinlock conflict during massive connect

Eric Dumazet Sun, 05 Nov 2017 21:28:04 -0800

On Mon, 2017-11-06 at 10:28 +0800, Liu Yu wrote:
> From: Liu Yu <allanyu...@tencent.com>
> 
> When a mount of processes connect to the same port at the same address
> simultaneously, they are likely getting the same bhash and therefore
> conflict with each other.
> 
> The more the cpu number, the worse in this case.
> 
> Use spin_trylock instead for this scene, which seems doesn't matter
> for common case.
> 
> Signed-off-by: Liu Yu <allanyu...@tencent.com>
> ---
>  net/ipv4/inet_hashtables.c |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> index e7d15fb..cc11ec7 100644
> --- a/net/ipv4/inet_hashtables.c
> +++ b/net/ipv4/inet_hashtables.c
> @@ -581,13 +581,17 @@ int __inet_hash_connect(struct inet_timewait_death_row 
> *death_row,
>  other_parity_scan:
>       port = low + offset;
>       for (i = 0; i < remaining; i += 2, port += 2) {
> +             int ret;
> +
>               if (unlikely(port >= high))
>                       port -= remaining;
>               if (inet_is_local_reserved_port(net, port))
>                       continue;
>               head = &hinfo->bhash[inet_bhashfn(net, port,
>                                                 hinfo->bhash_size)];
> -             spin_lock_bh(&head->lock);
> +             ret = spin_trylock(&head->lock);
> +             if (unlikely(!ret))
> +                     continue;
>  
>               /* Does not bother with rcv_saddr checks, because
>                * the established check is already unique enough.


This is broken.

I am pretty sure you have not really tested this patch properly.

Chances are very high that a connect() will miss slots and wont succeed,
when table is almost full.

Performance is nice, but we actually need to allocate a 4-tuple in a
more deterministic fashion.

Re: [PATCH] reduce the spinlock conflict during massive connect

Reply via email to