Re: 11.0 stuck on high network load

Julien Charbon Tue, 11 Oct 2016 00:21:07 -0700

 Hi Slawa,

On 10/10/16 7:35 PM, Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 05:44:21PM +0200, Julien Charbon wrote:
>>>> can check the current other usages of goto findpcb in tcp_input().  The
>>>> rational here being:
>>>>
>>>>  - Behavior before the patch:  If the inp we found was deleted then goto
>>>> findpcb.
>>>>  - Behavior after the patch:  If the inp we found was deleted or dropped
>>>> then goto findpcb.
>>>>
>>>>  I just prefer having the same behavior applied everywhere:  If
>>>> tcp_input() loses the inp lock race and the inp was deleted or dropped
>>>> then retry to find a new inpcb to deliver to.
>>>>
>>>>  But you are right dropping the packet here will also fix the issue.
>>>>
>>>>  Then the review process becomes quite helpful because people can argue:
>>>>  Dropping here is better because "blah", or goto findpcb is better
>>>> because "bluh", etc.  And at the review end you have a nice final patch.
>>>>
>>>> https://reviews.freebsd.org/D8211
>>>
>>> I am not sure, I am see to
>>>
>>> sys/netinet/in_pcb.h:#define    INP_DROPPED             0x04000000 /*
>> protocol drop flag */
>>>
>>> and think this is a flag 'all packets must be droped'
>>
>>  Hm, I believe this flag means "this inp has been dropped by the TCP
>> stack, so don't use it anymore".  Actually this flag is better described
>> in the function that sets it:
>>
>> "(INP_DROPPED) is used by TCP to mark an inpcb as unused and avoid
>> future packet delivery or event notification when a socket remains open
>> but TCP has closed."
>>
>> https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1320
>>
>> /*
>>  * in_pcbdrop() removes an inpcb from hashed lists, releasing its
>> address and
>>  * port reservation, and preventing it from being returned by inpcb lookups.
>>  *
>>  * It is used by TCP to mark an inpcb as unused and avoid future packet
>>  * delivery or event notification when a socket remains open but TCP has
>>  * closed.  This might occur as a result of a shutdown()-initiated TCP close
>>  * or a RST on the wire, and allows the port binding to be reused while
>> still
>>  * maintaining the invariant that so_pcb always points to a valid inpcb
>> until
>>  * in_pcbdetach().
>>  *
>>  */
>> void
>> in_pcbdrop(struct inpcb *inp)
>> {
>>   inp->inp_flags |= INP_DROPPED;
>>   ...
>>
>>  The classical example where "goto findpcb" is useful:  You receive a
>> new connection request with a TCP SYN packet and this packet is unlucky
>> and reached a inp being dropped:
>>
>>  - with "goto findpcb" approach, the next lookup will most likely find
>> the LISTEN inp and start the TCP hand-shake as usual
>>  - with "drop the packet" approach, the TCP client will need to
>> re-transmit a TCP SYN packet
>>
>>  It is not because a packet was unlucky once that it deserves to be
>> dropped. :)
> 
> Thanks for explaining, very helpfull.
> In this situation (TCP SYN with same 4-tuple as existing socket)
> allocate new PCB is best. But for this we must destroy current PCB. I
> am think INP_WUNLOCK(inp) don't destroy it and in_pcblookup_mbuf find
> it again (I am think in_pcblookup_mbuf find this PCB on first turn).
> I am assume for classical example in_pcbrele_wlocked(inp) free and
> destroy current PCB for possibility in_pcblookup_mbuf allocate new
> one.


 Astute question:  Here, the same inp cannot be find again by
in_pcblookup_mbuf(), the explanation is a bit long though:

in_pcbdrop() does two things under INP_WLOCK lock:
https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1334

1. Add INP_DROPPED flag
2. Remove the inp from the TCP hash table

 And once removed from the TCP hash table, in_pcblookup_mbuf() will
return NULL when doing the same TCP hash table lookup again.

 It means that under a INP_WLOCK lock, these two changes are atomic:

 - Either an inp does not have INP_DROPPED flag and can be in TCP hash table
 - Either an inp has INP_DROPPED flag and is not in TCP hash table

 But when you don't have the INP_WLOCK lock, then you can witness the
intermediate state where a inp is still in TCP hash table while a thread
is adding the INP_DROPPED flag.

 Nothing unusual here.

 Then threads are competing for the INP_WLOCK lock.  For the example,
let's say the thread A wants to run tcp_input()/in_pcblookup_mbuf() and
racing for this INP_WLOCK:

https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1964

 And thread B wants to run tcp_timer_2msl()/tcp_close()/in_pcbdrop() and
racing for this INP_WLOCK:

https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_timer.c#L323

 That leads to two cases:

 o Thread A wins the race:

  Thread A will continue tcp_input() as usal and INP_DROPPED flags is
not set and inp is still in TCP hash table.
  Thread B is waiting on thread A to release INP_WLOCK after finishing
tcp_input() processing, and thread B will continue
tcp_timer_2msl()/tcp_close()/in_pcbdrop() processing.

 o Thread B wins the race:

  Thread B runs tcp_timer_2msl()/tcp_close()/in_pcbdrop() and inp
INP_DROPPED is set and inp being removed from TCP hash table.
  In parallel, thread A has found the inp in TCP hash before is was
removed, and waiting on the found inp INP_WLOCK lock.
  Once thread B has released the INP_WLOCK lock, thread A gets this lock
and sees the INP_DROPPED flag and do "goto findpcb" but here because the
inp is not more in TCP hash table and it will not be find again by
in_pcblookup_mbuf().

 Hopefully I am clear enough here.

--
Julien

signature.asc
Description: OpenPGP digital signature

Re: 11.0 stuck on high network load

Reply via email to