Hello,

        I'm having an issue with a FreeBSD 11 based system that sporadically 
sends a TCP RST to clients after the TCP session has been correctly initiated.
        The sequence goes like this:

        1 Client -> Server : SYN
        2 Server -> Client : SYN/ACK
        3 Client -> Server : ACK
        4 Client -> Server : PSH/ACK (upper protocol data sending starts here)
        5 Server -> Client : RST
        
        - The problem happens sporadically: the same client and the same server 
can communicate smoothly on the same service port, but from time to time (hours, 
sometimes days) the sequence above happens.
        - The service running on the server is not responsible for the RST. 
It was profiled in depth and nothing in it justifies the RST.
        - tcpdump on the server side confirms that the packets arrive on time 
and in order.
        - The traffic is very light: a few TCP sessions per day.
        - The server is connected using a lagg enslaving two cxgb interfaces.

        In my effort to diagnose the problem (trying to build a reproducible 
test case) I noticed that the issue is most likely to be triggered when these 
two conditions are met:
        - the ACK (in step 3) and the PSH/ACK (in step 4) arrive on different 
lagg NICs.
        - the two packets arrive less than 10 microseconds apart.
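
        To look for these two conditions in a capture, something like the 
following could be used. It is only a minimal sketch: the Packet record, its 
field names and the suspicious_pairs() helper are hypothetical, and it assumes 
the client-to-server packets have already been extracted from a capture taken 
on each lagg member, with timestamps in microseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical record for one client-to-server segment."""
    ts_us: int          # capture timestamp, in microseconds
    nic: str            # lagg member the segment arrived on, e.g. "cxgb0"
    flow: tuple         # (src_ip, src_port, dst_ip, dst_port)
    flags: frozenset    # TCP flags, e.g. frozenset({"PSH", "ACK"})
    payload_len: int    # TCP payload length in bytes


def suspicious_pairs(packets, max_gap_us=10):
    """Return (handshake ACK, first data segment) pairs that arrived on
    different NICs less than max_gap_us microseconds apart."""
    state = {}  # flow -> "syn_seen", or the handshake ACK awaiting data
    hits = []
    for pkt in sorted(packets, key=lambda p: p.ts_us):
        prev = state.get(pkt.flow)
        if "SYN" in pkt.flags:
            # Step 1: a new handshake starts on this flow.
            state[pkt.flow] = "syn_seen"
        elif prev == "syn_seen" and pkt.flags == {"ACK"} and pkt.payload_len == 0:
            # Step 3: the bare ACK that completes the handshake.
            state[pkt.flow] = pkt
        elif isinstance(prev, Packet) and pkt.payload_len > 0:
            # Step 4: first segment carrying data; check both conditions.
            if pkt.nic != prev.nic and pkt.ts_us - prev.ts_us < max_gap_us:
                hits.append((prev, pkt))
            del state[pkt.flow]
    return hits
```

        Any pair returned is a session start worth correlating with the RSTs 
seen on the wire.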

        When searching the interwebs I came across a strikingly similar issue 
reported here 7 years ago:
        https://lists.freebsd.org/pipermail/freebsd-net/2010-August/026029.html

        (The OP seems to have resolved his issue by changing the netisr policy 
from direct to hybrid, but there is no mention of laggs being used.)
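
        For reference, on FreeBSD 11 the netisr dispatch policy is exposed 
through the net.isr.dispatch sysctl, which accepts direct, hybrid and deferred. 
A minimal sketch of a helper to read or change it follows; the function names 
are hypothetical, and whether changing the policy helps in the lagg case is 
untested.

```python
import subprocess

# Policy names accepted by the net.isr.dispatch sysctl.
NETISR_POLICIES = ("direct", "hybrid", "deferred")


def dispatch_command(policy=None):
    """Build the sysctl argv that reads (policy=None) or sets the policy."""
    if policy is None:
        return ["sysctl", "-n", "net.isr.dispatch"]
    if policy not in NETISR_POLICIES:
        raise ValueError(f"unknown netisr dispatch policy: {policy!r}")
    return ["sysctl", f"net.isr.dispatch={policy}"]


def run(argv):
    """Run the command on the local host and return its trimmed output.
    Setting the policy requires root."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

        On the server, run(dispatch_command()) returns the current policy and 
run(dispatch_command("hybrid")) switches it for a test.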

        I'm pretty sure that I'm hitting some race condition: a scenario where, 
due to multithreading, the PSH/ACK is somehow handled before the ACK, making the 
kernel send a TCP RST since the initial TCP handshake has not finished yet.

        I've read about how netisr works and I was under the impression that 
even though it's SMP enabled, it was designed to preserve protocol ordering.
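
        To make the ordering question concrete, here is a toy model (not the 
actual netisr code, and all names in it are hypothetical). It contrasts picking 
the processing thread from a hash of the flow, which keeps one flow on one 
queue, with picking it from the receiving NIC, which is my assumption of what 
effectively happens when each NIC's interrupt thread processes its own packets.

```python
import zlib


def worker_by_flow(flow, n_workers):
    """Pick a worker from a hash of the 4-tuple: same flow, same worker."""
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % n_workers


def worker_by_nic(nic, nics):
    """Pick a worker per receiving NIC: one flow may span several workers."""
    return nics.index(nic)


def assign(packets, chooser):
    """Group packet names into per-worker FIFO queues.

    packets is a list of (name, nic, flow); chooser(nic, flow) returns the
    worker index. Order is only guaranteed inside a single queue."""
    queues = {}
    for name, nic, flow in packets:
        queues.setdefault(chooser(nic, flow), []).append(name)
    return queues


flow = ("192.0.2.10", 40000, "192.0.2.20", 443)
packets = [("ACK", "cxgb0", flow), ("PSH/ACK", "cxgb1", flow)]

# Flow hashing: both segments share one queue, so the ACK is handled first.
per_flow = assign(packets, lambda nic, f: worker_by_flow(f, 4))

# Per-NIC handling: the segments sit in two queues that can run concurrently.
per_nic = assign(packets, lambda nic, f: worker_by_nic(nic, ["cxgb0", "cxgb1"]))
```

        In the per-NIC case nothing in the model forces the ACK to be handled 
before the PSH/ACK, which is the kind of reordering I suspect.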

        What's the expected behaviour in this scenario on the netisr side?
        How can I push the investigation further?

Youssef Ghorbal


_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
