> On 10. Apr 2021, at 23:59, Rick Macklem <rmack...@uoguelph.ca> wrote:
> 
> tue...@freebsd.org wrote:
>> Rick wrote:
> [stuff snipped]
>>>> With r367492 you don't get the upcall with the same error state? Or you 
>>>> don't get an error on a write() call, when there should be one?
>> If Send-Q is 0 when the network is partitioned, then after healing the krpc 
>> sees no activity on
>> the socket (until it acquires/processes an RPC, it will not do a sosend()).
>> Without the 6-minute timeout, the RST battle goes on "forever" (I've never 
>> actually
>> waited more than 30 minutes, which is close enough to "forever" for me).
>> --> With the 6-minute timeout, the "battle" stops after 6 minutes, when the 
>> timeout
>>     causes a soshutdown(..SHUT_WR) on the socket.
>>     (Since the soshutdown() patch is not yet in "main" -- I got comments, but 
>> no "reviewed"
>>      on it -- the 6-minute timer won't help if enabled in main. The soclose() 
>> won't happen
>>      for TCP connections with the back channel enabled, such as Linux 
>> 4.1/4.2 ones.)
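[Editor's note: the effect of that soshutdown(..SHUT_WR) on the peer can be seen from userland with the plain shutdown(2) call. A minimal sketch, assuming only standard socket semantics (this is an analog using a socketpair, not the krpc code; demo_shut_wr() is a name made up for illustration): after the write side is half-closed, the peer's read() returns 0, a clean EOF it can act on instead of fighting an RST battle.]

```c
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if the peer of a half-closed socket observes a clean EOF. */
int
demo_shut_wr(void)
{
	int sv[2];
	char buf[1];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	if (shutdown(sv[0], SHUT_WR) == -1)	/* half-close the write side */
		return (-1);
	n = read(sv[1], buf, sizeof(buf));	/* peer sees EOF (returns 0) */
	close(sv[0]);
	close(sv[1]);
	return (n == 0 ? 0 : -1);
}
```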
>> I'm confused. So you are saying that if the Send-Q is empty when you 
>> partition the
>> network, and the peer starts to send SYNs after the healing, FreeBSD responds
>> with a challenge ACK, which triggers the sending of an RST by Linux. This RST 
>> is
>> ignored multiple times.
>> Is that true? Even with my patch for the bug I introduced?
> Yes and yes.
> Go take another look at linuxtofreenfs.pcap
> ("fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap" if you don't
>  already have it.)
> Look at packets #1949-2069. I use Wireshark, but you'll have your favourite.
> You'll see the "RST battle" that ends after
> 6 minutes at packet #2069. If there is no 6-minute timeout enabled in the
> server side krpc, then the battle just continues (I once let it run for about
> 30 minutes before giving up). The 6-minute timeout is not currently enabled
> in main, etc.
Hmm. I don't understand why r367492 can impact the processing of the RST, which
basically destroys the TCP connection.

Richard: Can you explain that?

Best regards
Michael
> 
>> What version of the kernel are you using?
> "main" dated Dec. 23, 2020 + your bugfix + assorted NFS patches that
> are not relevant + 2 small krpc related patches.
> --> The two small krpc related patches enable the 6minute timeout and
>       add a soshutdown(..SHUT_WR) call when the 6minute timeout is
>       triggered. These have no effect until the 6minutes is up and, without
>       them the "RTS battle" goes on forever.
> 
> Add to the above a revert of r367492 and the RST battle goes away and things
> behave as expected. The recovery happens quickly after the network is
> unpartitioned, with either 0 or 1 RSTs.
> 
> rick
> ps: Once the irrelevant NFS patches make it into "main", I will upgrade to
>     main bits-du-jour for testing.
> 
> Best regards
> Michael
>> 
>> If Send-Q is non-empty when the network is partitioned, the battle will not 
>> happen.
>> 
>>> 
>>> My understanding is that he needs this error indication when calling 
>>> shutdown().
>> There are several ways the krpc notices that a TCP connection is no longer 
>> functional:
>> - An error return like EPIPE from either sosend() or soreceive().
>> - A return of 0 from soreceive() with no data (normal EOF from the other end).
>> - A 6-minute timeout on the server end, when no activity has occurred on the
>> connection. This timer is currently disabled for NFSv4.1/4.2 mounts in 
>> "main",
>> but I enabled it for this testing, to stop the "RST battle goes on forever"
>> during testing. I am thinking of enabling it in "main", but this crude 
>> bandaid
>> shouldn't be thought of as a "fix for the RST battle".
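[Editor's note: the first two indications in the list above are plain socket-layer behaviour and can be sketched in userland (an analog, not the krpc code; probe_dead_peer() is a name made up for illustration). After the peer closes, read() returns 0 for EOF, and a subsequent write() fails with EPIPE once SIGPIPE is ignored.]

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if a dead peer shows up as EOF on read and EPIPE on write. */
int
probe_dead_peer(void)
{
	int sv[2];
	char buf[1] = { 'x' };

	signal(SIGPIPE, SIG_IGN);	/* get EPIPE instead of a fatal signal */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	close(sv[1]);			/* peer goes away */
	if (read(sv[0], buf, 1) != 0)	/* EOF indication, like soreceive() == 0 */
		return (-1);
	if (write(sv[0], buf, 1) != -1 || errno != EPIPE)
		return (-1);		/* error indication, like EPIPE from sosend() */
	close(sv[0]);
	return (0);
}
```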
>> 
>>>> 
>>>> From what you describe, this is on writes, isn't it? (I'm asking, as the 
>>>> original problem that was fixed with r367492 occurs in the read path: the 
>>>> so_rcv buffer is drained in the upcall right away, which 
>>>> subsequently influences the ACK sent by the stack.)
>>>> 
>>>> I only added the so_snd buffer upcall after some discussion about whether 
>>>> the WAKESOR shouldn't have a symmetric equivalent on WAKESOW....
>>>> 
>>>> Thus a partial backout (leaving the WAKESOR part inside, but reverting the 
>>>> WAKESOW part) would still fix my initial problem about erroneous DSACKs 
>>>> (which can also lead to extremely poor performance with Linux clients), 
>>>> but possibly address this issue...
>>>> 
>>>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for 
>>>> the revert only on the so_snd upcall?
>> Since the krpc only uses receive upcalls, I don't see how reverting the send 
>> side would have
>> any effect?
>> 
>>> Since the release of 13.0 is almost done, can we try to fix the issue 
>>> instead of reverting the commit?
>> I think it has already shipped broken.
>> I don't know if an errata notice is possible, or if it will be broken until 13.1.
>> 
>> --> I am much more concerned with the otis@ stuck client problem than this 
>> RST battle that only
>>      occurs after a network partitioning, especially if it is 13.0 specific.
>>      I did this testing to try to reproduce Jason's stuck client (with 
>> connection in CLOSE_WAIT)
>>      problem, which I failed to reproduce.
>> 
>> rick
>> 
>> Rs: agree, a good understanding of where the interaction between the stack, 
>> the socket, and the in-kernel TCP user breaks is needed;
>> 
>>> 
>>> If this doesn't help, some major surgery will be necessary to prevent NFS 
>>> sessions with SACK enabled, to transmit DSACKs...
>> 
>> My understanding is that the problem is related to getting a local error 
>> indication after
>> receiving an RST segment too late or not at all.
>> 
>> Rs: but the move of the upcall should not materially change that; I don't 
>> have a PC here to see if any upcall actually happens on RST...
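[Editor's note: at the plain socket layer, a received RST does produce a local error indication that wakes a blocked reader. A userland sketch (an analog, not the krpc or in-kernel upcall path; rst_gives_econnreset() is a hypothetical name): SO_LINGER with l_linger=0 forces the client's close() to emit an RST instead of a FIN, and the server's read() then fails with ECONNRESET.]

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if a received RST surfaces as ECONNRESET on read(). */
int
rst_gives_econnreset(void)
{
	int ls, cs, as, ret = -1;
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	struct linger lg = { .l_onoff = 1, .l_linger = 0 };
	char buf[1];

	ls = socket(AF_INET, SOCK_STREAM, 0);
	cs = socket(AF_INET, SOCK_STREAM, 0);
	if (ls == -1 || cs == -1)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;		/* let the kernel pick a free port */
	if (bind(ls, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(ls, 1) == -1 ||
	    getsockname(ls, (struct sockaddr *)&sin, &len) == -1 ||
	    connect(cs, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		goto out;
	as = accept(ls, NULL, NULL);
	if (as == -1)
		goto out;
	setsockopt(cs, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
	close(cs);			/* linger=0 close sends RST, not FIN */
	cs = -1;
	if (read(as, buf, 1) == -1 && errno == ECONNRESET)
		ret = 0;		/* the RST produced a local error */
	close(as);
out:
	if (cs != -1)
		close(cs);
	close(ls);
	return (ret);
}
```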
>> 
>> Best regards
>> Michael
>>> 
>>> 
>>>> I know from a printf that this happened, but whether it caused the RST 
>>>> battle to not happen, I don't know.
>>>> 
>>>> I can put r367492 back in and do more testing if you'd like, but I think 
>>>> it probably needs to be reverted?
>>> 
>>> Please, I don't quite understand why the exact timing of the upcall would 
>>> be that critical here...
>>> 
>>> A comparison of the soxxx calls and errors between the "good" and the "bad" 
>>> would be perfect. I don't know if this is easy to do though, as these calls 
>>> appear to be scattered all around the RPC / NFS source paths.
>>> 
>>>> This does not explain the original hung Linux client problem, but does 
>>>> shed light on the RST war I could create by doing a network partitioning.
>>>> 
>>>> rick
>>> 
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>> 
> 

