> On 10. Apr 2021, at 23:59, Rick Macklem <rmack...@uoguelph.ca> wrote:
>
> tue...@freebsd.org wrote:
>> Rick wrote:
> [stuff snipped]
>>>> With r367492 you don't get the upcall with the same error state? Or you
>>>> don't get an error on a write() call, when there should be one?
>> If Send-Q is 0 when the network is partitioned, after healing, the krpc
>> sees no activity on the socket (until it acquires/processes an RPC it
>> will not do a sosend()).
>> Without the 6minute timeout, the RST battle goes on "forever" (I've never
>> actually waited more than 30minutes, which is close enough to "forever"
>> for me).
>> --> With the 6minute timeout, the "battle" stops after 6minutes, when the
>> timeout causes a soshutdown(..SHUT_WR) on the socket.
>> (Since the soshutdown() patch is not yet in "main" (I got comments, but
>> no "reviewed" on it), the 6minute timer won't help if enabled in main.
>> The soclose() won't happen for TCP connections with the back channel
>> enabled, such as Linux 4.1/4.2 ones.)
>>
>> I'm confused. So you are saying that if the Send-Q is empty when you
>> partition the network, and the peer starts to send SYNs after the
>> healing, FreeBSD responds with a challenge ACK which triggers the sending
>> of a RST by Linux. This RST is ignored multiple times.
>> Is that true? Even with my patch for the bug I introduced?
> Yes and yes.
> Go take another look at linuxtofreenfs.pcap
> ("fetch https://people.freebsd.org/~rmacklem/linuxtofreenfs.pcap" if you
> don't already have it.)
> Look at packet #1949->2069. I use wireshark, but you'll have your favourite.
> You'll see the "RST battle" that ends after 6minutes at packet#2069. If
> there is no 6minute timeout enabled in the server side krpc, then the
> battle just continues (I once let it run for about 30minutes before giving
> up). The 6minute timeout is not currently enabled in main, etc.

Hmm.
I don't understand why r367492 can impact the processing of the RST, which basically destroys the TCP connection.
Richard: Can you explain that?

Best regards
Michael
>
>> What version of the kernel are you using?
> "main" dated Dec. 23, 2020 + your bugfix + assorted NFS patches that
> are not relevant + 2 small krpc related patches.
> --> The two small krpc related patches enable the 6minute timeout and
> add a soshutdown(..SHUT_WR) call when the 6minute timeout is
> triggered. These have no effect until the 6minutes is up and, without
> them, the "RST battle" goes on forever.
>
> Add to the above a revert of r367492 and the RST battle goes away and things
> behave as expected. The recovery happens quickly after the network is
> unpartitioned, with either 0 or 1 RSTs.
>
> rick
> ps: Once the irrelevant NFS patches make it into "main", I will upgrade to
> main bits-de-jur for testing.
>
> Best regards
> Michael
>>
>> If Send-Q is non-empty when the network is partitioned, the battle will not
>> happen.
>>
>>>
>>> My understanding is that he needs this error indication when calling
>>> shutdown().
>> There are several ways the krpc notices that a TCP connection is no longer
>> functional.
>> - An error return like EPIPE from either sosend() or soreceive().
>> - A return of 0 from soreceive() with no data (normal EOF from other end).
>> - A 6minute timeout on the server end, when no activity has occurred on the
>> connection. This timer is currently disabled for NFSv4.1/4.2 mounts in
>> "main", but I enabled it for this testing, to stop the "RST battle goes
>> on forever" during testing. I am thinking of enabling it on "main", but
>> this crude bandaid shouldn't be thought of as a "fix for the RST battle".
>>
>>>>
>>>> From what you describe, this is on writes, isn't it? (I'm asking, as the
>>>> original problem that was fixed with r367492 occurs in the read path
>>>> (draining of the so_rcv buffer in the upcall right away, which
>>>> subsequently influences the ACK sent by the stack)).
>>>>
>>>> I only added the so_snd buffer after some discussion, if the WAKESOR
>>>> shouldn't have a symmetric equivalent on WAKESOW....
>>>>
>>>> Thus a partial backout (leaving the WAKESOR part inside, but reverting the
>>>> WAKESOW part) would still fix my initial problem about erroneous DSACKs
>>>> (which can also lead to extremely poor performance with Linux clients),
>>>> but possibly address this issue...
>>>>
>>>> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for
>>>> the revert only on the so_snd upcall?
>> Since the krpc only uses receive upcalls, I don't see how reverting the send
>> side would have any effect?
>>
>>> Since the release of 13.0 is almost done, can we try to fix the issue
>>> instead of reverting the commit?
>> I think it has already shipped broken.
>> I don't know if an errata is possible, or if it will be broken until 13.1.
>>
>> --> I am much more concerned with the otis@ stuck client problem than this
>> RST battle that only occurs after a network partitioning, especially if
>> it is 13.0 specific.
>> I did this testing to try to reproduce Jason's stuck client (with
>> connection in CLOSE_WAIT) problem, which I failed to reproduce.
>>
>> rick
>>
>> Rs: agree, a good understanding where the interaction btwn stack, socket and
>> in kernel tcp user breaks is needed;
>>
>>>
>>> If this doesn't help, some major surgery will be necessary to prevent NFS
>>> sessions with SACK enabled, to transmit DSACKs...
>>
>> My understanding is that the problem is related to getting a local error
>> indication after receiving a RST segment too late or not at all.
>>
>> Rs: but the move of the upcall should not materially change that; i don’t
>> have a pc here to see if any upcall actually happens on rst...
>>
>> Best regards
>> Michael
>>>
>>>
>>>> I know from a printf that this happened, but whether it caused the RST
>>>> battle to not happen, I don't know.
>>>>
>>>> I can put r367492 back in and do more testing if you'd like, but I think
>>>> it probably needs to be reverted?
>>>
>>> Please, I don't quite understand why the exact timing of the upcall would
>>> be that critical here...
>>>
>>> A comparison of the soxxx calls and errors between the "good" and the "bad"
>>> would be perfect. I don't know if this is easy to do though, as these calls
>>> appear to be scattered all around the RPC / NFS source paths.
>>>
>>>> This does not explain the original hung Linux client problem, but does
>>>> shed light on the RST war I could create by doing a network partitioning.
>>>>
>>>> rick

_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
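
Editorial sketch (not part of the original mail): the thread lists three ways the krpc notices that a TCP connection is no longer functional: an error such as EPIPE from sosend()/soreceive(), a return of 0 from soreceive() (EOF), and an idle timeout. The Python below is only a rough userland analogy of those three signals on an ordinary socket. It does not call the kernel's sosend()/soreceive(), and the function names classify_connection and try_send, as well as the 4096-byte read size, are hypothetical choices made for illustration.

```python
import errno
import select
import socket


def classify_connection(sock, idle_timeout):
    """Wait up to idle_timeout seconds and report what the socket shows.

    Returns one of:
      "idle"         - nothing arrived before the timeout (the "no activity" case)
      "eof"          - recv() returned 0 bytes (the peer closed its side normally)
      "error:<NAME>" - recv() failed, e.g. "error:ECONNRESET" after a RST
      "data"         - payload arrived, the connection looks alive
    """
    readable, _, _ = select.select([sock], [], [], idle_timeout)
    if not readable:
        return "idle"
    try:
        data = sock.recv(4096)  # 4096 is an arbitrary illustrative size
    except OSError as exc:
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))
    if data == b"":
        return "eof"
    return "data"


def try_send(sock, payload):
    """Attempt a send and report "ok" or "error:<NAME>" (e.g. "error:EPIPE")."""
    try:
        sock.sendall(payload)
    except OSError as exc:
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"
```

With idle_timeout set to 360, classify_connection() returning "idle" corresponds loosely to the 6minute timer discussed above. Note that a sender which never writes (Send-Q empty) only learns about a dead peer through the receive side or the timer, which is the situation the thread describes.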