> On 10. Apr 2021, at 11:19, Scheffenegger, Richard
> <richard.scheffeneg...@netapp.com> wrote:
>
> Hi Rick,
>
>> Well, I have some good news and some bad news (the bad is mostly for
>> Richard).
>>
>> The only message logged is:
>> tcpflags 0x4<RST>; tcp_do_segment: Timestamp missing, segment processed
>> normally
>>
>> But...the RST battle no longer occurs. Just one RST that works and then the
>> SYN gets SYN,ACK'd by the FreeBSD end and off it goes...
>>
>> So, what is different?
>>
>> r367492 is reverted from the FreeBSD server.
>> I did the revert because I think it might be what is causing otis@'s hang.
>> (In his case, the Recv-Q grows on the socket for the stuck Linux client,
>> while others work.)
>>
>> Why does reverting fix this?
>> My only guess is that the krpc gets the upcall right away and sees an EPIPE
>> when it does soreceive()->results in soshutdown(SHUT_WR).
>
> With r367492 you don't get the upcall with the same error state? Or you don't
> get an error on a write() call, when there should be one?
My understanding is that he needs this error indication when calling shutdown().
>
> From what you describe, this is on writes, isn't it? (I'm asking, as the
> original problem that was fixed with r367492 occurs in the read path
> (draining of the so_rcv buffer in the upcall right away, which subsequently
> influences the ACK sent by the stack).)
>
> I only added the so_snd buffer after some discussion, if the WAKESOR
> shouldn't have a symmetric equivalent on WAKESOW....
>
> Thus a partial backout (leaving the WAKESOR part inside, but reverting the
> WAKESOW part) would still fix my initial problem about erroneous DSACKs
> (which can also lead to extremely poor performance with Linux clients), but
> possibly address this issue...
>
> Can you perhaps take MAIN and apply https://reviews.freebsd.org/D29690 for
> the revert only on the so_snd upcall?
Since the release of 13.0 is almost done, can we try to fix the issue instead
of reverting the commit?
>
> If this doesn't help, some major surgery will be necessary to prevent NFS
> sessions with SACK enabled from transmitting DSACKs...
My understanding is that the problem is related to getting a local error
indication after receiving a RST segment too late or not at all.
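
For illustration only, here is a small userland sketch of the kind of local
error indication a RST normally produces. It is not the krpc upcall path
discussed in this thread; the function name rst_error_sequence and the
loopback setup are assumptions made for the example. The peer aborts the
connection with SO_LINGER set to zero, and the client then reads the pending
socket error and attempts a write. On Linux and FreeBSD I would expect
ECONNRESET as the pending error and EPIPE on the following write, but treat
that as an expectation rather than a guarantee.

```python
import errno
import select
import socket
import struct


def rst_error_sequence():
    """Return (pending_error, send_error) seen by a client after a peer RST."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()

    # l_onoff=1, l_linger=0: close() aborts the connection with a RST, not a FIN.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    server.close()
    listener.close()

    # Wait (up to 2 s) until the client's stack has processed the RST.
    select.select([client], [], [], 2.0)

    # The RST leaves a pending error on the socket; reading SO_ERROR clears it.
    pending_error = client.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)

    # With the pending error consumed, a write should now fail outright.
    send_error = 0
    try:
        client.send(b"x")
    except OSError as exc:
        send_error = exc.errno
    finally:
        client.close()

    return pending_error, send_error


if __name__ == "__main__":
    for code in rst_error_sequence():
        print(errno.errorcode.get(code, code))
```

If the error indication arrives late or not at all, a consumer that relies on
it to decide when to shut down and reconnect would keep using the dead
socket, which is the situation I suspect here.
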
Best regards
Michael
>
>
>> I know from a printf that this happened, but whether it caused the RST
>> battle to not happen, I don't know.
>>
>> I can put r367492 back in and do more testing if you'd like, but I think it
>> probably needs to be reverted?
>
> Please, I don't quite understand why the exact timing of the upcall would be
> that critical here...
>
> A comparison of the soxxx calls and errors between the "good" and the "bad"
> would be perfect. I don't know if this is easy to do though, as these calls
> appear to be scattered all around the RPC / NFS source paths.
>
>> This does not explain the original hung Linux client problem, but does shed
>> light on the RST war I could create by doing a network partitioning.
>>
>> rick
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"