AW: NFS Mount Hangs

2021-04-12 Thread Scheffenegger, Richard
8866 1857 Mobile Phone richard.scheffeneg...@netapp.com https://ts.la/richard49892 -Original Message- From: Rick Macklem Sent: Monday, 12 April 2021 00:50 To: Scheffenegger, Richard; tue...@freebsd.org Cc: Youssef GHORBAL; freebsd-net@freebsd.org Subject: Re: NFS Mount Hang

Re: NFS Mount Hangs

2021-04-11 Thread Rick Macklem
ussef GHORBAL; freebsd-net@freebsd.org Subject: Re: NFS Mount Hangs CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to ith...@uoguel

Re: NFS Mount Hangs

2021-04-11 Thread Scheffenegger, Richard
eff: Re: NFS Mount Hangs NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. > On 10. Apr 2021, at 23:59, Rick Macklem wrote: > > tue...@freebsd.org wrote: >> Rick wrote: > [st

Re: NFS Mount Hangs

2021-04-11 Thread tuexen
> On 10. Apr 2021, at 23:59, Rick Macklem wrote: > > tue...@freebsd.org wrote: >> Rick wrote: > [stuff snipped] With r367492 you don't get the upcall with the same error state? Or you don't get an error on a write() call, when there should be one? >> If Send-Q is 0 when the network is

Re: NFS Mount Hangs

2021-04-10 Thread Rick Macklem
tue...@freebsd.org wrote: >Rick wrote: [stuff snipped] >>> With r367492 you don't get the upcall with the same error state? Or you >>> don't get an error on a write() call, when there should be one? > If Send-Q is 0 when the network is partitioned, after healing, the krpc sees > no activity on >

AW: NFS Mount Hangs

2021-04-10 Thread Scheffenegger, Richard
bile Phone richard.scheffeneg...@netapp.com https://ts.la/richard49892 -Original Message- From: tue...@freebsd.org Sent: Saturday, 10 April 2021 18:13 To: Rick Macklem Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org Subject: Re: NFS Mount Hangs Ne

Re: NFS Mount Hangs

2021-04-10 Thread tuexen
> On 10. Apr 2021, at 17:56, Rick Macklem wrote: > > Scheffenegger, Richard wrote: >>> Rick wrote: >>> Hi Rick, >>> Well, I have some good news and some bad news (the bad is mostly for Richard). The only message logged is: tcpflags 0x4; tcp_do_segment: Timestamp missi

Re: NFS Mount Hangs

2021-04-10 Thread Rick Macklem
Scheffenegger, Richard wrote: >>Rick wrote: >> Hi Rick, >> >>> Well, I have some good news and some bad news (the bad is mostly for >>> Richard). >>> >>> The only message logged is: >>> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed >>> normally >>> Btw, I did get one additio

Re: NFS Mount Hangs

2021-04-10 Thread tuexen
ike it is for NFSv4.1 in freebsd-current. >>>>I had forgotten to re-disable it. >>>> So, when it does battle, it might have been the 6minute >>>> timeout, which would then do the soshutdown(..SHUT_WR) >>>> which kept it from getting "stuck"

Re: NFS Mount Hangs

2021-04-10 Thread Rick Macklem
pcap for this one, started after the network was plugged >>> back in and I noticed it was stuck for quite a while is here: >>> fetch https://people.freebsd.org/~rmacklem/stuck.pcap >>> >>> In it, there is just a bunch of RST followed by SYN sent >>> from cl

Re: NFS Mount Hangs

2021-04-10 Thread Scheffenegger, Richard
From: tue...@freebsd.org Sent: Saturday, April 10, 2021 2:19 PM To: Scheffenegger, Richard Cc: Rick Macklem; Youssef GHORBAL; freebsd-net@freebsd.org Subject: Re: NFS Mount Hangs NetApp Security WARNING: This is an external email. Do not click links or open

Re: NFS Mount Hangs

2021-04-10 Thread tuexen
> On 10. Apr 2021, at 11:19, Scheffenegger, Richard > wrote: > > Hi Rick, > >> Well, I have some good news and some bad news (the bad is mostly for >> Richard). >> >> The only message logged is: >> tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed >> normally >> >> But...th

Re: NFS Mount Hangs

2021-04-10 Thread tuexen
ent->FreeBSD and FreeBSD just keeps sending >>> acks for the old segment back. >>> --> It looks like FreeBSD did the "RST, ACK" after the >>> krpc did a soshutdown(..SHUT_WR) on the socket, >>> for the one you've been looking at. >

AW: NFS Mount Hangs

2021-04-10 Thread Scheffenegger, Richard
Hi Rick, > Well, I have some good news and some bad news (the bad is mostly for Richard). > > The only message logged is: > tcpflags 0x4; tcp_do_segment: Timestamp missing, segment processed > normally > > But...the RST battle no longer occurs. Just one RST that works and then the > SYN gets SYN

Re: NFS Mount Hangs

2021-04-09 Thread Rick Macklem
shutdown(..SHUT_WR) on the socket, >> for the one you've been looking at. >> I'll test some more... >> >>> I would like to understand why the reestablishment of the connection >>> did not work... >> It is looking like it takes either a non-empt

Re: NFS Mount Hangs

2021-04-08 Thread Rick Macklem
c did a soshutdown(..SHUT_WR) on the socket, >> for the one you've been looking at. >> I'll test some more... >> >>> I would like to understand why the reestablishment of the connection >>> did not work... >> It is looking like it takes eit

Re: NFS Mount Hangs

2021-04-08 Thread Peter Eriksson
n >>> did not work... >> It is looking like it takes either a non-empty send-q or a >> soshutdown(..SHUT_WR) to get the FreeBSD socket >> out of established, where it just ignores the RSTs and >> SYN packets. >> >> Thanks for looking at it, rick >>

Re: NFS Mount Hangs

2021-04-06 Thread tuexen
for the one you've been looking at. >> I'll test some more... >> >>> I would like to understand why the reestablishment of the connection >>> did not work... >> It is looking like it takes either a non-empty send-q or a >> soshutdown(..SHUT_WR) to get the FreeBSD socket >> out of est

Re: NFS Mount Hangs

2021-04-05 Thread Rick Macklem
s for looking at it, rick > > Best regards > Michael >> >> Have fun with it, rick >> >> >> >> From: tue...@freebsd.org >> Sent: Sunday, April 4, 2021 12:41 PM >> To: Rick Macklem >> Cc: Scheff

Re: NFS Mount Hangs

2021-04-05 Thread tuexen
ike to understand why the reestablishment of the connection >> did not work... > It is looking like it takes either a non-empty send-q or a > soshutdown(..SHUT_WR) to get the FreeBSD socket > out of established, where it just ignores the RSTs and > SYN packets. > > Thanks for looking at it,

Re: NFS Mount Hangs

2021-04-04 Thread Rick Macklem
s and SYN packets. Thanks for looking at it, rick Best regards Michael > > Have fun with it, rick > > > ________ > From: tue...@freebsd.org > Sent: Sunday, April 4, 2021 12:41 PM > To: Rick Macklem > Cc: Scheffenegger, Richard; Youssef GHO

Re: NFS Mount Hangs

2021-04-04 Thread tuexen
ue...@freebsd.org > Sent: Sunday, April 4, 2021 12:41 PM > To: Rick Macklem > Cc: Scheffenegger, Richard; Youssef GHORBAL; freebsd-net@freebsd.org > Subject: Re: NFS Mount Hangs > > CAUTION: This email originated from outside of the University of Guelph. Do > not click links o

Re: NFS Mount Hangs

2021-04-04 Thread Rick Macklem
S val 2074098279 ecr 2671667056], length 48: NFS reply xid > 697039765 reply ok 44 getattr ERROR: unk 10063 > > This error 10063 after the partition heals is also "bad news". It indicates > the Session > (which is supposed to maintain "exactly once" RPC semantics) is

Re: NFS Mount Hangs

2021-04-04 Thread Rick Macklem
ebsd-net@freebsd.org Subject: Re: NFS Mount Hangs CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content is safe. If in doubt, forward suspicious emails to ith...@uoguelph.ca > On

Re: NFS Mount Hangs

2021-04-04 Thread Rodney W. Grimes
reply ok 44 getattr ERROR: unk 10063 > > This error 10063 after the partition heals is also "bad news". It indicates > the Session > (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll > admit I > suspect a Linux client bug, but will

Re: NFS Mount Hangs

2021-04-04 Thread tuexen
> the Session > (which is supposed to maintain "exactly once" RPC semantics) is broken. I'll > admit I > suspect a Linux client bug, but will be investigating further. > > So, hopefully TCP conversant folk can confirm if the above is correct > behaviour >

Re: NFS Mount Hangs

2021-04-04 Thread Rick Macklem
nfirm if the above is correct behaviour or if the RST should be ack'd sooner? I could also see this becoming a "forever" TCP battle for other versions of Linux client. rick ________ From: Scheffenegger, Richard Sent: Sunday, April 4, 2021 7:50 AM To: Rick Macklem; tue.

Re: NFS Mount Hangs

2021-04-04 Thread Scheffenegger, Richard
-net@freebsd.org Subject: Re: NFS Mount Hangs NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. tue...@freebsd.org wrote: >> On 2. Apr 2021, at 02:07, Rick Macklem wrote: >>

Re: NFS Mount Hangs

2021-04-02 Thread Rick Macklem
igning the back channel. Thanks for your help with this Michael, rick Best regards Michael > > rick > ps: I can capture packets while doing this, if anyone has a use > for them. > > > > > > > > From: owner-freebsd-...@

Re: NFS Mount Hangs

2021-04-02 Thread Rick Macklem
TCP connection gets stuck in CLOSE_WAIT and that is why I've added the soshutdown(..SHUT_WR) calls, which can happen before the client gets around to re-assigning the back channel. Thanks for your help with this Michael, rick Best regards Michael > > rick > ps: I can capture packets w

Re: NFS Mount Hangs

2021-04-02 Thread tuexen
them. > > > > > > > > From: owner-freebsd-...@freebsd.org on behalf > of Youssef GHORBAL > Sent: Saturday, March 27, 2021 6:57 PM > To: Jason Breitman > Cc: Rick Macklem; freebsd-net@freebsd.org > Subject: Re

Re: NFS Mount Hangs

2021-04-01 Thread Rick Macklem
AL Sent: Saturday, March 27, 2021 6:57 PM To: Jason Breitman Cc: Rick Macklem; freebsd-net@freebsd.org Subject: Re: NFS Mount Hangs CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the content i

Re: NFS Mount Hangs

2021-03-27 Thread Youssef GHORBAL
On 27 Mar 2021, at 13:20, Jason Breitman <jbreit...@tildenparkcapital.com> wrote: The issue happened again so we can say that disabling TSO and LRO on the NIC did not resolve this issue. # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso # ifconfig lagg0 lagg0: flag

Re: NFS Mount Hangs

2021-03-27 Thread Jason Breitman
The issue happened again so we can say that disabling TSO and LRO on the NIC did not resolve this issue. # ifconfig lagg0 -rxcsum -rxcsum6 -txcsum -txcsum6 -lro -tso -vlanhwtso # ifconfig lagg0 lagg0: flags=8943 metric 0 mtu 1500 options=8100b8 We can also say that the sysctl settings d

Re: NFS Mount Hangs

2021-03-22 Thread Rick Macklem
behalf of Jason Breitman Sent: Monday, March 22, 2021 9:24 AM To: Youssef GHORBAL Cc: freebsd-net@freebsd.org Subject: Re: NFS Mount Hangs CAUTION: This email originated from outside of the University of Guelph. Do not click links or open attachments unless you recognize the sender and know the

Re: NFS Mount Hangs

2021-03-22 Thread Jason Breitman
Agreed. I had made the changes on the FreeBSD Server side and was suggesting that a new TCP connection needed to be established between the client and server for the settings to take effect. I rebooted all of my Debian clients on Sunday to achieve that goal, establishing a new NFSv4 TCP connect

Re: NFS Mount Hangs

2021-03-22 Thread Youssef GHORBAL
> On 21 Mar 2021, at 23:21, Rick Macklem wrote: > > Youssef GHORBAL wrote: >> Hi Jason, >> >>> On 17 Mar 2021, at 18:17, Jason Breitman >>> wrote: >>> >>> Please review the details below and let me know if there is a setting that >>> I should apply to my FreeBSD NFS Server or if there is

Re: NFS Mount Hangs

2021-03-22 Thread Youssef GHORBAL
> On 21 Mar 2021, at 14:41, Jason Breitman > wrote: > > Thanks for sharing as this sounds exactly like my issue. > > I had implemented the change below on 3/8/2021 and have experienced the NFS > hang after that. > Do I need to reboot or umount / mount all of the clients and then I will be >

Re: NFS Mount Hangs

2021-03-21 Thread Rick Macklem
Youssef GHORBAL wrote: >Hi Jason, > >> On 17 Mar 2021, at 18:17, Jason Breitman >> wrote: >> >> Please review the details below and let me know if there is a setting that I >> should apply to my FreeBSD NFS Server or if there is a bug fix that I can >> apply to resolve my issue. >> I shared t

Re: NFS Mount Hangs

2021-03-21 Thread Jason Breitman
Thanks for sharing as this sounds exactly like my issue. I had implemented the change below on 3/8/2021 and have experienced the NFS hang after that. Do I need to reboot or umount / mount all of the clients and then I will be ok? I had not rebooted the clients, but would to get out of this situa

Re: NFS Mount Hangs

2021-03-21 Thread Jason Breitman
The issue did trigger again. I ran the script below for ~15 minutes and hope this gets you what you need. Let me know if you require the full output without grepping nfsd. #!/bin/sh while true do /bin/date >> /tmp/nfs-hang.log /bin/ps axHl | grep nfsd | grep -v grep >> /tmp/nfs-hang.log

Re: NFS Mount Hangs

2021-03-19 Thread Youssef GHORBAL
Hi Jason, > On 17 Mar 2021, at 18:17, Jason Breitman > wrote: > > Please review the details below and let me know if there is a setting that I > should apply to my FreeBSD NFS Server or if there is a bug fix that I can > apply to resolve my issue. > I shared this information with the linux-nf

Re: NFS Mount Hangs

2021-03-19 Thread Rick Macklem
Scheffenegger, Richard wrote: >Sorry, I thought this was a problem on stable/13. > >This is only in HEAD, stable/13 and 13.0 - never MFC'd to stable/12 or >backported to >12.1 > >> I did some reshuffling of socket-upcalls recently in the TCP stack, to >> prevent some race conditions with our $wor

Re: NFS Mount Hangs

2021-03-19 Thread Rick Macklem
Jason Breitman wrote: >Thank you for your focus on the issue I am having and I look forward to seeing >your >patch ported to FreeBSD 12.X. I'll only be committing the patch if I am convinced it actually fixes something. I'll be looking more closely at it and seeing what mav@ thinks about it. >I als

AW: NFS Mount Hangs

2021-03-19 Thread Scheffenegger, Richard
Sorry, I thought this was a problem on stable/13. This is only in HEAD, stable/13 and 13.0 - never MFC'd to stable/12 or backported to 12.1 > I did some reshuffling of socket-upcalls recently in the TCP stack, to > prevent some race conditions with our $work in-kernel NFS server > implementatio

Re: NFS Mount Hangs

2021-03-19 Thread tuexen
be impacted by this. > > Richard Scheffenegger > > > -Original Message- > From: owner-freebsd-...@freebsd.org On Behalf > Of Rick Macklem > Sent: Friday, 19 March 2021 16:58 > To: tue...@freebsd.org > Cc: Scheffenegger, Richard; > freebsd-net@fr

Re: NFS Mount Hangs

2021-03-19 Thread Jason Breitman
Thank you for your focus on the issue I am having and I look forward to seeing your patch ported to FreeBSD 12.X. I also appreciate that you understand the difficulties in testing changes on a core piece of infrastructure. I will let the group know if the issue occurs following the change that

AW: NFS Mount Hangs

2021-03-19 Thread Scheffenegger, Richard
Mount Hangs NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. Michael Tuexen wrote: >> On 18. Mar 2021, at 21:55, Rick Macklem wrote: >> >> Michael Tuexen wrote: >>>

Re: NFS Mount Hangs

2021-03-19 Thread Rick Macklem
Michael Tuexen wrote: >> On 18. Mar 2021, at 21:55, Rick Macklem wrote: >> >> Michael Tuexen wrote: On 18. Mar 2021, at 13:42, Scheffenegger, Richard wrote: >> Output from the NFS Client when the issue occurs # netstat -an | grep >> NFS.Server.IP.X >> tcp 0

Re: NFS Mount Hangs

2021-03-18 Thread tuexen
> On 18. Mar 2021, at 21:55, Rick Macklem wrote: > > Michael Tuexen wrote: >>> On 18. Mar 2021, at 13:42, Scheffenegger, Richard >>> wrote: >>> > Output from the NFS Client when the issue occurs # netstat -an | grep > NFS.Server.IP.X > tcp 0 0 NFS.Client.IP.X:46896

Re: NFS Mount Hangs

2021-03-18 Thread Rick Macklem
Michael Tuexen wrote: >> On 18. Mar 2021, at 13:42, Scheffenegger, Richard >> wrote: >> Output from the NFS Client when the issue occurs # netstat -an | grep NFS.Server.IP.X tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 FIN_WAIT2 >>> I'm no TCP guy

Re: AW: NFS Mount Hangs

2021-03-18 Thread Rodney W. Grimes
> >>Output from the NFS Client when the issue occurs # netstat -an | grep > >>NFS.Server.IP.X > >>tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 > >>FIN_WAIT2 > >I'm no TCP guy. Hopefully others might know why the client would be stuck in > >FIN_WAIT2 (I vaguely recall

Re: NFS Mount Hangs

2021-03-18 Thread Jason Breitman
The laggproto is lacp and the switch is made by Extreme Networks. Jason Breitman On Mar 18, 2021, at 4:06 AM, Gerrit Kuehn wrote: On Wed, 17 Mar 2021 18:17:14 -0400 Jason Breitman wrote: > I will look into disabling the TSO and LRO options and let the group > know how it goes. Below are the

Re: NFS Mount Hangs

2021-03-18 Thread tuexen
> On 18. Mar 2021, at 13:53, Rodney W. Grimes > wrote: > > Note I am NOT a TCP expert, but know enough about it to add a comment... > >> Alan Somers wrote: >> [stuff snipped] >>> Is the 128K limit related to MAXPHYS? If so, it should be greater in 13.0. >> For the client, yes. For the server,

Re: NFS Mount Hangs

2021-03-18 Thread Michael Tuexen
> On 18. Mar 2021, at 13:42, Scheffenegger, Richard > wrote: > >>> Output from the NFS Client when the issue occurs # netstat -an | grep >>> NFS.Server.IP.X >>> tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 >>> FIN_WAIT2 >> I'm no TCP guy. Hopefully others might kno

Re: NFS Mount Hangs

2021-03-18 Thread Rodney W. Grimes
Note I am NOT a TCP expert, but know enough about it to add a comment... > Alan Somers wrote: > [stuff snipped] > >Is the 128K limit related to MAXPHYS? If so, it should be greater in 13.0. > For the client, yes. For the server, no. > For the server, it is just a compile time constant NFS_SRVMAXI

AW: NFS Mount Hangs

2021-03-18 Thread Scheffenegger, Richard
>>Output from the NFS Client when the issue occurs # netstat -an | grep >>NFS.Server.IP.X >>tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 >>FIN_WAIT2 >I'm no TCP guy. Hopefully others might know why the client would be stuck in >FIN_WAIT2 (I vaguely recall this means
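
Several messages in this thread quote a single `netstat -an` line and reason from its state column and queue sizes. As a rough illustration only (not something posted in the thread), the sketch below pulls those fields out of such a line in Python. It assumes the common six-column TCP layout (protocol, Recv-Q, Send-Q, local address, foreign address, state); the `TcpConn` type and `parse_netstat_line` helper are hypothetical names, and the sample line reuses the anonymised addresses from the quoted output.

```python
from typing import NamedTuple, Optional


class TcpConn(NamedTuple):
    """One TCP connection as reported by `netstat -an` (hypothetical helper type)."""
    recv_q: int
    send_q: int
    local: str
    remote: str
    state: str


def parse_netstat_line(line: str) -> Optional[TcpConn]:
    """Parse one TCP line of `netstat -an` output; return None for any other line."""
    fields = line.split()
    # Assumed layout: proto recv-q send-q local-address foreign-address state
    if len(fields) != 6 or not fields[0].startswith("tcp"):
        return None
    try:
        recv_q, send_q = int(fields[1]), int(fields[2])
    except ValueError:
        return None
    return TcpConn(recv_q, send_q, fields[3], fields[4], fields[5])


# Sample line modelled on the anonymised client output quoted in the thread.
sample = "tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 FIN_WAIT2"
conn = parse_netstat_line(sample)
```

Filtering the parsed connections on `state` (for example `FIN_WAIT2` on the client, `CLOSE_WAIT` on the server) and on a zero `send_q` would give a quick view of the stuck connections discussed here.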

Re: NFS Mount Hangs

2021-03-18 Thread Gerrit Kuehn
On Wed, 17 Mar 2021 18:17:14 -0400 Jason Breitman wrote: > I will look into disabling the TSO and LRO options and let the group > know how it goes. Below are the current options on the NFS Server. > lagg0: flags=8943 > metric 0 mtu 1500 > options=e507bb What laggproto are you using, and what k

Re: NFS Mount Hangs

2021-03-17 Thread Jason Breitman
We are using the Intel Ethernet Network Adapter X722. Jason Breitman On Mar 17, 2021, at 6:48 PM, Peter Eriksson wrote: CLOSE_WAIT on the server side usually indicates that the kernel has sent the ACK to the client's FIN (start of a shutdown) packet but hasn’t sent its own FIN packet - somet

Re: NFS Mount Hangs

2021-03-17 Thread Peter Eriksson
CLOSE_WAIT on the server side usually indicates that the kernel has sent the ACK to the client's FIN (start of a shutdown) packet but hasn’t sent its own FIN packet - something that usually happens when the server has read all data queued up from the client and taken what actions it needs to shut
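
The CLOSE_WAIT behaviour described above can be reproduced with ordinary user-space sockets. The following is a minimal sketch, assuming a loopback connection in Python rather than the in-kernel NFS server the thread is about; `read_until_peer_fin` is a hypothetical name. The client sends its FIN, the server side sees it as an empty read, and the connection then sits in CLOSE_WAIT until the server side closes its own end.

```python
import socket


def read_until_peer_fin() -> bytes:
    """Read from an accepted connection until the peer's FIN shows up as EOF."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _ = listener.accept()
    server_side.settimeout(5)
    try:
        client.sendall(b"ping")
        client.shutdown(socket.SHUT_WR)      # client sends its FIN
        received = b""
        while True:
            chunk = server_side.recv(4096)
            if not chunk:                    # empty read: the peer's FIN has arrived
                break
            received += chunk
        # At this point the server-side connection is in CLOSE_WAIT. It stays
        # there until the application closes (or shuts down) its own end.
        return received
    finally:
        server_side.close()                  # server sends its own FIN
        client.close()
        listener.close()
```

If the `server_side.close()` call were never reached, `netstat -an` on the server would keep showing the connection in CLOSE_WAIT, which is the symptom discussed in this thread.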

Re: NFS Mount Hangs

2021-03-17 Thread Jason Breitman
Thank you for the responses. The NFS Client does properly negotiate down to 128K for the rsize and wsize. The client port should be changing as we are using the noresvport option. On the NFS Client cat /proc/mounts nfs-server.domain.com:/data /mnt/data nfs4 rw,relatime,vers=4.1,rsize=131072,wsiz

Re: NFS Mount Hangs

2021-03-17 Thread Rick Macklem
Alan Somers wrote: [stuff snipped] >Is the 128K limit related to MAXPHYS? If so, it should be greater in 13.0. For the client, yes. For the server, no. For the server, it is just a compile time constant NFS_SRVMAXIO. It's mainly related to the fact that I haven't gotten around to testing larger s

Re: NFS Mount Hangs

2021-03-17 Thread Alan Somers
On Wed, Mar 17, 2021 at 3:37 PM Rick Macklem wrote: > Jason Breitman wrote: > >Please review the details below and let me know if there is a setting > that I should >apply to my FreeBSD NFS Server or if there is a bug fix that > I can apply to resolve my >issue. > >I shared this information with

Re: NFS Mount Hangs

2021-03-17 Thread Rick Macklem
Jason Breitman wrote: >Please review the details below and let me know if there is a setting that I >should >apply to my FreeBSD NFS Server or if there is a bug fix that I can >apply to resolve my >issue. >I shared this information with the linux-nfs mailing list and they believe the >issue is >

NFS Mount Hangs

2021-03-17 Thread Jason Breitman
Please review the details below and let me know if there is a setting that I should apply to my FreeBSD NFS Server or if there is a bug fix that I can apply to resolve my issue. I shared this information with the linux-nfs mailing list and they believe the issue is on the server side. Issue NFS