We are using the Intel Ethernet Network Adapter X722.

Jason Breitman


On Mar 17, 2021, at 6:48 PM, Peter Eriksson <p...@lysator.liu.se> wrote:

CLOSE_WAIT on the server side usually indicates that the kernel has sent the 
ACK to the client’s FIN (start of a shutdown) packet but hasn’t sent its own 
FIN packet - something that usually happens when the server has read all data 
queued up from the client and taken whatever actions it needs to shut down its 
service…
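
To see the same thing from the application's side, here is a minimal loopback sketch in Python. It is only an illustration and has nothing to do with the NFS server code; the function name half_close_demo and the payload are made up. The client closes first, the server reads until end-of-file, and from that point until the server calls close() its socket sits in CLOSE_WAIT.

```python
import socket


def half_close_demo():
    """Client closes first; return everything the server read before EOF."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.sendall(b"last request")
    client.close()                       # the client's kernel sends FIN

    chunks = []
    while True:
        chunk = server_side.recv(1024)
        if not chunk:                    # b"" means the peer's FIN was received
            break
        chunks.append(chunk)

    # Here the server's kernel has ACKed the FIN and the socket is in
    # CLOSE_WAIT. It stays there until the application closes it.
    server_side.close()                  # only now is the server's own FIN sent
    listener.close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(half_close_demo())
```

If the final close() never happens (for instance because the server is blocked on something else), the connection stays in CLOSE_WAIT indefinitely.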

Here’s some fine ASCII art. Probably needs to be viewed using a monospaced font :-)

Client
> ESTABLISHED --> FIN-WAIT-1   +-----> FIN-WAIT-2   +-----> TIME-WAIT ---> CLOSED
>                     :        ^                    ^           :
>                 FIN :        : ACK            FIN :       ACK :
>                     v        :                    :           v
> ESTABLISHED         +--> CLOSE-WAIT --....---> LAST-ACK       +--------> CLOSED
Server
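
The diagram can also be written down as a pair of transition tables. This is a rough sketch in Python, not anything taken from a real TCP stack: the names CLIENT, SERVER and run, and the event labels, are invented for illustration. It only covers the orderly close where the client sends its FIN first.

```python
# Transitions for an orderly close, client closing first.
# Keys are (current state, event); events are from that endpoint's view.
CLIENT = {
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN"): "TIME-WAIT",    # client ACKs the server's FIN
    ("TIME-WAIT", "timeout"): "CLOSED",
}
SERVER = {
    ("ESTABLISHED", "recv FIN"): "CLOSE-WAIT",  # kernel ACKs the client's FIN
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",     # needs the application to close
    ("LAST-ACK", "recv ACK"): "CLOSED",
}


def run(table, events, state="ESTABLISHED"):
    """Apply events in order and return the resulting state."""
    for event in events:
        state = table[(state, event)]
    return state


if __name__ == "__main__":
    # The stuck pair reported in this thread: the server never sends its FIN.
    print(run(SERVER, ["recv FIN"]))              # CLOSE-WAIT
    print(run(CLIENT, ["send FIN", "recv ACK"]))  # FIN-WAIT-2
```

Read this way, a client parked in FIN_WAIT2 and a server parked in CLOSE_WAIT are two views of the same missing step: the server's "send FIN".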


TSO/LRO and/or “intelligence” in some smart network cards can cause all kinds 
of interesting bugs. What Ethernet cards are you using?
(TSO/LRO seems to be working better these days for our Intel X710 cards, but a 
couple of years ago the cards would freeze up on us, so we had to disable it.)
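
For checking which of these offloads are currently enabled, a small Python sketch that picks them out of an ifconfig "options=" line, such as the lagg0 line quoted further down in this thread. The helper names offload_flags and enabled_offloads are made up, and the parsing assumes the usual options=HEX<NAME,NAME,...> layout.

```python
import re


def offload_flags(ifconfig_line):
    """Return the capability names found in an ifconfig 'options=' line."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_line)
    return set(match.group(1).split(",")) if match else set()


def enabled_offloads(ifconfig_line, wanted=("TSO4", "TSO6", "LRO")):
    """List which of the wanted offload capabilities are enabled."""
    flags = offload_flags(ifconfig_line)
    return [name for name in wanted if name in flags]


if __name__ == "__main__":
    sample = (
        "options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,"
        "VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,"
        "TXCSUM_IPV6>"
    )
    print(enabled_offloads(sample))  # ['TSO4', 'TSO6', 'LRO']
```

On FreeBSD these are normally switched off with something along the lines of "ifconfig <interface> -tso -lro"; see ifconfig(8) for what your driver supports.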

Hmm.. Perhaps the NFS server is waiting for some locks to be released before it 
can close down its end of the TCP link? Reservations?

But I’d suspect something else since we’ve been running NFSv4.1/Kerberos on our 
FreeBSD 11.3/12.2 servers for a long time with many Linux clients and most 
issues (the last couple of years) we’ve seen have been on the Linux end of 
things… Like the bugs in the Linux gss daemons or their single-threaded mount() 
sys call, or automounter freezing up... and other fun bugs.

- Peter

> On 17 Mar 2021, at 23:17, Jason Breitman <jbreit...@tildenparkcapital.com> wrote:
> 
> Thank you for the responses.
> The NFS Client does properly negotiate down to 128K for the rsize and wsize.
> 
> The client port should be changing as we are using the noresvport option.
> 
> On the NFS Client
> cat /proc/mounts
> nfs-server.domain.com:/data /mnt/data nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,noresvport,proto=tcp,timeo=600,retrans=2,sec=krb5,clientaddr=NFS.Client.IP.X,lookupcache=pos,local_lock=none,addr=NFS.Server.IP.X 0 0
> 
> When the issue occurs, this is what I see on the NFS Server.
> tcp4       0      0 NFS.Server.IP.X.2049      NFS.Client.IP.X.51550     CLOSE_WAIT
> 
> Capturing packets right before the issue is a great idea, but I am concerned 
> about running tcpdump for such an extended period of time on an active server.
> I have gone 9 days with no issue which would be a lot of data and overhead.
> 
> I will look into disabling the TSO and LRO options and let the group know how 
> it goes.
> Below are the current options on the NFS Server.
> lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>       options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
> 
> Please share other ideas if you have them.
> 
> Jason Breitman
> 
> 
> On Mar 17, 2021, at 5:58 PM, Rick Macklem <rmack...@uoguelph.ca> wrote:
> 
> Alan Somers wrote:
> [stuff snipped]
>> Is the 128K limit related to MAXPHYS? If so, it should be greater in 13.0.
> For the client, yes. For the server, no.
> For the server, it is just a compile time constant NFS_SRVMAXIO.
> 
> It's mainly related to the fact that I haven't gotten around to testing larger
> sizes yet.
> - kern.ipc.maxsockbuf needs to be several times the limit, which means it
> would have to increase for 1Mbyte.
> - The session code must negotiate a maximum RPC size > 1 Mbyte.
> (I think the server code does do this, but it needs to be tested.)
> And, yes, the client is limited to MAXPHYS.
> 
> Doing this is on my todo list, rick
> 
> The client should acquire the attributes that indicate that and set
> rsize/wsize to that. "# nfsstat -m" on the client should show you what the
> client is actually using. If it is larger than 128K, set both rsize and
> wsize to 128K.
> 
>> Output from the NFS Client when the issue occurs
>> # netstat -an | grep NFS.Server.IP.X
>> tcp 0 0 NFS.Client.IP.X:46896 NFS.Server.IP.X:2049 FIN_WAIT2
> I'm no TCP guy. Hopefully others might know why the client would be
> stuck in FIN_WAIT2 (I vaguely recall this means it is waiting for a fin/ack,
> but could be wrong?)
> 
>> # cat /sys/kernel/debug/sunrpc/rpc_xprt/*/info
>> netid: tcp
>> addr: NFS.Server.IP.X
>> port: 2049
>> state: 0x51
>> 
>> syslog
>> Mar 4 10:29:27 hostname kernel: [437414.131978] -pid- flgs status -client- --rqstp- ->timeout ---ops--
>> Mar 4 10:29:27 hostname kernel: [437414.133158] 57419 40a1 0 9b723c73 143cfadf 30000 4ca953b5 nfsv4 OPEN_NOATTR a:call_connect_status [sunrpc] q:xprt_pending
> I don't know what OPEN_NOATTR means, but I assume it is some variant
> of NFSv4 Open operation.
> [stuff snipped]
>> Mar 4 10:29:30 hostname kernel: [437417.110517] RPC: 57419 xprt_connect_status: connect attempt timed out
>> Mar 4 10:29:30 hostname kernel: [437417.112172] RPC: 57419 call_connect_status (status -110)
> I have no idea what status -110 means?
>> Mar 4 10:29:30 hostname kernel: [437417.113337] RPC: 57419 call_timeout (major)
>> Mar 4 10:29:30 hostname kernel: [437417.114385] RPC: 57419 call_bind (status 0)
>> Mar 4 10:29:30 hostname kernel: [437417.115402] RPC: 57419 call_connect xprt 00000000e061831b is not connected
>> Mar 4 10:29:30 hostname kernel: [437417.116547] RPC: 57419 xprt_connect xprt 00000000e061831b is not connected
>> Mar 4 10:30:31 hostname kernel: [437478.551090] RPC: 57419 xprt_connect_status: connect attempt timed out
>> Mar 4 10:30:31 hostname kernel: [437478.552396] RPC: 57419 call_connect_status (status -110)
>> Mar 4 10:30:31 hostname kernel: [437478.553417] RPC: 57419 call_timeout (minor)
>> Mar 4 10:30:31 hostname kernel: [437478.554327] RPC: 57419 call_bind (status 0)
>> Mar 4 10:30:31 hostname kernel: [437478.555220] RPC: 57419 call_connect xprt 00000000e061831b is not connected
>> Mar 4 10:30:31 hostname kernel: [437478.556254] RPC: 57419 xprt_connect xprt 00000000e061831b is not connected
> Is it possible that the client is trying to (re)connect using the same
> client port#?
> I would normally expect the client to create a new TCP connection using a
> different client port# and then retry the outstanding RPCs.
> --> Capturing packets when this happens would show us what is going on.
> 
> If there is a problem on the FreeBSD end, it is most likely a broken
> network device driver.
> --> Try disabling TSO, LRO.
> --> Try a different driver for the net hardware on the server.
> --> Try a different net chip on the server.
> If you can capture packets when (not after) the hang
> occurs, then you can look at them in wireshark and see
> what is actually happening. (Ideally on both client and
> server, to check that your network hasn't dropped anything.)
> --> I know, if the hangs aren't easily reproducible, this isn't
> easily done.
> --> Try a newer Linux kernel and see if the problem persists.
> The Linux folk will get more interested if you can reproduce
> the problem on 5.12. (Recent bakeathon testing of the 5.12
> kernel against the FreeBSD server did not find any issues.)
> 
> Hopefully the network folk have some insight w.r.t. why
> the TCP connection is sitting in FIN_WAIT2.
> 
> rick
> 
> 
> 
> Jason Breitman
> 
> 
> 
> 
> 
> 
> _______________________________________________
> freebsd-net@freebsd.org<mailto:freebsd-net@freebsd.org> mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to 
> "freebsd-net-unsubscr...@freebsd.org<mailto:freebsd-net-unsubscr...@freebsd.org>"


