** Summary changed:

- NFS connections block while causing a high-bandwidth RPC-pingpong between client and server
+ NFSv4.1: Interrupted connections cause high bandwidth RPC ping-pong between client and server

** Description changed:

- There's a bug in kernels before Linux 5.0 that affects NFS 4.1 connections. The bug presents itself like this:
- * On NFS clients: Attempts to access mounted NFS shares associated with the affected server
-   block indefinitely.
- * On the network: A storm of repeated RPCs between NFS client and server uses a lot
-   of bandwidth. Each RPC is acknowledged by the server with an NFS4ERR_SEQ_MISORDERED error.
- * Other NFS clients connected to the same NFS server: Performance drops dramatically.
+ BugLink: https://bugs.launchpad.net/bugs/1828978

- A patch is available to fix this problem:
+ [Impact]

- <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3453d5708b33efe76f40eca1c0ed60923094b971>
+ There is a bug in NFS v4.1 that causes a large number of RPC calls
+ between a client and server when a previous RPC call is interrupted.
+ This uses a large amount of bandwidth and can saturate the network.

- Is it possible to integrate the patch into the 4.18 kernel series?
- I'm using Ubuntu 18.04.2 LTS as NFS client and server.
+ The symptoms are as follows:

- Thank you.
+ * On NFS clients:
+ Attempts to access mounted NFS shares associated with the affected server block indefinitely.
+
+ * On the network:
+ A storm of repeated RPCs between NFS client and server uses a lot of bandwidth. Each RPC is acknowledged by the server with an NFS4ERR_SEQ_MISORDERED error.

- Best regards,
+ * Other NFS clients connected to the same NFS server:
+ Performance drops dramatically.

- Frank Burkhardt
+ This occurs during a "false retry", when a client attempts to make a new
+ RPC call using a slot+sequence number that references an older, cached
+ call. This happens when a user process interrupts an RPC call that is in
+ progress.
+
+ [Fix]
+
+ This was fixed in 5.1 upstream with the below commit:
+
+ commit 3453d5708b33efe76f40eca1c0ed60923094b971
+ Author: Trond Myklebust <trond.mykleb...@hammerspace.com>
+ Date: Wed Jun 20 17:53:34 2018 -0400
+ Subject: NFSv4.1: Avoid false retries when RPC calls are interrupted
+
+ The fix is to pre-emptively increment the sequence number if an RPC call
+ is interrupted. To address corner cases, the client interprets the
+ NFS4ERR_SEQ_MISORDERED error as a sign that it needs to locate an
+ appropriate sequence number between the value it sent and the last
+ successfully acked SEQUENCE call.
+
+ Commit 3453d5708b33efe76f40eca1c0ed60923094b971 is a clean cherry-pick
+ to disco.
+
+ [Testcase]
+
+ This is difficult to reproduce on test systems, and has instead been
+ verified on a production NFS v4.1 system in a customer environment. This
+ server is heavily trafficked and has a large number of different NFS
+ clients connected to it.
+
+ I have built a test kernel that contains the above patch, and also
+ patches for Bug 1842037. It is available here:
+
+ https://launchpad.net/~mruffell/+archive/ubuntu/sf241068-test
+
+ Note that the above kernel is for bionic HWE, and not explicitly disco.
+
+ Discussion about the patch validation can be found at the bottom of Bug
+ 1842037.
+
+ On unpatched kernels, expect to see the symptoms mentioned in Impact.
+ On patched systems, everything should work as intended.
+
+ [Regression Potential]
+
+ The changes are localised to NFS v4.1 only, and other versions of NFS
+ are not affected. If a regression occurs, users can downgrade NFS
+ versions to v4.0 or v3.x until a fix is made.
+
+ The changes only take effect when connections are interrupted, and under
+ typical blue sky scenarios would not be invoked.
+
+ There have been no fixup commits or commits near the requested commit in
+ newer kernels, which suggests that this commit fixes the issue and has
+ been accepted by the community.

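For illustration only, the approach described under [Fix] can be modelled
with a toy slot in Python. This is a minimal sketch, not the kernel code
from commit 3453d5708b33: the class names ServerSlot and ClientSlot, their
fields, and the walk-back loop are assumptions made for this example, and
the server model ignores reply caching and concurrency.

```python
from dataclasses import dataclass

OK = "NFS4_OK"
MISORDERED = "NFS4ERR_SEQ_MISORDERED"


@dataclass
class ServerSlot:
    """Hypothetical server-side slot: remembers the last sequence number executed."""
    last_seq: int = 0

    def handle(self, seq: int) -> str:
        if seq == self.last_seq + 1:  # next expected call: execute it
            self.last_seq = seq
            return OK
        if seq == self.last_seq:      # same number again: treated as a retry
            return OK                 # (a real server would replay a cached reply)
        return MISORDERED


@dataclass
class ClientSlot:
    """Hypothetical client-side slot: next number to send and last acked number."""
    seq_nr: int = 1
    last_acked: int = 0

    def interrupted(self) -> None:
        # The call was abandoned: bump the number so it is never reused
        # for a different call (this avoids the "false retry").
        self.seq_nr += 1

    def call(self, server: ServerSlot) -> int:
        """Send one call; on MISORDERED, step seq_nr back towards last_acked.

        Returns the number of attempts that were needed.
        """
        attempts = 0
        while True:
            attempts += 1
            if server.handle(self.seq_nr) == OK:
                self.last_acked = self.seq_nr
                self.seq_nr += 1
                return attempts
            if self.seq_nr - self.last_acked > 1:
                self.seq_nr -= 1  # the interrupted call never reached the server
                continue
            raise RuntimeError("slot is out of sync with the server")


if __name__ == "__main__":
    server, client = ServerSlot(), ClientSlot()
    client.interrupted()        # call 1 was interrupted before it was sent
    print(client.call(server))  # prints 2: one MISORDERED reply, then success
```

In this toy model an interrupted call that never reached the server costs
one extra round trip, rather than an unbounded stream of
NFS4ERR_SEQ_MISORDERED replies.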
** Tags added: disco sts

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1828978

Title:
  NFSv4.1: Interrupted connections cause high bandwidth RPC ping-pong
  between client and server