Package: nfs-kernel-server
Version: 1:2.6.2-4

Package: linux-image-6.1.0-21-amd64
Version: 6.1.90-1
During our tests of Proxmox VE with a Debian NFS server as shared storage, we've noticed that nfsd sometimes becomes unresponsive and it's necessary to reboot the server. Probably the same error is reported here: https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568

NFS server:
* DELL PowerEdge R730xd, 2x 10C XEON E5-2640, Samsung SM863 SSDs, 8 GB RAM
* fresh installation of Debian Bookworm
* Linux 6.1.0-21-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03) x86_64 GNU/Linux
* connected using a 10GE link
* nfsd.conf configured with nthreads=16 (also tested with 8 and 4), other options left at their defaults
* XFS mount exported with options: rw,sync,no_root_squash,no_subtree_check,no_wdelay (see the configuration sketch below)

NFS client:
* DELL PowerEdge FC630, 2x 14C Xeon E5-2680 v4, 256 GB RAM
* fresh installation of Proxmox VE 8.2
* Proxmox Linux 6.8.4-3-pve kernel
* connected using a 10GE link
* NFS client mount options (see the mount sketch below): rw,noatime,nodiratime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,nconnect=8,max_connect=16,timeo=600,retrans=2,sec=sys,clientaddr=10.xx.xx.xx,local_lock=none,addr=10.xx.xx.xx
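For reference, a minimal sketch of the server-side configuration described above, assuming the nthreads setting corresponds to the threads= option in the [nfsd] section of /etc/nfs.conf; the export path and client network below are placeholders, not our actual values:

    # /etc/nfs.conf -- assumed location of the thread count setting
    [nfsd]
    threads = 16

    # /etc/exports -- export path and client network are placeholders
    /srv/nfs/pve    10.0.0.0/24(rw,sync,no_root_squash,no_subtree_check,no_wdelay)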
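Similarly, a sketch of an equivalent manual mount on the client using the options listed above; the server name, export path, and mount point are placeholders, and the addr= and clientaddr= entries are omitted because their values are redacted above (10.xx.xx.xx):

    # equivalent manual NFS 4.2 mount -- server name, export path and mount point are placeholders
    mount -t nfs -o rw,noatime,nodiratime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,nconnect=8,max_connect=16,timeo=600,retrans=2,sec=sys,local_lock=none nfs-server.example:/srv/nfs/pve /mnt/pve/nfs-storage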
Dmesg on the nfsd server side (repeats forever):

[ 3142.693181] INFO: task nfsd:1035 blocked for more than 120 seconds.
[ 3142.693217]       Not tainted 6.1.0-21-amd64 #1 Debian 6.1.90-1
[ 3142.693239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3142.693264] task:nfsd state:D stack:0 pid:1035 ppid:2 flags:0x00004000
[ 3142.693273] Call Trace:
[ 3142.693275]  <TASK>
[ 3142.693279]  __schedule+0x34d/0x9e0
[ 3142.693288]  schedule+0x5a/0xd0
[ 3142.693294]  schedule_timeout+0x118/0x150
[ 3142.693301]  wait_for_completion+0x86/0x160
[ 3142.693307]  __flush_workqueue+0x152/0x420
[ 3142.693317]  nfsd4_destroy_session+0x1b6/0x250 [nfsd]
[ 3142.693379]  nfsd4_proc_compound+0x355/0x660 [nfsd]
[ 3142.693433]  nfsd_dispatch+0x1a1/0x2b0 [nfsd]
[ 3142.693478]  svc_process_common+0x289/0x5e0 [sunrpc]
[ 3142.693551]  ? svc_recv+0x4e5/0x890 [sunrpc]
[ 3142.693631]  ? nfsd_svc+0x360/0x360 [nfsd]
[ 3142.693676]  ? nfsd_shutdown_threads+0x90/0x90 [nfsd]
[ 3142.693720]  svc_process+0xad/0x100 [sunrpc]
[ 3142.693790]  nfsd+0xd5/0x190 [nfsd]
[ 3142.693836]  kthread+0xda/0x100
[ 3142.693843]  ? kthread_complete_and_exit+0x20/0x20
[ 3142.693849]  ret_from_fork+0x22/0x30
[ 3142.693858]  </TASK>

Dump of nfsd threads:

/proc/1032/stack:
[<0>] svc_recv+0x7f3/0x890 [sunrpc]
[<0>] nfsd+0xc3/0x190 [nfsd]
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30

/proc/1033/stack:
[<0>] svc_recv+0x7f3/0x890 [sunrpc]
[<0>] nfsd+0xc3/0x190 [nfsd]
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30

/proc/1034/stack:
[<0>] svc_recv+0x7f3/0x890 [sunrpc]
[<0>] nfsd+0xc3/0x190 [nfsd]
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30

/proc/1035/stack:
[<0>] __flush_workqueue+0x152/0x420
[<0>] nfsd4_destroy_session+0x1b6/0x250 [nfsd]
[<0>] nfsd4_proc_compound+0x355/0x660 [nfsd]
[<0>] nfsd_dispatch+0x1a1/0x2b0 [nfsd]
[<0>] svc_process_common+0x289/0x5e0 [sunrpc]
[<0>] svc_process+0xad/0x100 [sunrpc]
[<0>] nfsd+0xd5/0x190 [nfsd]
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30

/proc/130/stack:
[<0>] rpc_shutdown_client+0xf2/0x150 [sunrpc]
[<0>] nfsd4_process_cb_update+0x4c/0x270 [nfsd]
[<0>] nfsd4_run_cb_work+0x9f/0x150 [nfsd]
[<0>] process_one_work+0x1c7/0x380
[<0>] worker_thread+0x4d/0x380
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30

On the NFS client side, there are a number of backchannel reply errors:

[78636.676789] RPC: Could not send backchannel reply error: -110
[78647.905675] RPC: Could not send backchannel reply error: -110
[78675.207201] RPC: Could not send backchannel reply error: -110
[78744.201603] RPC: Could not send backchannel reply error: -110
[78784.138769] RPC: Could not send backchannel reply error: -110

We're able to reproduce this bug quite often (several times a day) when restoring a 500GB virtual machine image from Proxmox Backup Server to the NFS shared storage. On the other hand, we cannot trigger it in other ways, such as random and/or sequential I/O fio stress tests. According to iostat, the VM restore job writes to the NFS server in 300-400 MiB batches separated by 3-4 seconds of inactivity.

Interestingly, this issue probably occurs only when using a recent kernel on the NFS client side. We're able to hit this bug only with the Proxmox Linux 6.8.4-3-pve kernel on the client. When using the Proxmox 6.5.13-5-pve kernel, there are no client-side backchannel reply errors and the nfsd server runs without any hangs. It seems to me that changes in the NFS client code between 6.5.x and 6.8.x accidentally uncovered a race in the nfsd server code.

Based on Ubuntu bug report #2062568, I assume this is not a Proxmox-specific issue, but the Proxmox VM restore workload together with our testing hardware setup makes it easier to hit.

Regards,
Martin