On Thu, 23 Jan 2025 00:02:37 +0100 Francesco Poli wrote:

[...]
> Hello Salvatore,
> thanks for following up.
[...]

By the way, I am also experiencing a huge performance hit on the I/O
through the NFS shares.

Please let me explain.

On the host where the NFS server runs (let's call it "$server"), there
is the following '/etc/exports' file:

  $ grep '^[^#]' /etc/exports
  /home/ 172.16.0.0/22(rw,sync,no_subtree_check,no_root_squash)
  /opt/ 172.16.0.0/22(ro,sync,no_subtree_check,no_root_squash)

Please note that 172.16.0.0/22 is the InfiniBand (local) network.
The '/etc/nfs.conf' file has already been summarized (by reportbug) in
the original bug report.

On the hosts where the NFS clients run (let's call them "$client"), the
home NFS share is mounted on /home with the following options (in the
'/etc/fstab' file):

  nfs nofail,nfsvers=3,rdma,port=20049,exec,dev,suid,rw,bg,rsize=32768,wsize=32768,intr

and the same (but with 'ro' instead of 'rw') for /opt

Well, as of Fri, 17 Jan 2025 (before the upgrade that failed to
complete), I could take a 326 MB binary file and copy it to another
file within the same directory under /home with:

  $ dd if=test.dat of=new.dat status=progress
  $ rm new.dat

On $server (where /home is a local filesystem, on 6 mechanical hard
disks in software RAID6) the result was:

  326091584 bytes (326 MB, 311 MiB) copied, 1.08429 s, 301 MB/s

On each of the $client boxes (where /home is an NFS share mounted
through the InfiniBand network, over RDMA, as I have previously said),
the results were:

  326091584 bytes (326 MB, 311 MiB) copied, 2.54522 s, 128 MB/s
  326091584 bytes (326 MB, 311 MiB) copied, 2.64063 s, 123 MB/s
  326091584 bytes (326 MB, 311 MiB) copied, 2.46292 s, 132 MB/s
  [...]

That was not like reading and writing locally, but maybe we can accept
a 2.3 or 2.4 factor for the copying time...
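
For reference, the factor quoted above can be recomputed from the dd
timings. This is only a minimal sketch, assuming a POSIX shell with awk
available; 'copy_factor' is an illustrative helper name, not part of
any existing tool:

```shell
#!/bin/sh
# copy_factor is a hypothetical helper: it prints the ratio between
# the copy time on an NFS client and the copy time on the server.
#   $1 = copy time on $server (local filesystem), in seconds
#   $2 = copy time on a $client (NFS share), in seconds
copy_factor() {
    # LC_ALL=C keeps '.' as the decimal separator in awk's output
    LC_ALL=C awk -v base="$1" -v nfs="$2" \
        'BEGIN { printf "%.2f\n", nfs / base }'
}

# Timings taken from the dd output of 17 Jan 2025
server_s=1.08429
for client_s in 2.54522 2.64063 2.46292; do
    printf '%s s on client: factor %s\n' \
        "$client_s" "$(copy_factor "$server_s" "$client_s")"
done
```

With the three client timings shown, this gives factors of about 2.35,
2.44 and 2.27.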

Now, as of Tue, 21 Jan 2025, after the upgrade that failed to complete
on $server, a reboot of $server, and the completion of the upgrade (plus
an upgrade/reboot of some of the $client boxes, which seems to make
little or no difference), the results on the $client boxes are:

  326091584 bytes (326 MB, 311 MiB) copied, 203.068 s, 1.6 MB/s
  326091584 bytes (326 MB, 311 MiB) copied, 195.6 s, 1.7 MB/s
  326091584 bytes (326 MB, 311 MiB) copied, 139.393 s, 2.3 MB/s
  326091584 bytes (326 MB, 311 MiB) copied, 207.157 s, 1.6 MB/s
  [...]

The factor for the copying time is now in the range 128 to 207 (a
slowdown by a factor of 52 to 84, compared to before the upgrade)...

I am not sure whether this performance hit is caused by the same bug
that prevented the nfs-kernel-server service from starting, or by
another issue.

Do you need me to file a separate bug report for the performance hit?

Thanks for any help you may provide!

-- 
 http://www.inventati.org/frx/
 There's not a second to spare! To the laboratory!
..................................................... Francesco Poli .
 GnuPG key fpr == CA01 1147 9CD2 EFDF FB82 3925 3E1C 27E1 1F69 BFFE