What additional information can I provide for us to move forward with this process?
To summarize and include further details, rsync is used to sync applications to a file server that acts as a repository. We preserve timestamps from the build server and also use --delete. We do not run the applications from the file server. All servers use NTP.

The application has a sub-directory that contains files with version numbers. These are libraries. When a new build is complete, a developer pushes their updates via rsync to the file server / repository.

I believe that the dentry cache thinks the "old" files exist and generates a No such file or directory error, showing question marks for that file's attributes. Dropping the dentry cache via echo 2 > /proc/sys/vm/drop_caches resolves the issue.

This behavior is not observed in Debian 10.8 with that distribution's associated kernel and packages.

> -----Original Message-----
> From: Jason Breitman
> Sent: Friday, August 19, 2022 9:52 PM
> To: Ben Hutchings <b...@decadent.org.uk>; 1017...@bugs.debian.org
> Subject: RE: Bug#1017720: nfs-common: No such file or directory
>
> > -----Original Message-----
> > From: Ben Hutchings <b...@decadent.org.uk>
> > Sent: Friday, August 19, 2022 7:27 PM
> > To: Jason Breitman <jbreit...@tildenparkcapital.com>;
> > 1017...@bugs.debian.org
> > Subject: Re: Bug#1017720: nfs-common: No such file or directory
> >
> > Control: tag -1 moreinfo
> >
> > On Fri, 2022-08-19 at 13:16 +0000, Jason Breitman wrote:
> > > Package: nfs-common
> > > Version: 1:1.3.4-6
> > > Severity: important
> > >
> > > Kernel: 5.10.0-16-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64
> > > GNU/Linux
> > >
> > > -- Description
> > > After updating and or creating new files on our file server via
> > > rsync, we see many files report the error message below from NFSv4
> > > clients since upgrading from Debian 10.8 to Debian 11.4.
> > > Clearing the dentry cache resolves the issue right away.
> > > I am not sure that nfs-common is the package to blame, but listed
> > > it based on the bug submission recommendations.
> >
> > The NFS implementation is mostly in the kernel, so probably this issue
> > belongs there. But the kernel team is responsible for both packages.
> >
> > [...]
> > > -- Error message
> > > ls: cannot access 'filename': No such file or directory
> > > -????????? ? ? ? ? ? filename
> > [...]
> >
> > So we know the file's there but can't stat it. I think this means the
> > client has cached the handle of the old file of that name, which has
> > been deleted.
> >
> > - Are client and server clocks closely synchronised? If not, that
> >   needs to be fixed.
> >
> The clocks are synchronized using NTP.
>
> > - Are clients likely to read this directory while rsync is running, or
> >   shortly before? If so, it may help to reduce the attribute caching
> >   timeout on the client. See the "Directory entry caching" section in
> >   the nfs(5) manual page.
> >
> Clients are not likely to read this directory while rsync is running for the
> observed cases. That can happen in our environment, but not in this case.
> I am using the lookupcache=pos option. I tried noac, but the performance
> penalty was too much. Which option are you referring to and what setting
> do you recommend testing?
>
> > I don't know why you're only seeing this after an upgrade of the
> > clients, though. I'm not aware that there has been any big change to
> > attribute caching.
> >
> I appreciate you responding to my report and am happy to answer any
> questions.
> We have multiple monitors and log scrapers to detect "file not found"
> exceptions that would let us know if this was happening before.
> To share more, I have 2 environments mounting from the same file server.
> Each environment has several servers. The issue is only seen in the
> environment running Debian 11.4.
> I also should have mentioned that the files in question have a version
> number appended. filename-1111.
> When the file is updated via rsync, it is called filename-1112 and the
> prior file is removed. The error is about filename-1111.
> I am not sure if this is the proper terminology, but the issue appears to be
> the negative dentry cache.
>
> > Ben.
> >
> > --
> > Ben Hutchings
> > Beware of bugs in the above code;
> > I have only proved it correct, not tried it. - Donald Knuth
>
> Jason Breitman

Jason Breitman
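
For anyone trying to reproduce or monitor this from a client, the symptom described in this thread (a name that appears in the directory listing but cannot be stat'ed) could be checked with something like the minimal Python sketch below. It is only an illustration, not part of the original report: the function name find_unstatable and its stat_fn parameter are hypothetical, and treating ESTALE the same as ENOENT is an assumption.

```python
import errno
import os
import sys


def find_unstatable(directory, stat_fn=os.lstat):
    """Return names that are listed in `directory` but cannot be stat'ed.

    These are the entries that `ls -l` would print as
    '-????????? ? ? ? ? ? filename'.
    stat_fn is injectable (hypothetical parameter) so the check can be
    exercised without a real NFS mount.
    """
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            stat_fn(path)
        except OSError as exc:
            # ENOENT: "No such file or directory".
            # ESTALE: stale NFS file handle (assumed to be equally relevant).
            if exc.errno in (errno.ENOENT, errno.ESTALE):
                broken.append(name)
            else:
                raise
    return broken


if __name__ == "__main__":
    # Usage: python3 check_dir.py /path/to/nfs/mounted/directory
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for entry in find_unstatable(target):
        print("listed but cannot stat:", entry)
```

A file that is legitimately deleted between the listing and the stat call would be reported as well, so a single hit is only a hint; a name that keeps failing until the dentry cache is dropped would match the behavior reported here.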