On Fri, Apr 17, 2020 at 12:45:36PM +0200, Christian Schoenebeck wrote: > On Donnerstag, 16. April 2020 20:47:11 CEST Omar Sandoval wrote: > > > > Luckily, O_NOATIME is effectively a hint, and is often ignored by, e.g., > > > > network filesystems. open(2) notes that O_NOATIME "may not be effective > > > > on all filesystems. One example is NFS, where the server maintains the > > > > access time." This means that we can honor it when possible but fall > > > > back to ignoring it. > > > > > > I am not sure whether NFS would simply silently ignore O_NOATIME i.e. > > > without returning EPERM. I don't read it that way. > > > > As far as I can tell, the NFS protocol has nothing equivalent to > > O_NOATIME and thus can't honor it. Feel free to test it: > > I did not doubt that NFS does not support O_NOATIME, what I said was that I > thought using O_NOATIME on NFS would return EPERM, but ... > > > # mount -t nfs -o vers=4,rw 10.0.2.2:/ /mnt > > # echo foo > /mnt/foo > > # touch -d "1 hour ago" /mnt/foo > > # stat /mnt/foo | grep 'Access: [0-9]' > > Access: 2020-04-16 10:43:36.838952593 -0700 > > # # Drop caches so we have to go to the NFS server. > > # echo 3 > /proc/sys/vm/drop_caches > > # strace -e openat dd if=/mnt/foo of=/dev/null iflag=noatime |& grep > > /mnt/foo openat(AT_FDCWD, "/mnt/foo", O_RDONLY|O_NOATIME) = 3 > > # stat /mnt/foo | grep 'Access: [0-9]' > > Access: 2020-04-16 11:43:36.906462928 -0700 > > ... I tried this as well, and you are right, O_NOATIME is indeed simply > *silently* dropped/ignored by NFS (i.e. without raising any error). > > > > It would certainly fix the problem in your use case. But it would also > > > unmask O_NOATIME for all other ones (i.e. regular users on guest). > > > > The guest kernel will still check whether processes on the guest have > > permission to use O_NOATIME. This only changes the behavior when the > > guest kernel believes that the process has permission even though the > > host QEMU process doesn't. > > Good point! > > > > I mean I understand your point, but I also have to take other use cases > > > into account which might expect to receive EPERM if O_NOATIME cannot be > > > granted. > > If you'd still like to preserve this behavior, would it be acceptable to > > make this a QEMU option? Maybe something like "-virtfs > > honor_noatime=no": the default would be "yes", which is the current > > behavior, and "no" would always mask out NOATIME. > > That was my initial tendency yesterday, but your arguments now convinced me > that the implied re-run, in the way your provided patch already does, is in > this particular case the better choice. > > Making that a configurable option would render this issue unnecessarily > complicated and probably even contra productive for other users which might > stumble over the same issue. > > Just do me a favour: you already thoroughly explained the issue in the commit > log, that's good, but please also add a short comment in code why the rerun > is > required, because it is not obvious by just reading the code. Finding that > info from git log becomes tedious as soon as code is refactored there. > > Aside of the missing comment in code: > > Acked-by: Christian Schoenebeck <qemu_...@crudebyte.com>
Thank you! I'll add the comment and resend.