On Wed, Jan 1, 2025 at 1:20 PM Kenneth Marshall <k...@rice.edu> wrote:
> On Tue, Dec 31, 2024 at 06:58:14PM -0500, Tom Lane wrote:
> > Larry Rosenman <l...@lerctr.org> writes:
> > > On 12/31/2024 5:37 pm, Tom Lane wrote:
> > >> Do you know what its underlying file system is?
> >
> > > btrfs
> Maybe there are some btrfs or nfs options that can be used to mitigate
> this effect. Otherwise, a bug report to Debian would be in order, I guess.

Mount option readdirsize on the client side should hide the problem up
to some size you choose, but you can't set it large enough for high
numbers of relations/forks/segments.

Guessing what is happening here: I suspect BTRFS might have positional
offsets 1, 2, 3, ... for directory entries' d_off (the value visible in
struct dirent, used for telldir(), seekdir(), and NFS's
behind-the-curtain paging scheme), and they might slide when you unlink
stuff. Perhaps not immediately, but when the directory fd is closed on
the NFS server (nearly immediately, I guess, given the stateless nature
of NFS; it doesn't matter that the client has its directory fd open).
That would explain how you finished up with so many missed files.

I think XFS's d_off points to the next entry in a btree leaf page scan,
which sounds a lot more stable... until someone else unlinks the next
item underneath you and/or the system decides to compact stuff, who
knows... And other systems have other schemes based on hashes or raw
offsets, with different degrees of stability and anomalies (cf ELOOP
for hash collisions).

NFS is at least supposed to tell the client that its cookie has been
invalidated, with a cookie-invalidation-cookie called cookieverf. But
there isn't any specified way to recover. FreeBSD's client looks like
it might try to, but I'm not sure whether Linux's server even
implements it.

Anyway, I'll write a patch to change rmtree() to buffer the names in
memory. In theory there could be hundreds of gigabytes of pathnames, so
perhaps I should do it in batches; I'll look into that.
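
For illustration only, here is a rough sketch of what the batched
approach could look like. This is not the actual patch and not
PostgreSQL's rmtree(): the name remove_files_batched() and its
batch_size parameter are made up here, it handles only plain files in a
single directory (no recursion), and it uses plain libc calls rather
than PostgreSQL's own directory wrappers. The idea is to collect up to
batch_size names, close the directory, unlink them, and then rescan from
the start, so this process never unlinks anything while one of its own
scans is open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (assumption, not PostgreSQL code): remove every plain
 * file in "path", buffering at most batch_size names per directory scan.
 * Returns the number of files removed, or -1 on error with errno set.
 */
static long
remove_files_batched(const char *path, size_t batch_size)
{
    long        removed = 0;
    char      **names;

    assert(batch_size > 0);
    names = malloc(batch_size * sizeof(char *));
    if (names == NULL)
        return -1;

    for (;;)
    {
        DIR        *dir = opendir(path);
        size_t      n = 0;
        int         err = 0;    /* first errno value seen in this pass */

        if (dir == NULL)
        {
            err = errno;
            free(names);
            errno = err;
            return -1;
        }

        /* Phase 1: only read names; nothing is unlinked while dir is open. */
        while (n < batch_size)
        {
            struct dirent *de;

            errno = 0;
            de = readdir(dir);
            if (de == NULL)
            {
                err = errno;    /* 0 means plain end of directory */
                break;
            }
            if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
                continue;
            names[n] = strdup(de->d_name);
            if (names[n] == NULL)
            {
                err = ENOMEM;
                break;
            }
            n++;
        }
        closedir(dir);

        /* Phase 2: unlink the buffered names, then free them. */
        for (size_t i = 0; i < n; i++)
        {
            char        full[4096];

            if (err == 0)
            {
                int         len = snprintf(full, sizeof(full), "%s/%s",
                                           path, names[i]);

                if (len < 0 || (size_t) len >= sizeof(full))
                    err = ENAMETOOLONG;
                else if (unlink(full) == 0)
                    removed++;
                else if (errno != ENOENT)   /* already gone is fine */
                    err = errno;
            }
            free(names[i]);
        }

        if (err != 0)
        {
            free(names);
            errno = err;
            return -1;
        }
        if (n == 0)
            break;              /* a full scan found nothing left */
        /* Otherwise rescan from the start with a fresh directory handle. */
    }

    free(names);
    return removed;
}
```

Memory use is bounded by batch_size names, at the cost of one extra
opendir() per batch. Any unlink failure other than ENOENT is returned as
an error, since rescanning would otherwise keep finding the same entry.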