On Jun 22, 2022, Jonathan Wakely <jwak...@redhat.com> wrote:

> It looks like it would be possible for this to livelock.

True

> The current
> implementation will fail with an error in that case, rather than
> getting stuck forever in a loop.

In the equivalent livelock scenario, newly-added dir entries are added
to the end of the directory, and get visited in the same readdir
iteration, so you never reach the end as long as the entry-creator runs
faster than the remover.

>     It would be possible to add a __rewind member to directory iterators
>     too, to call rewinddir after each modification to the directory.

That would likely lead to O(n^2) behavior in some implementations, in
which remove entries get rescanned over and over, whereas the approach I
proposed cuts that down to O(nlogn).  Unless you rewind once you hit the
end after successful removals, even before trying to remove the dir.
That's still a little wasteful on POSIX-compliant targets, because you
rewind and rescan a dir that should already be empty.  I aimed for
minimizing the overhead on compliant targets, but that was before
remove_all switched to recursive iterators.

With recursive iterators, rewinding might be better done in a custom
iterator, tuned for recursive removal, that knows to rewind a dir if
there were removals in it or something.  Rewinding the entire recursive
removal is not something I realized my rewritten patch did.  I still had
the recursive remove_all implementation in cache.  Oops ;-)

That said, I'm not sure it makes much of a difference in the end.  As
long as the recursion and removal doesn't treat symlinks as dirs (which
IIUC requires openat and unlinkat, so that's a big if to boot), the
rewinding seems to only change the nature of filesystem race conditions
that recursive removal enables.  E.g., consider that you start removing
the entries in a dir, and then the dir you're visiting is moved out of
the subtree you're removing, and other dirs are moved into it: the
recursive removal with openat and unlinkat will happily attempt to wipe
out everything moved in there, even if it wasn't within that subtree at
the time of the remove_all request, and even if the newly-moved dirs
were never part of the subtree whose removal was requested.  To make it
clearer:

  c++::std::filesystem::remove_all foo/bar &
  mv foo/bar/temp temp
  mv foo temp
  wait
  ls -d temp/foo

temp/foo might be removed if you happened to be iterating in temp when
the 'mv' commands run.  Is that another kind of race that needs to be
considered?  If a dir is moved while we're visiting it, should we stop
visiting it?  What about its parent?

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>

Reply via email to