On Thu, Dec 26, 2024 at 01:19:34PM -0500, Michael Stone wrote:
> Further reading: look at the auto_da_alloc option in ext4. Note that it says
> that doing the rename without the sync is wrong, but there's now a heuristic
> in ext4 that tries to insert an implicit sync when that anti-pattern is used
> (because so much data got eaten when people did the wrong thing). By leaning
> on that band-aid dpkg might get away with skipping the sync, but doing so
> would require assuming a filesystem for which that implicit guarantee is
> available. If you're on a different filesystem or a different kernel all
> bets would be off. I don't know how much difference skipping the fsync's
> makes these days if they get done implicitly.

Note that it's not a sync; rather, under certain circumstances, we
initiate writeback --- but we don't wait for it to complete before
allowing the close(2) or rename(2) to return.  For close(2), we will
initiate writeback on close if the file descriptor was opened using
O_TRUNC and the truncate threw away the previous contents of the
file.  For rename(2), if you rename on top of a previously existing
file, we will initiate the writeback right away.
This was a tradeoff between safety and performance, made because
there were an awful lot of buggy applications out there that didn't
use fsync, and application programmers greatly outnumbered file
system programmers.  It was a compromise discussed at a Linux
Storage, File Systems, and Memory Management (LSF/MM) conference many
years ago, and I think other file systems like btrfs and xfs had
agreed in principle that this was a good thing to do --- but I can't
speak to whether they actually implemented it.
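
To make the heuristic concrete, here is a minimal sketch in C of the
truncate-and-rewrite pattern it papers over (the helper name and the
error handling are mine, purely for illustration):

    #include <fcntl.h>
    #include <unistd.h>

    /*
     * The buggy update-in-place pattern: truncate and rewrite with no
     * fsync(2) before close(2).  On ext4 with auto_da_alloc, the
     * O_TRUNC open causes writeback to be initiated at close(2); on a
     * file system without the workaround, a crash can leave the file
     * empty, with both the old and new contents lost.
     */
    int rewrite_file(const char *path, const char *buf, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (fd < 0)
            return -1;
        if (write(fd, buf, len) != (ssize_t) len) {
            close(fd);
            return -1;
        }
        return close(fd);       /* no fsync(2) before the close */
    }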

It's very likely, though, that file systems that didn't exist in that
time frame, or whose programmers care a lot more about absolute
performance than, say, usability in real world circumstances, wouldn't
have implemented this workaround.  So the fact that it's not perfect
(it narrows the window of vulnerability from 30 seconds to a fraction
of a second, but doesn't close it) and the fact that not all file
systems implement it (I'd be shocked if bcachefs had this feature)
are both good reasons not to depend on it.  Of course, if you use
crappy applications, you as a user may very well be depending on it
without knowing it --- which is *why* auto_da_alloc exists.  :-)

That being said, there are things you could do to speed up dpkg that
are 100% safe; the trade-off, as always, is implementation
complexity.  (The reason why many application programs opened with
O_TRUNC and rewrote a file in place was so they wouldn't have to copy
over extended attributes and POSIX ACLs, because that was Too Hard
and Too Complicated.)  So what dpkg could do is, whenever there is a
file that dpkg would need to overwrite, write it out to
"filename.dpkg-new-$pid" and keep a list of all such files.  After all
of the files are written out, call syncfs(2) --- on Linux, syncfs(2)
is synchronous, although POSIX does not guarantee that the writes will
be written and stable at the time that syncfs(2) returns.  But that
should be OK, since Debian GNU/kFreeBSD is no longer a thing.  Only
after syncfs(2) returns do you rename all of the dpkg-new files to
their final locations on disk.
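
A minimal sketch of that scheme in C, to make the three phases
concrete (the helper name, the flat arrays, and the assumption that
all paths live on a single file system are mine, not dpkg's):

    #define _GNU_SOURCE             /* syncfs(2) is Linux-specific */
    #include <fcntl.h>
    #include <limits.h>
    #include <stdio.h>
    #include <unistd.h>

    /*
     * Phase 1: write every file to "<name>.dpkg-new-<pid>" with no
     * per-file fsync(2).  Phase 2: force a single file system commit
     * with syncfs(2).  Phase 3: only then rename into place, so a
     * crash leaves either the old file or the fully written new one.
     */
    int install_batch(const char *paths[], const char *datas[],
                      const size_t lens[], int n)
    {
        char tmp[PATH_MAX];
        int i, fd;

        if (n < 1)
            return 0;

        for (i = 0; i < n; i++) {
            snprintf(tmp, sizeof(tmp), "%s.dpkg-new-%ld",
                     paths[i], (long) getpid());
            fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
            if (fd < 0)
                return -1;
            if (write(fd, datas[i], lens[i]) != (ssize_t) lens[i] ||
                close(fd) != 0)
                return -1;
        }

        /* Any fd on the target file system will do for syncfs(2). */
        snprintf(tmp, sizeof(tmp), "%s.dpkg-new-%ld",
                 paths[0], (long) getpid());
        fd = open(tmp, O_RDONLY);
        if (fd < 0 || syncfs(fd) != 0)
            return -1;
        close(fd);

        for (i = 0; i < n; i++) {
            snprintf(tmp, sizeof(tmp), "%s.dpkg-new-%ld",
                     paths[i], (long) getpid());
            if (rename(tmp, paths[i]) != 0)
                return -1;
        }
        return 0;
    }

The renames are deliberately last: until they run, a crash leaves the
old files untouched, and the temporaries can simply be cleaned up.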

This is much faster, since you're not calling fsync(2) for each file,
but forcing a file system commit operation just once for the whole
batch.  The cost is more implementation complexity in dpkg.  I'll let
other people decide how to trade off implementation complexity,
performance, and safety.
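
For contrast, the conventional fully safe sequence pays one fsync(2),
and thus potentially one journal commit, per file (again a
hypothetical helper):

    #include <fcntl.h>
    #include <limits.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Write a temporary, fsync(2) it, then rename(2) it into place. */
    int install_one(const char *path, const char *data, size_t len)
    {
        char tmp[PATH_MAX];
        int fd;

        snprintf(tmp, sizeof(tmp), "%s.dpkg-new", path);
        fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd < 0)
            return -1;
        if (write(fd, data, len) != (ssize_t) len ||
            fsync(fd) != 0 || close(fd) != 0)
            return -1;
        return rename(tmp, path);
    }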

Cheers,

                                                - Ted
