On Thu, Dec 26, 2024 at 01:19:34PM -0500, Michael Stone wrote:
> Further reading: look at the auto_da_alloc option in ext4. Note that it
> says that doing the rename without the sync is wrong, but there's now a
> heuristic in ext4 that tries to insert an implicit sync when that
> anti-pattern is used (because so much data got eaten when people did
> the wrong thing). By leaning on that band-aid dpkg might get away with
> skipping the sync, but doing so would require assuming a filesystem for
> which that implicit guarantee is available. If you're on a different
> filesystem or a different kernel all bets would be off. I don't know
> how much difference skipping the fsync's makes these days if they get
> done implicitly.
Note that it's not a sync, but rather, under certain circumstances, we initiate writeback --- but we don't wait for it to complete before allowing the close(2) or rename(2) to complete. For close(2), we will initiate writeback on close if the file descriptor was opened using O_TRUNC and a truncate took place to throw away the previous contents of the file. For rename(2), if you rename on top of a previously existing file, we will initiate the writeback right away.

This was a tradeoff between safety and performance, made because there were an awful lot of buggy applications out there which didn't use fsync, and the application programmers greatly outnumbered the file system programmers. It was a compromise discussed at a Linux Storage, File Systems, and Memory Management (LSF/MM) conference many years ago, and I think other file systems like btrfs and xfs had agreed in principle that this was a good thing to do --- but I can't speak to whether they actually implemented it. It's very likely, though, that file systems which didn't exist in that time frame, or which were written by programmers who care a lot more about absolute performance than, say, usability in real-world circumstances, wouldn't have implemented this workaround.

So the fact that it's not perfect (it narrows the window of vulnerability from 30 seconds to a fraction of a second, but it's certainly not perfect) and the fact that not all file systems implement it (I'd be shocked if bcachefs had this feature) are both good reasons not to depend on it. Of course, if you use crappy applications, you as a user may very well be depending on it without knowing it --- which is *why* auto_da_alloc exists. :-)

That being said, there are things you could do to speed up dpkg which are 100% safe; the trade-off, as always, is implementation complexity. (The reason why many application programs opened with O_TRUNC and rewrote a file was so they wouldn't have to copy over extended attributes and POSIX ACLs, because that was Too Hard and Too Complicated.)

So what dpkg could do is, whenever there is a file it would need to overwrite, write it out to "filename.dpkg-new-$pid" and keep a list of all the files. After all of the files are written out, call syncfs(2) --- on Linux, syncfs(2) is synchronous, although POSIX does not guarantee that the writes will be written and stable at the time that syncfs(2) returns. But that should be OK, since Debian GNU/kFreeBSD is no longer a thing. Only after syncfs(2) returns do you rename all of the dpkg-new files to their final locations on disk.

This is much faster, since you're not calling fsync(2) for each file, but only forcing a file system commit operation once. The cost is more implementation complexity in dpkg. I'll let other people decide how to trade off implementation complexity, performance, and safety.

Cheers,

					- Ted
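P.S. To make the batching idea concrete, here's a rough sketch in C. It's untested and deliberately simplistic --- the file list is hard-coded, short writes and most error handling are glossed over, and a real dpkg would also have to preserve ownership, permissions, and xattrs, and would need one syncfs(2) call per file system it touched --- but it shows the shape of the three phases: write everything, one commit, then the renames.

#define _GNU_SOURCE             /* for syncfs(2) on Linux */
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define NFILES 2

int main(void)
{
	/* Illustrative stand-ins for the files a package would overwrite. */
	const char *targets[NFILES]  = { "/usr/bin/foo", "/usr/share/foo/data" };
	const char *contents[NFILES] = { "new foo binary\n", "new foo data\n" };
	char tmpname[NFILES][PATH_MAX];
	pid_t pid = getpid();
	int i, fd;

	/* Phase 1: write every file out as "filename.dpkg-new-$pid",
	 * with no per-file fsync(2). */
	for (i = 0; i < NFILES; i++) {
		snprintf(tmpname[i], sizeof(tmpname[i]), "%s.dpkg-new-%ld",
			 targets[i], (long) pid);
		fd = open(tmpname[i], O_WRONLY | O_CREAT | O_EXCL, 0644);
		if (fd < 0)
			return 1;
		if (write(fd, contents[i], strlen(contents[i])) < 0)
			return 1;       /* short writes ignored for brevity */
		close(fd);
	}

	/* Phase 2: a single syncfs(2) forces one file system commit
	 * covering all of the writes above.  Any fd on the file system
	 * will do; this assumes all the files live on one file system. */
	fd = open(tmpname[0], O_RDONLY);
	if (fd < 0 || syncfs(fd) < 0)
		return 1;
	close(fd);

	/* Phase 3: only now rename the new files into place.  Each
	 * rename(2) is atomic, and the data it exposes is already
	 * stable on disk. */
	for (i = 0; i < NFILES; i++)
		if (rename(tmpname[i], targets[i]) < 0)
			return 1;
	return 0;
}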