Hi!

[ This was long ago, and the following is from recollection off the
  top of my head plus some mild «git log» crawling. While I think it's
  still an accurate description of past events, interested people can
  probably sift through the various long discussions from the time in
  the bug reports and mailing lists referenced from the FAQ entry,
  which BTW I don't think has been touched since, so it might also be
  in need of a refresh. ]

On Tue, 2024-12-24 at 12:54:28 +0300, Michael Tokarev wrote:
> The no-unsafe-io workaround in dpkg was needed for 2005-era ext2fs
> issues,

The problem showed up with ext4 (not ext2 or ext3), AFAIR when Ubuntu
switched their default filesystem in their installer, and reports
started to come in droves about systems being broken.

For all of its existence (AFAIR) dpkg has performed safe and durable
operations for its own database (not for the database directories),
but it was not doing the same for the installed filesystem. That was
introduced at the time to fix the zero-length file behavior from newer
filesystems.

> where a power-cut in the middle of filesystem metadata
> operation (which dpkg does a lot) might result in an inconsistent
> filesystem state.  This workaround slowed down dpkg operations
> quite significantly (and has been criticised due to that a lot,
> the difference is really significant).

I do think the potential for the zero-length files behavior is a
misfeature of newer filesystems, but I do agree that the fsync()s
are the only way to guarantee the properties dpkg expects from the
filesystem. So I don't consider that a workaround at all.

My main objection was/is with how upstream Linux filesystem
maintainers characterized all this. It looked like they were
disparaging userland application writers in general as incompetent
for not performing such fsync()s, but then when one adds them, those
programs become extremely slow, and one needs to start using
filesystem or OS specific APIs and rather unnatural code patterns to
regain some semblance of the previous performance.
I don't think this upstream perspective has changed much, given that
the derogatory O_PONIES subject still comes up from time to time.

> The workaround is to issue fsync() after almost every filesystem
> operation, instead of after each transaction as dpkg did before.
> 
> Once again: dpkg has always been doing "safe io", the workaround
> was needed for ext2fs only, - it was the filesystem which was
> broken, not dpkg.

The above also seems quite confused. :) dpkg has always done fsync()
for both its status file and for every in-core status modification
via its own journaled support for it (in the /var/lib/dpkg/updates/
directory).

What was implemented at the time was to add missing fsync()s for
database directories, and fsync()s for the unpacked filesystem objects.

AFAIR:

  * We first implemented that via fsync()s to individual files
    immediately after writing them on unpack, which had acceptable
    performance on filesystems such as ext3 (which I do recall using
    at the time) but was pretty terrible on ext4.
  * Then we reworked the code to defer and batch all the fsync()s for
    a specific package after all the file writes, and before the renames,
    which was a bit better but not great.
  * Then after a while we tried to use a single sync(2) before the
    package file renames, which implied system-wide syncs and thus
    terrible performance for unrelated filesystems (such as USB
    drives or network mounts), so that got subsequently disabled.
  * Then --force-unsafe-io was added to cope with workloads where the
    safety was not required, or for people who preferred performance
    over safety, on those same new filesystems where the choice was
    performance xor safety.
  * Then, after suggestions from Linux filesystem developers, we
    switched to initiating asynchronous writeback immediately after a
    file unpack so as not to block (via Linux
    sync_file_range(SYNC_FILE_RANGE_WRITE)), and then added a
    writeback barrier where the previous (disabled) sync(2) was (via
    Linux sync_file_range(SYNC_FILE_RANGE_WAIT_BEFORE)), so that by
    the time of the subsequent fsync(2) the writeback would have
    already been done, and the fsync(2) would only imply a
    synchronization point.
  * Then for non-Linux instead of the SYNC_FILE_RANGE_WRITE, a
    posix_fadvise(POSIX_FADV_DONTNEED) was added.
  * Then after a bit the disabled sync(2) code got removed.
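
To make the last few steps above more concrete, here is a minimal
sketch of the pattern: start writeback right after writing, and later
wait for it, fsync(2) and rename. This is not dpkg's actual code; the
function names unpack_start() and unpack_finish() are made up for
illustration, and error handling is reduced to the bare minimum.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write the data to a temporary path and ask the
 * kernel to start writing it back, without waiting for completion. */
int
unpack_start(const char *tmp_path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd;

	assert(tmp_path != NULL);

	fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

#ifdef __linux__
	/* Initiate asynchronous writeback of the whole file (nbytes 0
	 * means "up to the end"). Best effort, so the result is ignored. */
	(void)sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
#else
	/* The non-Linux counterpart mentioned above; also best effort. */
	(void)posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
#endif

	return close(fd);
}

/* Hypothetical helper: run once all files of a package are written.
 * Waits for the earlier writeback, makes the data durable, and only
 * then renames the file into place. */
int
unpack_finish(const char *tmp_path, const char *final_path)
{
	int fd;

	assert(tmp_path != NULL && final_path != NULL);

	fd = open(tmp_path, O_WRONLY);
	if (fd < 0)
		return -1;

#ifdef __linux__
	/* Writeback barrier: wait for writeback that is already under way. */
	(void)sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE);
#endif

	/* Ideally most of the data is on disk by now, so this should
	 * mostly act as a synchronization point. */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	if (close(fd) < 0)
		return -1;

	return rename(tmp_path, final_path);
}
```

The idea would be to call unpack_start() for every file in a package,
and only afterwards unpack_finish() for each of them, so that the
kernel gets a chance to do the writes in the background in between.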

> Today, doing an fsync() really hurts, - with SSDs/flash it reduces
> the lifetime of the storage, for many modern filesystems it is a
> costly operation which bloats the metadata tree significantly,
> resulting in all further operations becoming inefficient.
> 
> How about turning this option - force-unsafe-io - to on by default
> in 2025?  That would be a great present for 2025 New Year! :)

Given that the mail is based on multiple incorrect premises, :) and
that I don't see any tests or data backing up that the fsync()s are
no longer needed for safety in general, I'm going to be extremely
reluctant to even consider disabling them by default on the main
system installation, TBH, and would ask for substantial proof that
this would not damage user systems, and even then I'd probably still
feel rather uneasy about it.

And in fact, AFAIR dpkg is still missing fsync()s for filesystem
directories, which I think might have been the cause of reported
leftover files (shared libraries specifically) that never got removed
and then caused problems. Still need to prep a testing rig for this
and try to reproduce that with the WIP branch I've got around.
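
For reference, the general shape of a directory fsync() is something
like the sketch below. It is only an illustration of the technique and
not the code from that WIP branch; sync_dir() and remove_durably() are
hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: flush the entries of a directory to stable
 * storage, so that creations, renames or removals inside it survive
 * a crash. */
int
sync_dir(const char *dir_path)
{
	int fd, rc;

	assert(dir_path != NULL);

	fd = open(dir_path, O_RDONLY | O_DIRECTORY);
	if (fd < 0)
		return -1;

	rc = fsync(fd);
	if (close(fd) < 0)
		rc = -1;

	return rc;
}

/* Hypothetical helper: remove a file and then make the removal
 * durable by syncing the directory that contained it. */
int
remove_durably(const char *path, const char *dir_path)
{
	assert(path != NULL);

	if (unlink(path) < 0)
		return -1;

	return sync_dir(dir_path);
}
```

Without the directory sync, the unlink(2) might only exist in memory
when the power goes out, and the file could reappear afterwards.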


OTOH what I also have queued is to add a new --force-reckless-io, to
suppress all fsync()s (including the ones for the database), which
would be ideal to be used on installers, chroots or containers (or for
people who prefer performance over safety, or have lots of backups and
are aware of the trade-offs :). But that has been kind of blocked on
adding database tainting support, because the filesystem contents can
always be checked via digests or can be reinstalled, but if your
database is messed up it's rather hard to know that. The problem is
that because installers would want to use that option, we'd end up
with tainted end systems, which would be wrong. Well, or the taint would
need to be manually removed (forcing external programs to reach into
the dpkg database). But the above and --force-unsafe-io _could_ be
easily enabled by default in chroot mode (--root) w/o tainting anything
(I've also got some code to make that possible). And I've got on my
TODO to add integrity tracking for the database so that damage can be
more easily detected, which could perhaps make the tainting less of an
issue.

(So I'm sorry, but it looks like you'll not get your 2025 present. :)

Regards,
Guillem
