On Sun, Sep 15, 2024 at 12:22 AM Jonathan Thornburg <dr.j.thornb...@gmail.com> wrote:
>
> Does OpenBSD support any file systems with built-in checksums to
> (try to) ensure metadata and/or data integrity in the face of "bit rot"
> disk (or memory/cpu/USB) errors? I'm not looking for ZFS-style storage
> pools or logical volume management, "just" checksums to catch silent
> metadata and/or data corruption.
>
> Softraid 1, 5, or 1C could in theory do this, but with a large space
> overhead (a factor of 2 to detect errors, or 3 to correct errors).
> And, the current (7.5) man pages don't mention any option to have each
> read read all the chunks and verify that they're identical.
>
> And a related question: I have a pool of ~10 external USB3 backup
> disks (all consumer-grade WD or Seagate 2.5" spinning rust, either
> 2TB or 4TB capacity each), all currently set up with FFS2 filesystems
> on top of softraid crypto (/bioctl -c C/). Each backup is to a single
> disk, written with (roughly speaking)
>     rsync -aHESvv --delete /home/ /mnt/home/
> Each disk thus has slightly different contents depending on how
> recently I did a backup to that disk, but the vast majority of the
> files (those that haven't changed recently) should be identical
> across disks.
>
> [Before anyone asks: Yes, I regularly rotate some of the disks offsite.
> And yes, I regularly restore files "in anger".]
>
> Each backup disk holds somewhat more than 1e13 bits, so at an unrecoverable
> bit error rate of 1e-14 or 1e-15 for consumer disks there's a non-trivial
> chance of a bit error somewhere in my backup pool.
>
> Thinking about how to detect/correct bit-rot in these backups, it
> occurs to me that I could hack up some Perl to walk the filesystem
> tree on a mounted backup disk, /stat()/ and read each file, and build
> a database of (pathname, inode mtime, checksum) tuples. (I could either
> ignore symlinks, or checksum the result of /readlink()/.)
> Then given such databases for a bunch of disks, a bit more Perl could
> read all the databases, find all the files with matching pathname and
> inode mtime (so that the contents should be the same, given that my
> usage of /rsync/ preserves /mtime/), look for differing checksums, and
> for any differences, majority-vote the checksums to identify which copy
> or copies is in error.
>
> But before I reinvent the wheel, can anyone point me to software
> which already does this? Bonus points if the software is already
> in ports.

Perhaps you would find mtree(8) helpful.

> Thanks,
> --
> -- "Jonathan Thornburg [remove -color to reply]"
>    <dr.j.thornb...@gmail-pink.com>
>    on the west coast of Canada
> "The programmers outside looked from Web 2.0 firm to AI company, and from
> AI company to Web 2.0 firm, and from Web 2.0 firm to AI company again;
> but already it was impossible to say which was which."
>   -- /Ars Technica/ comment by /ubercurmudgeon/, 2024-05-09
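
For the do-it-yourself route described in the quoted message, here is a
rough sketch of the idea in Python rather than Perl. It is only an
illustration under some assumptions, not a tested tool: the function names
(build_database, find_suspects) are made up for this example, SHA-256 is an
arbitrary choice of checksum, mtimes are compared at whole-second
resolution, symlinks and special files are simply ignored, and the
"databases" are kept in memory instead of being written to disk.

```python
import hashlib
import os
import stat
from collections import Counter, defaultdict


def checksum_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_database(root):
    """Walk root and map (relative path, mtime in seconds) -> checksum.

    Only regular files are recorded (assumption: symlinks are ignored).
    """
    db = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode):
                continue
            rel = os.path.relpath(full, root)
            db[(rel, int(st.st_mtime))] = checksum_file(full)
    return db


def find_suspects(databases):
    """Compare per-disk databases and majority-vote differing checksums.

    databases maps a disk label to the dict returned by build_database().
    Returns a list of (key, majority_checksum, suspect_disks) for every
    (path, mtime) key whose checksums disagree. On a tie there is no
    majority, so majority_checksum is None and every disk holding that
    key is listed as suspect.
    """
    seen = defaultdict(dict)  # key -> {disk label: checksum}
    for disk, db in databases.items():
        for key, digest in db.items():
            seen[key][disk] = digest

    report = []
    for key, by_disk in sorted(seen.items()):
        counts = Counter(by_disk.values())
        if len(counts) < 2:
            continue  # all copies agree, or only one copy exists
        (top, top_n), (_, second_n) = counts.most_common(2)
        if top_n == second_n:
            report.append((key, None, sorted(by_disk)))
        else:
            bad = sorted(d for d, h in by_disk.items() if h != top)
            report.append((key, top, bad))
    return report
```

One would call build_database() once per mounted backup disk (for example
on /mnt/home), collect the results in a dict keyed by a disk label, and
pass that dict to find_suspects(). Note that a file present on only two
disks can be flagged as differing but not voted on, which matches the
"factor of 3 to correct" point in the quoted message.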