August 9, 2025 at 12:27 AM, "Milos Nikic" <nikic.mi...@gmail.com> wrote:
>
> Hi all,
>
> Since my last email about the journaling subsystem, I’ve done a lot of work
> on it.
> Here’s what’s new over the past few weeks:
>
> **Integration & safety**
>
> Journaling now writes directly to raw disk space *outside* ext2fs-managed
> blocks.
> No more feedback loops or early boot write limitations.
>
> journal_init() now reads its configuration (offset, size, etc.) directly from
> four reserved fields in the ext2 superblock.
> If those fields are unset or invalid, journaling stays off and all hooks are
> no-ops.
>
> Hooks into libdiskfs remain minimal and isolated; core paths are unchanged
> when journaling is disabled.
>
> **Filtering & replay improvements**
>
> - Added a dedicated policy module to filter out noisy events (/tmp, build
>   outputs, etc.).
>
> - Stronger inode fingerprinting to prevent misapplying updates after inode
>   reuse.
>
> - Replay is now dual-path: inode-based first, falling back to path-based when
>   needed.
>
> - “Best effort” file recreation under /restore/[timestamp] with correct
>   metadata when files vanish after a crash.
>
> **Two tricky problems took significant work:**
>
> 1. **Path recovery:** cred->po->path often gives useful file paths, but
>    sometimes needs sanitizing or is imprecise. Combined with the current
>    name, it’s often enough to reconstruct missing files. Replay now uses
>    path-based recovery when inode-based recovery fails.
>
> 2. **Aggressive inode reuse in ext2:** After deletion (say at fsck time, or
>    any time really) the same inode number may be reassigned to a completely
>    different file after reboot. Fingerprinting ensures we never apply stale
>    updates to the wrong file.
>
> **Testing & results**
>
> - Survived repeated hard reboots under concurrent create/delete stress.
>
> - In chaos tests where fsck over-deleted files, journaling replay brought
>   them back as expected.
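
To check my reading of the fingerprinting and dual-path replay described above, here is a minimal C sketch. Every name in it (`ino_fingerprint`, `journal_record`, `choose_replay_route`) is made up for illustration and not taken from the patch, and the choice of fingerprint fields (a generation counter plus a timestamp) is only my assumption about what such a fingerprint might contain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fingerprint: identifies one "life" of an inode number.
   The field choice is an assumption, not the patch's actual layout. */
struct ino_fingerprint {
    uint32_t ino;         /* inode number */
    uint32_t generation;  /* assumed to change when the number is reused */
    uint32_t ctime_sec;   /* timestamp captured when the record was written */
};

/* Hypothetical journal record: fingerprint plus a best-effort path. */
struct journal_record {
    struct ino_fingerprint fp;
    const char *path;     /* may be NULL if no usable path was recorded */
};

enum replay_route { REPLAY_BY_INODE, REPLAY_BY_PATH, REPLAY_SKIP };

/* Two fingerprints match only if every field agrees. */
static bool fingerprint_matches(const struct ino_fingerprint *a,
                                const struct ino_fingerprint *b)
{
    return a->ino == b->ino
        && a->generation == b->generation
        && a->ctime_sec == b->ctime_sec;
}

/* Pick a replay route for one record.
   `live` is the fingerprint of whatever currently occupies rec->fp.ino,
   or NULL if that inode is not allocated. */
static enum replay_route choose_replay_route(const struct journal_record *rec,
                                             const struct ino_fingerprint *live)
{
    assert(rec != NULL);

    /* Inode-based first: only when the inode is provably the same file. */
    if (live != NULL && fingerprint_matches(&rec->fp, live))
        return REPLAY_BY_INODE;

    /* Fall back to the recorded path if it looks usable (absolute). */
    if (rec->path != NULL && rec->path[0] == '/')
        return REPLAY_BY_PATH;

    /* Do no harm: without a match or a path, leave things alone. */
    return REPLAY_SKIP;
}
```

With this shape, a record whose inode number was handed to a different file after reboot never reaches the inode path; it is either replayed by path or skipped. Is that roughly the behavior the patch implements?
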
>
> **Other changes**
>
> - Removed unused async paths, watchers, and threads; code size is still
>   larger than before, but cleaner.
>
> - Memory use during replay is controlled via fixed-size arenas.
>
> **Important scope note**
>
> This is **not** a replacement for fsck, ext4-style transactions, or a strong
> consistency guarantee.
>
> It’s a *best-effort*, *do-no-harm* crash-recovery helper that complements
> fsck by restoring metadata and paths opportunistically.
>
> When disabled or misconfigured, it is inert and has zero impact on normal
> operation.
>
> **Future work ideas**
>
> - Better path preservation to improve replay accuracy.
>
> - Per-node timelines for smarter change grouping.
>
> - Integration with ext tooling to support formatting with journaling fields
>   and an 8 MiB carve-out.
>
> - Exporting replay stats via a /proc-like interface.
>
> This patch is large (~4.6 kLOC) but self-contained; most of it is in new
> libdiskfs/journal_*.c files.
> If preferred, I can break it into a smaller series.
>
> Let me know your thoughts!
>
> Thanks,

This is awesome! Thanks for the contribution!

Is this still the best guide for how to use your best-effort journal?
https://lists.gnu.org/archive/html/bug-hurd/2025-07/msg00048.html
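
Related to setup: below is how I picture the "unset or invalid means journaling stays off" check in C. This is only a sketch under assumptions. The struct, the magic value, and the meaning of the four fields are hypothetical (the email only says the fields carry offset, size, etc.), and the example numbers assume 4 KiB blocks, so an 8 MiB carve-out is 2048 blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker value ("JRNL" in ASCII); not from the patch. */
#define JOURNAL_MAGIC_HYPOTHETICAL 0x4A524E4Cu

/* Hypothetical view of the four reserved superblock fields. */
struct journal_cfg {
    uint32_t magic;          /* marks the fields as intentionally set */
    uint32_t offset_blocks;  /* first block of the journal area */
    uint32_t size_blocks;    /* length of the journal area */
    uint32_t flags;          /* unused in this sketch */
};

/* Return true only if the journal area is configured, lies entirely
   outside the blocks managed by the filesystem, and fits on the device.
   fs_blocks:  number of blocks the filesystem itself manages
   dev_blocks: total number of blocks on the underlying device */
static bool journal_cfg_valid(const struct journal_cfg *cfg,
                              uint64_t fs_blocks, uint64_t dev_blocks)
{
    assert(cfg != NULL);

    if (cfg->magic != JOURNAL_MAGIC_HYPOTHETICAL)
        return false;                 /* unset: journaling stays off */
    if (cfg->size_blocks == 0)
        return false;                 /* empty area is useless */
    if (cfg->offset_blocks < fs_blocks)
        return false;                 /* would overlap filesystem blocks */

    /* Widen to 64 bits so offset + size cannot wrap around. */
    uint64_t end = (uint64_t)cfg->offset_blocks + cfg->size_blocks;
    return end <= dev_blocks;         /* must fit on the device */
}
```

If the real checks differ, or the guide linked above already covers how to fill in those fields, a pointer would be appreciated.
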