Hi,

I recently had my server's filesystem implode, and I'm currently in the process of cleaning it up. It had widespread corruption in files and directories scattered across the filesystem, though everything affected had been changed fairly recently. Directories appeared corrupted or truncated, various files showed up as piles of NULs, and 5000+ files and directories ended up in lost+found. I observed this corruption shortly after a reboot into 4.0.2 (from a previous kernel of 3.16), with ext4 noticing an inconsistency and mounting the filesystem read-only. The underlying disks had no errors.

Having read about the corruption issue fixed by d2dc317d564a46dfc683978a2e5a4f91434e9711 ("ext4: fix data corruption caused by unwritten and delayed extents"), I think it sounds like a plausible cause. Can that bug strike both file data and directory data, assuming all of that data ended up grouped with a delayed extent? Would it manifest as corrupted directories and files filled with NULs? The system is a 72-way server on which I was doing piles of parallel git pulls and builds, so hitting a race seems plausible.

I'm trying to track down potential causes of this so that I can feel comfortable trusting that system again.

Thanks,
Josh Triplett