Hi Ludo,

Sorry for the delay; I'm currently very busy. A short summary follows:

> That’s on the bare metal, right?  It does look like the file system was
> indeed in a bad state and that we’re just confirming it?

Yes, but it would have stayed in the bad state forever because our e2fsck was too old.

Here is what I think happened (preliminary, but it should be close enough):

ext4 has code that handles "orphan" inodes. Those are inodes that no directory entry points to anymore. For example, if you open a file "A", then unlink it and keep it open, the corresponding inode becomes an orphan, but it cannot be garbage-collected yet (since you are still using it). Once everyone has closed the orphan, ext4 gets rid of it entirely and frees the payload extents.

However, in case the computer crashes before that happens, ext4 also records the set of orphans somewhere ON DISK. That way it can find the orphans later (after booting again) and free them.

In a recent ext4 update in the kernel, the orphan handling grew a new option (I think it is on by default) that stores this set of orphans in a regular file, instead of in some weird metadata in the superblock as it did before.

Now, if you remount a filesystem read-only, the kernel cannot (well, should not) actually update the contents of that orphan file (since it's a regular file and you said "read-only" :P), so it can happen that the recorded set of orphans is incorrect. In such a case, the kernel marks the filesystem as dirty. (Earlier, it would just fail the read-only remount, but we didn't see that anymore.)

Then the (old) e2fsck comes along and sees some weird floating inodes, but it doesn't know what they are, so it leaves them alone and clears the damage flag. Eventually we make some more orphans, we remount the filesystem read-only, and there we go: an endless filesystem corruption loop.

Anyway, since e2fsck was updated, the machine has had no more problems (so far...).

Docs: <https://www.kernel.org/doc/Documentation/filesystems/ext4/orphan.rst>
Kernel bugfix: <https://lore.kernel.org/all/20240611142704.14307-1-luis.henriq...@linux.dev/>
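
PS: In case the "orphan" part is unclear, here is a minimal sketch of how such an inode comes about. It assumes a POSIX system and Python; the scratch file from `tempfile` and the function name `make_orphan` are only for illustration and have nothing to do with the affected machine.

```python
import os
import tempfile


def make_orphan():
    """Create a file, unlink it while it is still open, and report what is left."""
    fd, path = tempfile.mkstemp()  # illustrative scratch file, not a real system path
    try:
        os.write(fd, b"payload")
        os.unlink(path)  # remove the only directory entry pointing at the inode
        links = os.fstat(fd).st_nlink  # link count of the still-open inode
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 7)  # contents remain readable through the open descriptor
        return links, data, os.path.exists(path)
    finally:
        os.close(fd)  # last user gone: the filesystem may now free the inode


if __name__ == "__main__":
    links, data, still_listed = make_orphan()
    print(links, data, still_listed)
```

Between the `os.unlink` and the `os.close`, the inode has no name but is still in use; that is the state ext4 has to record on disk so that it can clean up after a crash.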