Hi Ted,
Quoting Theodore Ts'o (2025-06-24 21:04:56)
> On Tue, Jun 24, 2025 at 04:53:46PM +0200, Anton Khirnov wrote:
> > Package: e2fsprogs
> > Version: 1.47.2-1+b1
> > Severity: normal
> > 
> > Dear Maintainer,
> > as a part of my backup system, I create read-only LVM snapshots of ext4
> > partitions. On my newest machine, mounting these snapshots sometimes
> > fails with
> >   fsconfig() failed: Structure needs cleaning.
> > accompanied by
> >   ext4_mark_recovery_complete:6264: comm mount: Orphan file not empty on read-only fs.
> > in dmesg. Upon investigating, I found out that the orphan_file fs
> > feature, which recently became enabled by default, is the culprit.
> > 
> > This behaviour seems suboptimal to me. Would it be possible to make such
> > filesystems mountable? If not, perhaps this feature should not be on by
> > default, or at least there should be a warning about this.
> 
> Hi Anton,
> 
> When you say read-only snapshots, do you mean that you created with
> some command like this:
> 
> # lvcreate --size 100M --name snap --snapshot --permission r /dev/vg/source

yes, this is what I meant

> I'd need to see your kernel messages to be certain, but this
> requirement of needing to be able to write to the snapshot device is
> there regardless of whether the orphan file feature is enabled or not.
> So I'm not sure how you came to the conclusion that the orphan_file
> feature is "at fault", but I'm pretty sure that's not the case.

I assumed orphan_file is the cause because
1) The kernel error message mentioned it.
2) It is enabled only on the failing machine.
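
A minimal sketch of how point 2 can be checked, assuming the feature list
is taken from the "Filesystem features:" line of `dumpe2fs -h`. The helper
name has_feature and the device path in the usage comment are hypothetical:

```shell
# Succeeds if the named feature appears in the "Filesystem features:" line
# of `dumpe2fs -h` output supplied on stdin.
has_feature() {
    awk -v want="$1" '
        /^Filesystem features:/ {
            # Fields 1 and 2 are the label; features start at field 3.
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage (hypothetical device path; needs read access to the device):
#   dumpe2fs -h /dev/vg/source 2>/dev/null | has_feature orphan_file && echo enabled
```

The same helper can be pointed at a snapshot device to see whether
orphan_present is still set on it.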

The relevant kernel messages are just these:
[1798344.446943] EXT4-fs (dm-9): write access unavailable, skipping orphan cleanup
[1798344.446949] EXT4-fs (dm-9): recovery complete
[1798344.446952] EXT4-fs error (device dm-9): ext4_mark_recovery_complete:6264: comm mount: Orphan file not empty on read-only fs.
[1798344.446970] EXT4-fs (dm-9): mount failed

I have been running this procedure daily on 30+ ext4 filesystems for 5+
years, and NEVER had the mount fail in this manner.

Also, I investigated some more after filing the bug report, and lvcreate
is apparently supposed to freeze the filesystem while creating the
snapshot. Looking at ext4_freeze() in fs/ext4/super.c it seems like it's
marking the snapshot as not needing recovery unconditionally, but only
clearing orphan_present if the orphan file is empty. If I understand
this correctly, this might be the reason why it never fails without
orphan_file, but does fail with it.

> Finally, why are you creating read-only LVM snapshots?  The whole
> point of snapshots is that you can allow read-write snapshots, and
> they won't affect the original source volume.  Just create the
> snapshots without the "--permission r" option, and then you can mount
> the file system with -o ro, and things will work fine.  Yes, we may
> need to replay the journal before continuing but that doesn't take a
> lot of disk space.  Using --size 100M should be *more* than enough for
> pretty much any file system.

For backups, I want the fs to be static. Back when I wrote this code, it
seemed reasonable to just make the snapshot read-only, so nothing
running on the system could modify it accidentally. As it has always
worked fine until now, I had no reason to dwell on it any further. I
suppose I could drop the '--permission r' option and mount the snapshot
with -o ro instead, but the fact that any change is needed at all still
strikes me as not quite right.
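
A minimal sketch of that change, assuming the same lvcreate options as in
the command quoted above minus '--permission r'. The function names, the
DRY_RUN switch, and all volume and mount-point names are hypothetical:

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a writable snapshot, then mount it read-only.
# Arguments: volume group, origin LV, snapshot name, mount point.
snapshot_and_mount() {
    vg=$1 origin=$2 snap=$3 mnt=$4
    run lvcreate --size 100M --name "$snap" --snapshot "/dev/$vg/$origin" &&
    run mount -o ro "/dev/$vg/$snap" "$mnt"
}

# Usage (hypothetical names; requires root when DRY_RUN is not set):
#   DRY_RUN=1; snapshot_and_mount vg source snap /mnt/snap
```

With this arrangement the snapshot device itself stays writable, so the
kernel can finish any pending recovery at mount time, while the mounted
tree is still read-only for everything else on the system.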

Cheers,
-- 
Anton Khirnov
