[Bug 317781] Re: Ext4 data loss

Chris Newman Thu, 12 Mar 2009 14:46:19 -0700

@Theodore,

As a developer of scalable servers with 25 years' experience, I am fully
aware of the purpose of fsync and fdatasync, and I use them if and only
if the semantics I want are "really commit to disk right now".  To use
them at any other time would be an implementation error.


I further agree that delayed allocation is a good thing.  I believe
application developers who use the first command sequence you describe
above get what they deserve, and that it is a mistake for the filesystem
to perform an implicit sync in that case.

Where I strongly disagree with you is for the open-write-close-rename
call sequence (your second scenario).  It is very common for an
application to need "atomic replace, defer ok" semantics when updating a
file (more common, in fact, than cases where fsync is really needed).
The only way to express that semantic is open-write-close-rename, and
furthermore that semantic is the only useful interpretation of that call
sequence.  Adding an fsync expresses a different and less useful
semantic.  For example, when I do "atomic replace, defer ok" twice in a
flush interval I would expect an optimal filesystem to discard the
intermediate version without ever committing it to disk.  So I find the
workaround you've implemented undesirable as it results in non-optimal
and unnecessary disk commits.
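
For concreteness, here is a minimal sketch of the open-write-close-rename
sequence, written in Python because its os module maps directly onto the
POSIX calls.  The name atomic_replace and the durable flag are made up
for illustration.  With durable=False it asks for "atomic replace, defer
ok"; with durable=True it adds the fsync and so asks for the different
"commit to disk right now" semantic.  Whether the durable=False path
keeps either the old or the new contents across a crash depends on how
the filesystem orders the rename against the data, which is the point
under dispute here.

```python
import os
import tempfile


def atomic_replace(path, data, durable=False):
    """Replace `path` with `data` (bytes) via open-write-close-rename.

    durable=False: no sync is requested; the filesystem may defer the write.
    durable=True:  fsync the new file before renaming it into place.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so that the rename
    # stays within one filesystem. mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        try:
            offset = 0
            while offset < len(data):
                # os.write may write fewer bytes than requested.
                offset += os.write(fd, data[offset:])
            if durable:
                os.fsync(fd)  # the extra step that forces a disk commit
        finally:
            os.close(fd)
        # rename() atomically swaps the name over to the new file.
        os.rename(tmp_path, path)
    except BaseException:
        # Do not leave the temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Calling atomic_replace(path, new_bytes) twice in quick succession is the
"twice in a flush interval" case described above.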

Now, your not-useful interpretation of open-write-close-rename is POSIX
compliant under a narrow interpretation.  But I can interpret any
standard in a not-useful way.  An IMAP server that delivers all new mail
to a mailbox "NEWMAIL" and has no "INBOX" would be strictly compliant
with the spec and also not useful.  Any reasonable IMAP client vendor
will simply state they don't support that server.  And that's exactly
what will happen to ext4, XFS and other filesystems that interpret the
open-write-close-rename call sequence in a not-useful way.  You will
find applications declare your filesystem unsupported because you
interpret a useful call sequence in a not-useful fashion.

The right interpretation of open-write-close-rename is "atomic replace,
defer ok".  There is no reason to spin up the disk or fsync until the
next flush interval.  What's important is that the rename is not
committed until after the file data is committed.
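
The intended behaviour can be modelled with a toy in-memory sketch.  It
is only an illustration of the semantics argued for here, not a
description of how ext4, XFS, or any real filesystem is implemented; the
class ToyFilesystem and all of its methods are hypothetical.

```python
class ToyFilesystem:
    """Toy model of "atomic replace, defer ok". Not a real filesystem."""

    def __init__(self):
        self.disk = {}     # name -> contents that would survive a crash
        self.pending = {}  # name -> newest contents awaiting the next flush
        self.log = []      # order of simulated disk commits

    def atomic_replace(self, name, data):
        # A later replace supersedes an earlier uncommitted one, so the
        # intermediate version never reaches the disk.
        self.pending[name] = data

    def flush(self):
        # Runs once per flush interval.
        for name, data in self.pending.items():
            self.log.append(("data", name, data))  # file data first
            self.log.append(("rename", name))      # rename only afterwards
            self.disk[name] = data
        self.pending.clear()

    def crash(self):
        # Anything not yet flushed is lost; the disk keeps the old version.
        self.pending.clear()
```

In this model a crash before the flush leaves the old file intact, and
two replaces inside one flush interval cost a single data commit and a
single rename.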

If you disagree, I invite you to suggest how you would express "atomic
replace, defer ok" using POSIX APIs when writing an application.

-- 
Ext4 data loss
https://bugs.launchpad.net/bugs/317781