On Wed, March 18, 2009 11:43, Bob Friesenhahn wrote:
> On Wed, 18 Mar 2009, Joerg Schilling wrote:
>>
>> The problem in this case is not whether rename() is atomic but whether
>> the file that replaces the old file in an atomic rename() operation is
>> in a stable state on the disk before calling rename().
>
> This topic is quite disturbing to me ...
>
>> The calling sequence of the failing code was:
>>
>> f = open("new", O_WRONLY|O_CREAT|O_TRUNC, 0666);
>> write(f, "dat", size);
>> close(f);
>> rename("new", "old");
>>
>> The only granted way to have the file "new" in a stable state on the
>> disk is to call:
>>
>> f = open("new", O_WRONLY|O_CREAT|O_TRUNC, 0666);
>> write(f, "dat", size);
>> fsync(f);
>> close(f);
>
> But the problem is not that the file "new" is in an unstable state.
> The problem is that it seems that some filesystems are not preserving
> the ordering of requests. Failing to preserve the ordering of
> requests is fraught with peril.

Only in very limited cases. For example, writing the blocks of a file
can occur in any order, so long as no block is written twice and so long
as no reads are performed. It simply doesn't matter in what order those
blocks go to disk. As soon as somebody reads one of the blocks written,
then some of the ordering becomes important.

You're trying, I think, to argue from first principles; may I suggest
that a lot is known about filesystem (and database) semantics, and that
we will get further if we work within what's already known about that,
rather than trying to reinvent the wheel?

> POSIX does not care about "disks" or "filesystems". The only correct
> behavior is for operations to be applied in the order that they are
> requested of the operating system. This is a core function of any
> operating system.

Is this what it actually says in the POSIX documents? Or in any other
formal definition of a filesystem?

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
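
P.S. For anyone who wants the sequence under discussion in concrete form, here is a minimal C sketch of Joerg's second variant (write, fsync(), close()) followed by the rename() from his first. The function name replace_file_durably() and its parameters are made up for illustration and come from nobody's actual code; the error handling is my own assumption about what a careful caller would want. It does not settle the ordering question above, and on some systems making the rename itself durable may additionally require an fsync() on the containing directory, which this sketch leaves out.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): write len bytes to tmp_path,
 * force them to stable storage with fsync(), then rename tmp_path over
 * final_path. Returns 0 on success, -1 on failure with errno set. */
int replace_file_durably(const char *tmp_path, const char *final_path,
                         const void *data, size_t len)
{
    int saved;

    assert(tmp_path != NULL && final_path != NULL);

    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    /* write() may be interrupted or write fewer bytes than asked: loop. */
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    /* The step missing from the failing sequence: flush before rename(). */
    if (fsync(fd) != 0)
        goto fail;

    if (close(fd) != 0) {
        saved = errno;
        unlink(tmp_path);
        errno = saved;
        return -1;
    }

    /* Replace the old name only once the new contents are on disk. */
    return rename(tmp_path, final_path);

fail:
    saved = errno;
    close(fd);
    unlink(tmp_path);   /* do not leave a partial temporary file behind */
    errno = saved;
    return -1;
}
```

A caller would use it as replace_file_durably("new", "old", "dat", 3) and check for a 0 return. The temporary file should live on the same filesystem as the target, since rename() does not work across filesystems.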