>>>>> "jd" == Jim Dunham <james.dun...@sun.com> writes:
jd> It is my understanding that the ZFS intent log (ZIL) satisfies
jd> POSIX requirements for synchronous transactions, thus
jd> filesystem consistency.

maybe ``file consistency'' would be clearer. When you say filesystem consistency, people imagine their pools won't import, which I think isn't what you're talking about. Databases rely on the ZIL to keep their data files internally consistent, and MTAs rely on it to keep their queue directories consistent: ``file consistency'' meaning the insides of a file must be consistent with the rest of the insides of the same file, and they won't be without the ZIL.

so, for example, in an imaginary better world where virtual machine software didn't break all kinds of sync and barrier rules and the ZIL were the only issue, disabling the ZIL on the Host could leave the filesystems of virtual Guests inconsistent, refusing to import or needing drastic fsck, after the Host lost power or when brought up from the SNDR-replicated copy of the Host. The Host filesystem and its replica, though, would always stay clean and mountable with or without the ZIL.

The ZIL is stored on the disk, never in RAM as your earlier message suggested, so it should be replicated along with everything else, shouldn't it? unless you are using a slog and leave the slog outside replication, but in that case it should be impossible to import the pool on the secondary, because importing with missing slogs doesn't work yet. so I'm not sure what's happening to you.

Are you actually observing violation of POSIX consistency ``suggestions'' w.r.t. fsync() or O_DSYNC on the secondary? Or are you talking about close-to-open? That is, you close() a file, wait for the close to return, break replication, and the file does not appear on the secondary? What's breaking exactly?

jd> A simple test I performed to verify this, was to append to a
jd> ZFS file (no synchronous filesystem options being set) a
jd> series of blocks with a block order pattern contained
jd> within.
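for anyone wanting to repeat this, the append step might look something like the sketch below. It is not Jim's actual program: the 512-byte block size, the 8-byte sequence-number header, and the function names are all assumptions made up for illustration. The dsync switch corresponds to the O_DSYNC retest he describes further down.

```python
import os
import struct

BLOCK_SIZE = 512  # hypothetical block size; the original post does not give one


def make_block(seq, block_size=BLOCK_SIZE):
    """Return one block whose first 8 bytes hold its sequence number."""
    header = struct.pack(">Q", seq)
    filler = bytes([seq % 256]) * (block_size - len(header))
    return header + filler


def append_blocks(path, start, count, dsync=False, block_size=BLOCK_SIZE):
    """Append `count` numbered blocks, optionally opening with O_DSYNC."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    if dsync:
        # With O_DSYNC each write() should return only once the data is on
        # stable storage. Falls back to 0 (no sync) where the flag is
        # missing, which is an assumption made for portability.
        flags |= getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o644)
    try:
        for seq in range(start, start + count):
            os.write(fd, make_block(seq, block_size))
    finally:
        os.close(fd)


def count_valid_blocks(path, block_size=BLOCK_SIZE):
    """Count the leading blocks that are complete and in sequence order."""
    valid = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if len(block) < block_size or block != make_block(valid, block_size):
                break
            valid += 1
    return valid
```

The idea would be to run append_blocks() on the primary, take the snapshot and drop SNDR into logging mode partway through, then run count_valid_blocks() against the file and its snapshot copy on the secondary and compare with the number of blocks the writer had reported as written.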
jd> At some random point in this process, I took a ZFS
jd> snapshot, immediately dropped SNDR into logging mode. When
jd> importing the ZFS storage pool on the SNDR remote host, I
jd> could see the ZFS snapshot just taken, but neither the
jd> snapshot version of the file, or the file itself contained all
jd> of the data previously written to it.

that's a really good test! so SNDR is good for testing, too, it seems. I'm glad you've done it. If we'd just listened to the several people speculating, ``just take a snapshot, it ought to imply a lockfs,'' we could be having nasty surprises months from now.

I'm also not that upset about the behavior, if it lets one take and destroy snapshots really fast. I could see the opposing argument that all snapshots should commit to disk atomically, though, because you are saying the snapshot _exists_ but doesn't have in it what it should. maybe in a more ideal world a snapshot should either disappear after reboot, or else, if it exists, contain exactly what it logically should.

jd> I then retested, but opened the file with O_DSYNC, and when
jd> following the same test steps above, both the snapshot version
jd> of the file, and the file itself contained all of the data
jd> previously written to it.

AIUI, in this test some of the file data may be written to the ZIL. In the former test, the ZIL would not be used at all. but the ZIL is just a separate area on the disk that's faster to write to, since with O_DSYNC or fsync() you would like to return to the application in a hurry. ZFS scribbles down the change as quickly as possible in the ZIL on the disk, then rewrites it in a more organized way later.

-- 
READ CAREFULLY. By reading this fortune, you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss