>>>>> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
>>>>> "kb" == Keith Bierman <[EMAIL PROTECTED]> writes:

    re> the disk lies about the persistence of the data.  ZFS knows
    re> disks lie, so it sends sync commands when necessary

(1) I don't think ``lie'' is a correct characterization given that the sync commands exist, but point taken about the other area of risk. I suspect there may be similar problems in ZFS's write path when one is using iSCSI targets. Or it's just common for iSCSI target implementations to suck (lie). Or maybe it's something else I'm seeing.

(2) I thought the recommendation that one give ZFS whole disks and let it put EFI labels on them came from the Solaris behavior that, only in a whole-disk-for-zfs configuration, will the Solaris drivers refrain from explicitly disabling the write cache in these inexpensive disks. So the cache shouldn't be a problem for UFS, but it might be for non-Solaris operating systems (even for ZFS on platforms where ZFS is ported but the SYNCHRONIZE CACHE commands don't make it through some mid-layer or CAM or driver).

    kb> Aye, but isn't that the real rub ... when the power fails
    kb> after the write but *before* the fsync has occurred...

No, there is no rub here, I was only speaking precisely. A proper DBMS (anything except MySQL) is also designed to understand that power failures happen. It does its writes in a deliberate order such that it won't return success to the application calling it until it gets the return from fsync(), and also so that the system will never recover such that a transaction is half-completed.

    re> the ZFS on-disk format is such that you can recover to a point
    re> in time where the file system is consistent.

Do you mean that, ``after a power outage ZFS will always recover the filesystem to a state that it passed through in the moments leading up to the outage,'' while UFS, which logs only metadata, typically recovers to some state the filesystem never passed through---but it never loses fsync()ed data nor data that wasn't written ``recently'' before the crash?

For casual filesystem use, or for applications that weren't designed with cord-pulling in mind, ZFS's guarantee is larger and more comforting. But for databases, I don't think the distinction matters because they call fsync() at deliberate moments and do their own copy-on-write logging above the filesystem, so they provide the same consistency guarantees whether operating on UFS or ZFS. It would be fine to feed a database the type of hacked non-CoW zvol that's used for swap, if fsync could be made to work there, which maybe it can't.
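
To make the ordering I'm describing concrete, here is a minimal sketch in Python of an append-only log that reports success only after fsync() returns, and that on recovery drops a half-written record instead of replaying it. The names commit() and recover() and the length+CRC record layout are made up for illustration; no particular DBMS works exactly this way. It also assumes the whole stack below fsync() really flushes the drive's cache, which is exactly the assumption in question earlier in this message.

```python
import os
import struct
import zlib

HEADER = struct.Struct("<II")  # hypothetical layout: payload length, CRC32


def commit(log_path, payload):
    """Append one record; return only after fsync() has returned."""
    record = HEADER.pack(len(payload), zlib.crc32(payload)) + payload
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(record)
        while view:                      # os.write may write only part
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                     # success is reported only past here
    finally:
        os.close(fd)


def recover(log_path):
    """Return payloads of complete records; stop at the first torn one."""
    try:
        with open(log_path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return []
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(data):
            break                        # record cut short by the crash
        payload = data[start:end]
        if zlib.crc32(payload) != crc:
            break                        # record damaged, treat as unwritten
        records.append(payload)
        offset = end
    return records
```

A caller would treat a transaction as durable only once commit() has returned; anything that was in flight when the cord was pulled is either fully present or skipped by recover(). A real implementation would also need to fsync() the containing directory after creating the log file, which this sketch leaves out.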
_______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss