>Questions to answer would be:
>
>Is a ZIL log device used only by sync() and fsync() system calls? Is it
>ever used to accelerate async writes?

There are quite a few "sync" writes, specifically when you mix in the NFS
server.

>Suppose there is an application which sometimes does sync writes, and
>sometimes async writes. In fact, to make it easier, suppose two processes
>open two files, one of which always writes asynchronously, and one of which
>always writes synchronously. Suppose the ZIL is disabled. Is it possible
>for writes to be committed to disk out-of-order? Meaning, can a large block
>async write be put into a TXG and committed to disk before a small sync
>write to a different file is committed to disk, even though the small sync
>write was issued by the application before the large async write? Remember,
>the point is: ZIL is disabled. Question is whether the async could
>possibly be committed to disk before the sync.

From what I quoted from the other discussion, it seems that later writes
cannot be committed in an earlier TXG than your sync write or other earlier
writes.

>I make the assumption that an uberblock is the term for a TXG after it is
>committed to disk. Correct?

The "uberblock" is the "root of all the data". All the data in a ZFS pool
is referenced by it; once the txg is in stable storage, the uberblock is
updated.

>At boot time, or "zpool import" time, what is taken to be "the current
>filesystem?" The latest uberblock? Something else?

The current "zpool" and the filesystems as referenced by the last
uberblock.

>My understanding is that enabling a dedicated ZIL device guarantees sync()
>and fsync() system calls block until the write has been committed to
>nonvolatile storage, and attempts to accelerate by using a physical device
>which is faster or more idle than the main storage pool. My understanding
>is that this provides two implicit guarantees: (1) sync writes are always
>guaranteed to be committed to disk in order, relevant to other sync writes.
>(2) In the event of OS halting or ungraceful shutdown, sync writes committed
>to disk are guaranteed to be equal or greater than the async writes that
>were taking place at the same time. That is, if two processes both complete
>a write operation at the same time, one in sync mode and the other in async
>mode, then it is guaranteed the data on disk will never have the async data
>committed before the sync data.

sync() is actually *async*, and returning from sync() says nothing about
stable storage.

When fsync() returns, it signals that all the data is in stable storage.
The exceptions are if you disable the ZIL or, apparently, on Linux when the
write caches for your disks are enabled (the default for PC drives). ZFS
doesn't care about the write cache; it makes sure it is flushed.

(There's fsync() and open(..., O_DSYNC|O_SYNC).)

>Based on this understanding, if you disable ZIL, then there is no guarantee
>about order of writes being committed to disk. Neither of the above
>guarantees is valid anymore. Sync writes may be completed out of order.
>Async writes that supposedly happened after sync writes may be committed to
>disk before the sync writes.
>
>Somebody, (Casper?) said it before, and now I'm starting to realize ... This
>is also true of the snapshots. If you disable your ZIL, then there is no
>guarantee your snapshots are consistent either. Rolling back doesn't
>necessarily gain you anything.
>
>The only way to guarantee consistency in the snapshot is to always
>(regardless of ZIL enabled/disabled) give priority for sync writes to get
>into the TXG before async writes.
>
>If the OS does give priority for sync writes going into TXG's before async
>writes (even with ZIL disabled), then after spontaneous ungraceful reboot,
>the latest uberblock is guaranteed to be consistent.

I believe that the writes are still ordered, so the consistency you want is
actually delivered even without the ZIL enabled.

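To make the fsync() / O_DSYNC distinction above concrete, here is a minimal
sketch in Python, whose os module wraps the same system calls. It assumes a
POSIX system where os.O_DSYNC is defined; the function names and file names
are made up for illustration and nothing here is specific to ZFS.

```python
import os
import tempfile


def write_with_fsync(path, data):
    """Ordinary (async) write followed by an explicit fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)  # may only have reached memory at this point
        os.fsync(fd)        # blocks until the OS reports the data as stable
    finally:
        os.close(fd)


def write_with_o_dsync(path, data):
    """Open with O_DSYNC so that every write() is itself synchronous."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DSYNC
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)  # returns only after the data is reported stable
    finally:
        os.close(fd)


def demo():
    """Write the same payload both ways and read it back."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "fsync.dat")       # hypothetical file names
        b = os.path.join(d, "dsync.dat")
        write_with_fsync(a, b"sync me\n")
        write_with_o_dsync(b, b"sync me\n")
        # sync() asks for everything to be flushed; as noted above, its
        # return is not a promise that the data is on stable storage.
        os.sync()
        with open(a, "rb") as fa, open(b, "rb") as fb:
            return fa.read(), fb.read()
```

Both paths end up with the same file contents; the difference is only in
when the call is allowed to return. Whether "stable" really means stable
still depends on the points above (ZIL disabled, disk write caches).
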
Casper