On Apr 19, 2010, at 7:11 PM, Bob Friesenhahn wrote:

> On Mon, 19 Apr 2010, Edward Ned Harvey wrote:
>> Improbability assessment aside, suppose you use something like the DDRDrive
>> X1 ... Which might be more like 4G instead of 32G ... Is it even physically
>> possible to write 4G to any device in less than 10 seconds? Remember, to
>> achieve worst case, highest demand on ZIL log device, these would all have
>> to be <32kbyte writes (default configuration), because larger writes will go
>> directly to primary storage, with only the intent landing on the ZIL.
>
> Note that ZFS always writes data in order so I believe that the statement
> "larger writes will go directly to primary storage" really should be "larger
> writes will go directly to the ZIL implemented in primary storage (which
> always exists)". Otherwise, ZFS would need to write a new TXG whenever a new
> "large" block of data appeared (which may be puny as far as the underlying
> store is concerned) in order to assure proper ordering. This would result in
> a very high TXG issue rate. Pool fragmentation would be increased.
>
> I am sure that someone will correct me if this is wrong.
Actually, when (you are not using a separate log device and the block size
is > 32kB) or (you are using a separate log and logbias=throughput), the
data is written once to the main pool and a reference record is written to
the ZIL. When the txg commits, the reference record is discarded and the
committed block pointer is correct. Upon rollback, all you need is the real
data and the reference record from the ZIL to reconstruct the write.

 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
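
For readers who prefer to see the rule as code, here is a minimal Python sketch of the condition described in the reply above. It is only a model of that one sentence, not actual ZFS code: the names `log_sync_write`, `LogRecord`, and `IMMEDIATE_WRITE_LIMIT` are made up for illustration, and the 32 kB threshold is simply the default figure quoted in the thread.

```python
from dataclasses import dataclass

# Threshold quoted in the thread ("<32kbyte writes (default configuration)").
# Hypothetical constant name; the real tunable is not modeled here.
IMMEDIATE_WRITE_LIMIT = 32 * 1024


@dataclass
class LogRecord:
    kind: str           # "data": payload copied into the log; "reference": pointer only
    payload_bytes: int  # bytes of user data carried inside the log record itself


def log_sync_write(size: int, has_slog: bool, logbias: str = "latency") -> LogRecord:
    """Model of the rule stated in the message above (illustrative only)."""
    indirect = (not has_slog and size > IMMEDIATE_WRITE_LIMIT) or (
        has_slog and logbias == "throughput"
    )
    if indirect:
        # Data is written once to the main pool; the ZIL holds a reference record.
        return LogRecord("reference", 0)
    # Otherwise the data itself is copied into the log record.
    return LogRecord("data", size)


if __name__ == "__main__":
    for size, slog, bias in [
        (4 * 1024, False, "latency"),
        (64 * 1024, False, "latency"),
        (4 * 1024, True, "throughput"),
        (64 * 1024, True, "latency"),
    ]:
        print(size, slog, bias, log_sync_write(size, slog, bias))
```

Under this model, only the "data" case places the full write payload on the log device, which is the case Edward's worst-case estimate is concerned with.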