On Jul 29, 2012, at 1:53 PM, Jim Klimov wrote:

> 2012-07-30 0:40, opensolarisisdeadlongliveopensolaris wrote:
>>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>>> boun...@opensolaris.org] On Behalf Of Jim Klimov
>>>
>>> For several times now I've seen statements on this list implying
>>> that a dedicated ZIL/SLOG device catching sync writes for the log,
>>> also allows for more streamlined writes to the pool during normal
>>> healthy TXG syncs, than is the case with the default ZIL located
>>> within the pool.
>>
>> It might just be more clear, if it's stated differently:
>>
>> At any given time, your pool is in one of four states: idle, reading,
>> writing, or idle with writes queued but not currently being written. Now a
>> sync write operation takes place. If you have a dedicated log, it goes
>> directly to the log, and it doesn't interfere with any of the other
>> operations that might be occurring right now. You don't have to interrupt
>> your current activity, simply, your sync write goes to a dedicated device
>> that's guaranteed to be idle in relation to all that other stuff. Then the
>> sync write becomes async, and gets coalesced into the pending TXG.
>>
>> If you don't have a dedicated log, then the sync write jumps the write
>> queue, and becomes next in line. It waits for the present read or write
>> operation to complete, and then the sync write hits the disk, and flushes
>> the disk buffer. This means the sync write suffered a penalty waiting for
>> the main pool disks to be interruptible. Without slog, you're causing delay
>> to your sync writes, and you're causing delay before the next read or write
>> operation can begin... But that's it. Without slog, your operations are
>> serial, whereas, with slog your sync write can occur in parallel to your
>> other operations.
>>
>> There's no extra fragmentation, with or without slog. Because in either
>> case, the sync write hits some dedicated and recyclable disk blocks, and
>> then it becomes async and coalesced with all the other async writes. The
>> layout and/or fragmentation characteristics of the permanent TXG to be
>> written to the pool is exactly the same either way.
>
> Thanks... but doesn't your description imply that the sync writes
> would always be written twice? It should be with dedicated SLOG, but
> even with one, I think, small writes hit the SLOG and large ones
> go straight to the pool devices (and smaller blocks catch up from
> the TXG queue upon TXG flush). However, without a dedicated SLOG,
> I thought that the writes into the ZIL happen once on the main
> pool devices, and then are referenced from the live block pointer
> tree without being rewritten elsewhere (and for the next TXG some
> other location may be used for the ZIL). Maybe I am wrong, because
> it would also make sense for small writes to hit the disk twice
> indeed, and the same pool location(s) being reused for the ZIL.

You are both right and wrong, at the same time. It depends on the data.
Without a slog, writes that are larger than zfs_immediate_write_sz are
written to the permanent place in the pool. Please review (again) my
slides on the subject:
http://www.slideshare.net/relling/zfs-tutorial-lisa-2011 (slide 78).

For those who prefer to be lectured, another opportunity will arise in
December 2012 in San Diego at the LISA'12 conference. I am revamping much
of the material from 2011, to catch up with all of the cool new things
that have arrived or are due this year.

 -- richard

--
ZFS Performance and Training
richard.ell...@richardelling.com
+1-760-896-4422
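
For readers who prefer code, the size rule in the reply above can be sketched roughly as follows. This is only an illustration in Python, not ZFS source: the function `sync_write_path`, the `SyncWritePath` enum, and the 32 KiB threshold value are assumptions made for the example. The only part taken from the thread is that, without a slog, writes larger than zfs_immediate_write_sz go to their permanent place in the pool. The with-slog branch simply follows the description quoted earlier in the thread and ignores any other tunables.

```python
from enum import Enum

# Assumed, illustrative threshold in bytes. The real tunable is
# zfs_immediate_write_sz; check its actual value on your own system.
ASSUMED_IMMEDIATE_WRITE_SZ = 32 * 1024


class SyncWritePath(Enum):
    """Hypothetical labels for where a sync write first lands."""
    SLOG = "logged on the dedicated log device, written to the pool with the TXG"
    IN_POOL_ZIL = "logged in ZIL blocks inside the pool, written again with the TXG"
    POOL_DIRECT = "written once to its permanent place in the pool"


def sync_write_path(size_bytes, has_slog, threshold=ASSUMED_IMMEDIATE_WRITE_SZ):
    """Return the assumed first destination of a sync write.

    Simplification: with a slog, every sync write is sent to the log
    device, as described earlier in the thread.
    """
    if has_slog:
        return SyncWritePath.SLOG
    # "Larger than" follows the wording of the reply; the exact
    # comparison used by the real code is not confirmed here.
    if size_bytes > threshold:
        return SyncWritePath.POOL_DIRECT
    return SyncWritePath.IN_POOL_ZIL


if __name__ == "__main__":
    for size in (4 * 1024, 128 * 1024):
        for slog in (False, True):
            path = sync_write_path(size, has_slog=slog)
            print(f"{size:>7} bytes, slog={slog}: {path.value}")
```

Under these assumptions, only the small no-slog write lands on the pool disks twice (once in the ZIL, once with the TXG), which is the case Jim suspected at the end of his question.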