On 06/14/10 19:35, Erik Trimble wrote:
On 6/14/2010 12:10 PM, Neil Perrin wrote:
On 06/14/10 12:29, Bob Friesenhahn wrote:
On Mon, 14 Jun 2010, Roy Sigurd Karlsbakk wrote:
It is good to keep in mind that only small writes go to the dedicated
slog. Large writes go to main store. A succession of that many small
writes (to fill RAM/2) is highly unlikely. Also, the ZIL is not
read back unless the system is improperly shut down.
I thought all sync writes, meaning everything NFS and iSCSI, went
into the slog - IIRC the docs say so.
Check a month or two back in the archives for a post by Matt Ahrens.
It seems that larger writes (>32k?) are written directly to main
store. This is probably a change from the original zfs design.
Bob
If there's a slog then the data, regardless of size, gets written to
the slog.
If there's no slog and the data size is greater than
zfs_immediate_write_sz/zvol_immediate_write_sz (both default to 32K),
then the data is written as a block into the pool and the block
pointer is written into the log record. This is the WR_INDIRECT write type.
So Matt and Roy are both correct.
But wait, there's more complexity:
If logbias=throughput is set we always use WR_INDIRECT.
If we just wrote more than 1MB for a single zil commit and there's
more than 2MB waiting, then we start using the main pool.
Clear as mud? This is likely to change again...
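
To make the rules above a bit less muddy, here is a rough Python sketch of
them as a decision function. It is only a model of what is described in this
message, not the actual ZIL code. The function name, its parameters, the
"in-log" label and the order in which the rules are checked are all
assumptions made for illustration.

```python
# Sketch of the routing rules described above; not the actual ZFS code.
KIB = 1024
MIB = 1024 * KIB

# 32K default quoted above for zfs_immediate_write_sz / zvol_immediate_write_sz.
IMMEDIATE_WRITE_SZ = 32 * KIB


def route_sync_write(size, has_slog, logbias="latency",
                     just_committed=0, waiting=0):
    """Return (write_type, data_location) for one synchronous write.

    "in-log" is a hypothetical label for data copied into the log record;
    only WR_INDIRECT is named in the thread. Rule order is an assumption.
    """
    # logbias=throughput always forces WR_INDIRECT.
    if logbias == "throughput":
        return ("WR_INDIRECT", "pool")

    if not has_slog:
        # Without a slog, large writes go to the pool as a block and only
        # the block pointer is written into the log record.
        if size > IMMEDIATE_WRITE_SZ:
            return ("WR_INDIRECT", "pool")
        return ("in-log", "pool")

    # With a slog, size does not matter, but a heavy burst (more than 1MB
    # just written for one commit and more than 2MB waiting) spills over.
    if just_committed > 1 * MIB and waiting > 2 * MIB:
        return ("in-log", "pool")
    return ("in-log", "slog")
```

With these assumptions, a 64K write on a pool with a slog lands in the slog,
while the same write on a pool without one becomes WR_INDIRECT.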
Neil.
How do I monitor the amount of live (i.e. non-committed) data in the
slog? I'd like to spend some time with my setup, seeing exactly how
much I tend to use.
I think monitoring the capacity when running "zpool iostat -v <pool> 1"
should be fairly accurate.
A simple D script can be written to determine how often the ZIL (code)
fails to get a slog block and has to resort to allocation in the main pool.
One recent change reduced the amount of data written and possibly the
slog block fragmentation.
This is zpool version 23: "Slim ZIL". So be sure to experiment with that.
I'd suspect that very few use cases call for more than a couple (2-4)
GB of slog...
I agree this is typically true. Of course it depends on your workload.
The amount of slog data will reflect the uncommitted synchronous txg
data, and the size of each txg will depend on memory size.
This area is also undergoing tuning.
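
For a back-of-the-envelope number, something like the following Python
sketch may help. It assumes the live slog data is roughly the sync write
rate multiplied by how long data stays uncommitted, and it caps the result
at RAM/2, the figure mentioned earlier in this thread. Both the formula and
the parameter names are assumptions for illustration, not how ZFS computes
anything, so measured numbers should take precedence.

```python
# Rough slog sizing estimate under stated assumptions; not a ZFS formula.
GIB = 1024 ** 3


def estimate_slog_bytes(sync_write_rate, seconds_uncommitted, ram_bytes):
    """Estimate live slog data in bytes.

    sync_write_rate     -- sustained synchronous write rate, bytes/second
    seconds_uncommitted -- hypothetical time data waits before its txg commits
    ram_bytes           -- installed memory; RAM/2 is used as an assumed ceiling
    """
    demand = int(sync_write_rate * seconds_uncommitted)
    return min(demand, ram_bytes // 2)
```

Plugging in your own measured write rate should show quickly whether a
couple of GB is enough for a given workload.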
I'm trying to get hard numbers as I'm working on building a
DRAM/battery/flash slog device in one of my friend's electronics
prototyping shops. It would be really nice if I could solve 99% of
the need with 1 or 2 2GB SODIMMs and the chips from a cheap 4GB USB
thumb drive...
Sounds like fun. Good luck.
Neil.