On 06/14/10 19:35, Erik Trimble wrote:
On 6/14/2010 12:10 PM, Neil Perrin wrote:
On 06/14/10 12:29, Bob Friesenhahn wrote:
On Mon, 14 Jun 2010, Roy Sigurd Karlsbakk wrote:

It is good to keep in mind that only small writes go to the dedicated
slog. Large writes go to the main store. A succession of that many small
writes (enough to fill RAM/2) is highly unlikely. Also, the ZIL is not
read back unless the system is improperly shut down.

I thought all sync writes, meaning everything NFS and iSCSI, went into the slog - IIRC the docs say so.

Check a month or two back in the archives for a post by Matt Ahrens. It seems that larger writes (>32k?) are written directly to main store. This is probably a change from the original zfs design.

Bob

If there's a slog then the data, regardless of size, gets written to the slog.

If there's no slog and if the data size is greater than zfs_immediate_write_sz/zvol_immediate_write_sz (both default to 32K) then the data is written as a block into the pool and the block pointer
written into the log record. This is the WR_INDIRECT write type.

So Matt and Roy are both correct.

But wait, there's more complexity!:

If logbias=throughput is set we always use WR_INDIRECT.

If we just wrote more than 1MB for a single zil commit and there's more than 2MB waiting
then we start using the main pool.

Clear as mud?  This is likely to change again...
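
The rules above can be sketched as a small decision model. This is only an illustration of what the post describes, not the actual ZFS code: the function names, the "IN_LOG" label for data carried inside the log record, and the "latency" default for logbias in this sketch are assumptions made for the example, and the thresholds are the ones quoted above.

```python
# Simplified model of the rules described in this post; NOT the real ZFS code.
KIB = 1024
MIB = 1024 * KIB

# Default of zfs_immediate_write_sz / zvol_immediate_write_sz per the post.
IMMEDIATE_WRITE_SZ = 32 * KIB
# "just wrote more than 1MB for a single zil commit"
COMMIT_BURST_LIMIT = 1 * MIB
# "there's more than 2MB waiting"
WAITING_LIMIT = 2 * MIB


def record_type(size, has_slog, logbias="latency"):
    """Return "WR_INDIRECT" or the hypothetical label "IN_LOG".

    WR_INDIRECT: data is written as a block into the pool and only the
    block pointer goes into the log record.
    IN_LOG (name invented for this sketch): data travels in the log record.
    """
    if logbias == "throughput":
        return "WR_INDIRECT"  # always indirect with logbias=throughput
    if not has_slog and size > IMMEDIATE_WRITE_SZ:
        return "WR_INDIRECT"  # large write and no slog
    return "IN_LOG"           # with a slog, any size goes to the log


def log_block_target(has_slog, just_committed, waiting):
    """Return where the next log block is allocated: "slog" or "main pool"."""
    if not has_slog:
        return "main pool"
    if just_committed > COMMIT_BURST_LIMIT and waiting > WAITING_LIMIT:
        return "main pool"    # heavy burst: start using the main pool
    return "slog"
```

For example, record_type(128 * KIB, has_slog=True) stays in the log, while the same write with has_slog=False becomes WR_INDIRECT.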

Neil.


How do I monitor the amount of live (i.e. non-committed) data in the slog? I'd like to spend some time with my setup, seeing exactly how much I tend to use.

I think monitoring the capacity when running "zpool iostat -v <pool> 1" should be fairly accurate. A simple DTrace (D) script can be written to determine how often the ZIL (code) fails to get a slog block and
has to resort to allocating from the main pool.

One recent change reduced the amount of data written and possibly the slog block fragmentation.
This is zpool version 23: "Slim ZIL". So be sure to experiment with that.



I'd suspect that very few use cases call for more than a couple (2-4) GB of slog...

I agree this is typically true. Of course it depends on your workload. The amount of slog data will reflect the uncommitted synchronous txg data, and the size of each txg will depend on memory size.
This area is also undergoing tuning.

I'm trying to get hard numbers as I'm working on building a DRAM/battery/flash slog device in one of my friend's electronics prototyping shops. It would be really nice if I could solve 99% of the need with 1 or 2 2GB SODIMMs and the chips from a cheap 4GB USB thumb drive...
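
One way to turn such measurements into a hard number: collect samples of the slog's allocated capacity (for instance, noted by hand from the monitoring output) and check whether a given percentile of them fits a candidate device size. The sketch below is a minimal example under those assumptions; the function names are hypothetical and the sample values in the usage note are made up purely for illustration.

```python
GIB = 1024 ** 3


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1..100."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # rank = ceil(pct * n / 100), done in integers to avoid float rounding
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def fits(samples_bytes, capacity_bytes, pct=99):
    """True if pct percent of the observed slog usage fits in the device."""
    return percentile(samples_bytes, pct) <= capacity_bytes
```

With 99 illustrative samples of 1 GiB and a single 3 GiB spike, fits(samples, 2 * GIB) is true at the 99th percentile but false at pct=100, which is the kind of distinction that decides between one and two 2GB modules.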


Sounds like fun. Good luck.

Neil.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
