Re: [zfs-discuss] Re: ZFS RAID10

Neil Perrin Thu, 10 Aug 2006 11:24:20 -0700

Robert Milkowski wrote:

Hello Neil,


Thursday, August 10, 2006, 7:02:58 PM, you wrote:

NP> Robert Milkowski wrote:

Hello Matthew,

Thursday, August 10, 2006, 6:55:41 PM, you wrote:

MA> On Thu, Aug 10, 2006 at 06:50:45PM +0200, Robert Milkowski wrote:

btw: wouldn't it be possible to write the block only once (for synchronous
IO) and then just point to that block instead of copying it again?



MA> We actually do exactly that for larger (>32k) blocks.

Why such a limit (32k)?



NP> By experimentation that was the cutoff where it was found to be
NP> more efficient. It was recently reduced from 64K with a more
NP> efficient dmu_sync() implementation.
NP> Feel free to experiment with the dynamically changeable tunable:

NP> ssize_t zfs_immediate_write_sz = 32768;


I've just checked using dtrace on one of our production NFS servers that
90% of the time arg5 in zfs_log_write() is exactly 32768 and the rest
of the time it is always smaller.

With the default value of 32768 it means that for NFS servers it
will always copy the data, as I've just checked in the code and there is:

    if (len > zfs_immediate_write_sz) {

So in the NFS server case the condition above will never be true (with
default NFS server settings).

Wouldn't an NFS server benefit from lowering zfs_immediate_write_sz to
32767?
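The boundary case described above can be sketched in a few lines of C. This is only an illustration of the quoted comparison, not the ZFS source: write_is_indirect() and nfs_32k_example() are hypothetical names invented for this sketch, and it assumes, as the thread suggests, that a true result means the write is logged by reference and a false result means the data is copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>	/* ssize_t */

/* Default threshold quoted in this thread. */
static ssize_t zfs_immediate_write_sz = 32768;

/*
 * Hypothetical helper (not a ZFS function): mirrors the quoted test
 * `len > zfs_immediate_write_sz`. Returns true when a synchronous write
 * of `len` bytes would be logged by reference instead of being copied.
 */
static bool
write_is_indirect(ssize_t len)
{
	return (len > zfs_immediate_write_sz);
}

/* Walks through the 32K NFS case discussed above. */
void
nfs_32k_example(void)
{
	/* 32768 > 32768 is false: a full-size NFS write is copied. */
	assert(!write_is_indirect(32768));

	/* Lowering the tunable by one byte flips the result. */
	zfs_immediate_write_sz = 32767;
	assert(write_is_indirect(32768));

	zfs_immediate_write_sz = 32768;	/* restore the default */
}
```

Because the comparison is strict, a write of exactly 32768 bytes falls on the copy side of the default threshold, which is why a one-byte change to the tunable is enough to move 32K NFS writes to the other path. Whether that is actually faster is a separate question that only a benchmark can answer.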


Yes, NFS (with the default 32K max write size) would benefit if WR_INDIRECT
writes (using dmu_sync()) were faster, but that wasn't the case when
last benchmarked. I'm sure there are some cases currently where
tuning zfs_immediate_write_sz will help certain workloads.
Anyway, I think this whole area deserves more thought.
If you experiment with tuning zfs_immediate_write_sz, then please share
any performance data for your application/benchmark(s).

Thanks: Neil
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
