Thanks for the suggestions.  I think it would also depend on whether
the NFS server has tried to write asynchronously to the pool in the
meantime, which I am unsure how to test, other than making the txgs
extremely frequent and watching the load on the log devices.  As for
the integer division giving misleading zeros, one possible solution is
to add (delay - 1) to the count before dividing by delay, so that any
nonzero count shows at least 1 (or you could get fancy and use
fixed-point numbers).
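
For instance, the fix in the script might look roughly like this
(untested, and "commits"/"delay" are just placeholders for whatever
the script actually calls its commit count and sample interval):

    /* ceiling division: any nonzero commit count reports at least 1 */
    rate = (commits + (delay - 1)) / delay;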

As for very frequent txgs, I imagine this could cause more
fragmentation (more metadata written and discarded more frequently).
Is there a way to estimate or test its impact?  Depending on how the
metadata blocks are allocated, I suppose it could write them to the
blocks vacated by the previous txg's metadata, and have almost no
impact until a snapshot is taken; is it smart enough to do this?

Tim

On Fri, Jun 15, 2012 at 10:56 AM, Richard Elling
<richard.ell...@gmail.com> wrote:
> [Phil beat me to it]
> Yes, the 0s are a result of integer division in DTrace/kernel.
>
> On Jun 14, 2012, at 9:20 PM, Timothy Coalson wrote:
>
>> Indeed they are there, shown with a 1-second interval.  So, it is the
>> client's fault after all.  I'll have to see whether it is somehow
>> possible to get the server to write cached data sooner (and hopefully
>> asynchronously), and the client to issue commits less often.  Luckily I
>> can live with the current behavior (and the SSDs shouldn't give out
>> any time soon even being used like this), if it isn't possible to
>> change it.
>
> If this is the proposed workload, then it is possible to tune the DMU to
> manage commits more efficiently. In an ideal world, it does this
> automatically, but the algorithms are based on a bandwidth calculation,
> and those are not suitable for HDD capacity planning. The efficiency goal
> would be to do less work, more often, and there are two tunables that
> can apply:
>
> 1. The txg_timeout controls the default maximum transaction group commit
> interval and is set to 5 seconds on modern ZFS implementations.
>
> 2. The zfs_write_limit is a size limit for txg commit. The idea is that a
> txg will be committed when the size reaches this limit, rather than
> waiting for the txg_timeout. For streaming writes, this can work better
> than tuning the txg_timeout.
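>
> For example, on illumos/Solaris these can be set in /etc/system
> (illustrative values only; note that zfs_write_limit is exposed there
> as zfs_write_limit_override, and names can vary by release):
>
>     set zfs:zfs_txg_timeout = 1
>     set zfs:zfs_write_limit_override = 0x8000000
>
> They can also be changed on a live system with mdb -kw, e.g.
>
>     echo zfs_txg_timeout/W0t1 | mdb -kw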
>
>  -- richard
>
>>
>> Thanks for all the help,
>> Tim
>>
>> On Thu, Jun 14, 2012 at 10:30 PM, Phil Harman <phil.har...@gmail.com> wrote:
>>> On 14 Jun 2012, at 23:15, Timothy Coalson <tsc...@mst.edu> wrote:
>>>
>>>>> The client is using async writes, which include commits. Sync writes do
>>>>> not need commits.
>>>>
>>>> Are you saying NFS commit operations sent by the client aren't always
>>>> reported by that script?
>>>
>>> They are not reported in your case because the commit rate is less than one 
>>> per second.
>>>
>>> DTrace is an amazing tool, but it does dictate certain coding compromises, 
>>> particularly when it comes to output scaling, grouping, sorting and 
>>> formatting.
>>>
>>> In this script the commit rate is calculated using integer division. In 
>>> your case the sample interval is 5 seconds, so up to 4 commits in an 
>>> interval (4 / 5 == 0 in integer arithmetic) will be reported as a big 
>>> fat zero.
>>>
>>> If you use a sample interval of 1 second you should see occasional commits. 
>>> We know they are there because we see a non-zero commit time.
>>>
>>>
>
> --
>
> ZFS and performance consulting
> http://www.RichardElling.com