Re: [zfs-discuss] write cache and cache flush

Jim Mauro Fri, 30 Jan 2009 07:29:39 -0800

You have SSDs for the ZIL (logzilla) enabled, and ZIL I/O
is what is hurting your performance... Hmmm...


I'll ask the stupid question (just to get it out of the way) - is
it possible that the logzilla is undersized?

Did you gather data using Richard Elling's zilstat (included below)?

Thanks,
/jim


#! /usr/bin/ksh -p
# CDDL HEADER START
#
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
#
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# or http://www.opensolaris.org/os/licensing.
# See the License for the specific language governing permissions
# and limitations under the License.
#
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
#
# CDDL HEADER END
# Portions Copyright 2009 Sun Microsystems, Inc.
#
#File:   zilstat.d
#Author: richard.ell...@sun.com
#
# This dtrace program will help identify the ZIL activity by sampling
# writes sent to the ZIL.
# output:
#     [TIME]
#     BYTES - total bytes written to ZIL over the interval
#     BYTES/S - bytes/s written to ZIL over the interval
#     MAX-BYTES/S - maximum rate during any 1-second sample
#
##############################
# --- Process Arguments ---
#

# TODO: clean up args
### default variables
opt_pool=0
opt_time=0
filter=0
pool=
lines=-1
interval=1
count=-1

### process options
while getopts hl:p:t name
do
        case $name in
        l)  lines=$OPTARG ;;
        p)  opt_pool=1; pool=$OPTARG ;;
        t)  opt_time=1 ;;
        h|?)    ME=$(basename $0)
                cat <<-END >&2
                USAGE: $ME [-t] [-l linecount] [-p poolname] [interval [count]]
    -t   # print timestamp
    -l linecount    # print header every linecount lines (default=only once)
    -p poolname      # only look at poolname

    examples:
        $ME # default output, 1 second samples
        $ME 10  # 10 second samples
        $ME 10 6    # print 6 x 10 second samples
        $ME -p rpool    # show ZIL stats for rpool only

    output:
        [TIME]
        BYTES - total bytes written to ZIL over the interval
        BYTES/S - bytes/s written to ZIL over the interval
        MAX-BYTES/S - maximum rate during any 1-second sample
                END
                exit 1
        esac
done

shift $(( $OPTIND - 1 ))

### option logic
if [[ "$1" > 0 ]]; then
        interval=$1; shift
fi
if [[ "$1" > 0 ]]; then
        count=$1; shift
fi
if (( opt_pool )); then
        filter=1
fi

##############################
# --- Main Program, DTrace ---

/usr/sbin/dtrace -n '
#pragma D option quiet
 inline int OPT_time = '$opt_time';
 inline int OPT_pool = '$opt_pool';
 inline int INTERVAL = '$interval';
 inline int LINES = '$lines';
 inline int COUNTER = '$count';
 inline int FILTER = '$filter';
 inline string POOL = "'$pool'";
 dtrace:::BEGIN
 {
        /* starting values */
        counts = COUNTER;
        secs = INTERVAL;
        line = 0;
        last_event[""] = 0;
        nused=0;
        max_per_sec=0;
        nused_per_sec=0;
 }

 /*
  * collect info when zil_lwb_write_start fires
  */
 fbt::zil_lwb_write_start:entry
 /OPT_pool == 0 || POOL == args[0]->zl_dmu_pool->dp_spa->spa_name/
 {
        nused += args[1]->lwb_nused;
        nused_per_sec += args[1]->lwb_nused;
 }

 /*
  * Timer
  */
 profile:::tick-1sec
 {
        secs--;
        nused_per_sec > max_per_sec ? max_per_sec = nused_per_sec : 1;
        nused_per_sec = 0;
 }

 /*
  * Print header
  */
 profile:::tick-1sec
 /line == 0 /
 {
        /* print optional headers */
        OPT_time   ? printf("%-20s ", "TIME")  : 1;

        /* print header */
        printf("%10s %10s %10s\n", "BYTES", "BYTES/S", "MAX-BYTES/S");
        line = LINES;
 }

 /*
  * Print Output
  */
 profile:::tick-1sec
 /secs == 0/
 {
        OPT_time   ? printf("%-20Y ", walltimestamp) : 1;
        printf("%10d %10d %10d\n", nused, nused/INTERVAL, max_per_sec);

        nused = 0;
        nused_per_sec = 0;
        max_per_sec = 0;
        secs = INTERVAL;
        counts--;
        line--;
 }

 /*
  * End of program
  */
 profile:::tick-1sec
 /counts == 0/
 {
        exit(0);
 }
'
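
Once you have some zilstat output captured to a file, a rough way to
compare it against what the log device can sustain is to boil it down
to an average and a peak. The helper below is only a sketch and is not
part of Richard's script: the function name zil_summary is made up for
illustration, and it assumes zilstat was run without -t, so every data
line has exactly three numeric columns (BYTES, BYTES/S, MAX-BYTES/S).

```shell
#!/bin/sh
# Hypothetical helper (not part of zilstat): summarize captured output.
# Assumption: zilstat was run WITHOUT -t, so each data line has three
# numeric columns: BYTES, BYTES/S, MAX-BYTES/S.
zil_summary() {
    awk '
        BEGIN { n = 0; sum = 0; peak = 0 }
        # Data lines start with a number; header lines do not.
        NF == 3 && $1 ~ /^[0-9]+$/ {
            n++
            sum += $2                        # per-interval BYTES/S
            if ($3 + 0 > peak) peak = $3 + 0 # highest 1-second burst seen
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d avg_bytes_per_sec=%.0f peak_bytes_per_sec=%.0f\n",
                n, sum / n, peak
        }
    '
}
```

For example, capture six 10-second samples into a file of your choosing
and then run "zil_summary < that-file". If the peak is close to what the
log device can sustain for synchronous writes, that would point toward
the logzilla being the bottleneck.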



Greg Mason wrote:
>> If there was a latency issue, we would see such a problem with our 
>> existing file server as well, which we do not. We'd also have much 
>> greater problems than just file server performance.
>>
>> So, like I've said, we've ruled out the network as an issue.
>
> I should also add that I've tested these Thors with the ZIL disabled, 
> and they scream! With the cache flush disabled, they also do quite well.
>
> The specific issue I'm trying to solve is the ZIL being slow when 
> using NFS.
>
> I really don't want to have to do something drastic like disabling the 
> ZIL to get the performance I need...
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
