Here's some sample output. Where I write over NFS to ZFS (no iscsi),
I get larger I/O sizes:

 UID   PID D    BLOCK   SIZE       COMM PATHNAME
   1   427 W 22416320   4096       nfsd <none>
   1   427 W 22416328   4096       nfsd <none>
   1   427 W 22416336   4096       nfsd <none>
   1   427 W 22416344   4096       nfsd <none>
   1   427 W 22474784  16384       nfsd <none>
   1   427 W 22474816  16384       nfsd <none>
   1   427 W 22474848  16384       nfsd <none>
   1   427 W 22416352   4096       nfsd <none>
   1   427 W 22416360   4096       nfsd <none>
   1   427 W 22416368   4096       nfsd <none>
   1   427 W 22416376   4096       nfsd <none>
   1   427 W 22416384   4096       nfsd <none>
   1   427 W 22416392   4096       nfsd <none>
   1   427 W 22416400   4096       nfsd <none>
   1   427 W 22416408   4096       nfsd <none>
   1   427 W 22416416   4096       nfsd <none>
   1   427 W 22416424   4096       nfsd <none>
   1   427 W 22416432   4096       nfsd <none>
   1   427 W 22416440   4096       nfsd <none>
   1   427 W 22416448   4096       nfsd <none>
   1   427 W 22416456   4096       nfsd <none>
   1   427 W 22039040   8192       nfsd <none>
   1   427 W 22039056   8192       nfsd <none>
   1   427 W 22039072   8192       nfsd <none>
   1   427 W 22416464   4096       nfsd <none>
   1   427 W 22416472   4096       nfsd <none>
   1   427 W 22416480   4096       nfsd <none>
   1   427 W 22416488   4096       nfsd <none>
   1   427 W 22416496   4096       nfsd <none>
   1   427 W 22416504   4096       nfsd <none>
   1   427 W 22416512   4096       nfsd <none>
   1   427 W 22416520   4096       nfsd <none>
   1   427 W 22416528   4096       nfsd <none>
   1   427 W 22416536   4096       nfsd <none>
   1   427 W 22416544   4096       nfsd <none>
   1   427 W 22416552   4096       nfsd <none>
   1   427 W 22416560   4096       nfsd <none>
   1   427 W 22416568   4096       nfsd <none>
   1   427 W 22416576   4096       nfsd <none>
   1   427 W 22416584   4096       nfsd <none>
   1   427 W 22416592   4096       nfsd <none>
   1   427 W 22416600   4096       nfsd <none>
   1   427 W 22416608   4096       nfsd <none>
   1   427 W 22416616   4096       nfsd <none>
   1   427 W 22416624   4096       nfsd <none>
   1   427 W 22416632   4096       nfsd <none>
   1   427 W 22416640  12288       nfsd <none>
   1   427 W 22416664  12288       nfsd <none>
   1   427 W 22416688  12288       nfsd <none>
   1   427 W 22416712   4096       nfsd <none>
   1   427 W 22416720   4096       nfsd <none>
   1   427 W 22416728   4096       nfsd <none>
   1   427 W 22416736   4096       nfsd <none>
   1   427 W 22416744   4096       nfsd <none>
   1   427 W 22416752   4096       nfsd <none>
   1   427 W 22416760  36864       nfsd <none>
   0     0 W 22416832  53248      sched <none>
   1   427 W 22416936  53248       nfsd <none>
   1   427 W 22417040  53248       nfsd <none>
   1   427 W 22417144   4096       nfsd <none>
   1   427 W 22417152   4096       nfsd <none>
   1   427 W 22417160   4096       nfsd <none>
   1   427 W 22417168   4096       nfsd <none>
   1   427 W 22417176   4096       nfsd <none>
   1   427 W 22417184   4096       nfsd <none>
   1   427 W 22417192  12288       nfsd <none>
   1   427 W 22417216  12288       nfsd <none>
   1   427 W 22417240  12288       nfsd <none>
   1   427 W 22417264   4096       nfsd <none>
   1   427 W 22417272   4096       nfsd <none>
   1   427 W 22417280   4096       nfsd <none>

In the iscsi-backed case, I get:

 UID   PID D     BLOCK   SIZE       COMM PATHNAME
   1   427 W 369124510    512       nfsd <none>
   1   427 W 369124510    512       nfsd <none>
   1   427 W 369124510    512       nfsd <none>
   1   427 W 369124510    512       nfsd <none>
   1   427 W 369124510   1024       nfsd <none>
   1   427 W 369124564   1024       nfsd <none>
   1   427 W 369124564    512       nfsd <none>
   1   427 W 369124564    512       nfsd <none>
   1   427 W 369124564    512       nfsd <none>
   1   427 W 369124564    512       nfsd <none>
   1   427 W 369124564    512       nfsd <none>
   1   427 W 369124564    512       nfsd <none>
   1   427 W 369124564   1024       nfsd <none>
   1   427 W 369124565   1024       nfsd <none>
   1   427 W 369124566    512       nfsd <none>
   1   427 W 369124566    512       nfsd <none>
   1   427 W 369124565    512       nfsd <none>
   1   427 W 369124565    512       nfsd <none>
   1   427 W 369124565    512       nfsd <none>
   1   427 W 369124565    512       nfsd <none>
   1   427 W 369124565   1024       nfsd <none>
   1   427 W 369124566   1024       nfsd <none>
   1   427 W 369124567    512       nfsd <none>
   1   427 W 369124567    512       nfsd <none>
   1   427 W 369124567    512       nfsd <none>
   1   427 W 369124567    512       nfsd <none>
   1   427 W 369124566    512       nfsd <none>
   1   427 W 369124566    512       nfsd <none>
   1   427 W 369124566   1024       nfsd <none>
   1   427 W 369124567   1024       nfsd <none>
   1   427 W 369124568    512       nfsd <none>
   1   427 W 369124568    512       nfsd <none>
   1   427 W 369124568    512       nfsd <none>
   1   427 W 369124568    512       nfsd <none>
   1   427 W 369124568    512       nfsd <none>
   1   427 W 369124568    512       nfsd <none>
   1   427 W 369124567   1024       nfsd <none>
   1   427 W 369124569   1024       nfsd <none>
   1   427 W 369124569    512       nfsd <none>

It looks to me like the bulk of my problem is poor block-size scheduling. Is
this tunable for either ZFS or NFS, and if so, how is it set?
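
For reference, this is roughly what I'm checking on my end first; the dataset
and mount point names below are just placeholders for mine, so treat it as a
sketch rather than a recipe:

    # ZFS side: the record size of the exported dataset
    zfs get recordsize tank/export
    # e.g. lower it toward the observed write sizes to see if the pattern
    # changes (note: recordsize only affects files written after the change)
    zfs set recordsize=16K tank/export

    # NFS side: confirm the negotiated transfer sizes (rsize/wsize) on the client
    nfsstat -m /mnt/export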

On 5/5/06, Lisa Week <[EMAIL PROTECTED]> wrote:
These may help:

http://opensolaris.org/os/community/dtrace/scripts/
    Check out iosnoop.d

http://www.solarisinternals.com/si/dtrace/index.php
    Check out iotrace.d

- Lisa

Joe Little wrote on 05/05/06 18:59:

> Are there known i/o or iscsi dtrace scripts available?
>
> On 5/5/06, Spencer Shepler <[EMAIL PROTECTED]> wrote:
>
>> On Fri, Joe Little wrote:
>> > On 5/5/06, Eric Schrock <[EMAIL PROTECTED]> wrote:
>> > >On Fri, May 05, 2006 at 03:46:08PM -0700, Joe Little wrote:
>> > >> Thanks for the tip. In the local case, I could send to the
>> > >> iSCSI-backed ZFS RAIDZ at even faster rates, with a total elapsed
>> > >> time of 50 seconds (17 seconds better than UFS). However, I didn't
>> > >> even bother finishing the NFS client test, since it was taking a few
>> > >> seconds between multiple 27K files. So it didn't help NFS at all. I'm
>> > >> wondering if there is something on the NFS end that needs changing,
>> > >> no?
>> > >
>> > >Keep in mind that turning off this flag may corrupt on-disk state in
>> > >the event of power loss, etc.  What was the delta in the local case?  17
>> > >seconds better than UFS, but percentage wise how much faster than the
>> > >original?
>> > >
>> >
>> > I believe it was only about 5-10% faster. I don't have the timing
>> > results offhand, just some dtrace latency reports.
>> >
>> > >NFS has the property that it does an enormous amount of synchronous
>> > >activity, which can tickle interesting pathologies.  But it's strange
>> > >that it didn't help NFS that much.
>> >
>> > Should I also mount via async... would this be honored on the Solaris
>> > end? The other option mentioned with similar caveats was nocto. I just
>> > tried with both, and the observed transfer rate was about 1.4k/s. It
>> > was painful deleting the 3G directory via NFS, with about a 100k/s
>> > deletion rate on these 1000 files. Of course, when I went locally the
>> > delete was instantaneous.
>>
>> I wouldn't change any of the options at the client.  The issue
>> is at the server side and none of the other combinations that you
>> originally pointed out have this problem, right?  Mount options at the
>> client will just muddy the waters.
>>
>> We need to understand if/what the NFS/ZFS/iscsi interaction is and why
>> it is so much worse.  As Eric mentioned, there may be some interesting
>> pathologies at play here and we need to understand what they are so
>> they can be addressed.
>>
>> My suggestion is additional dtrace data collection but I don't have
>> a specific suggestion as to how/what to track next.
>> Because of the significant additional latency, I would be looking for
>> big increases in the number of I/Os being generated to the iscsi backend
>> as compared to the local attached case.  I would also look for
>> some type of serialization of I/Os that is occurring with iscsi vs.
>> the local attach.
>>
>> Spencer
>>
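
Along the lines of Spencer's suggestion, this is roughly what I plan to run
next to compare the iscsi-backed and locally attached cases. It's only a
sketch -- the predicate on nfsd is my own guess at what matters here:

    # how many I/Os each device sees while the NFS write test runs
    dtrace -n 'io:::start /execname == "nfsd"/ { @[args[1]->dev_statname] = count(); }'

    # distribution of I/O sizes per device, to see where the 512-byte writes go
    dtrace -n 'io:::start /execname == "nfsd"/ { @[args[1]->dev_statname] = quantize(args[0]->b_bcount); }'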


