Here's some sample output. Where I write over NFS to ZFS (no iSCSI), I get large I/O sizes:
UID PID D BLOCK    SIZE  COMM PATHNAME
1   427 W 22416320 4096  nfsd <none>
1   427 W 22416328 4096  nfsd <none>
1   427 W 22416336 4096  nfsd <none>
1   427 W 22416344 4096  nfsd <none>
1   427 W 22474784 16384 nfsd <none>
1   427 W 22474816 16384 nfsd <none>
1   427 W 22474848 16384 nfsd <none>
1   427 W 22416352 4096  nfsd <none>
1   427 W 22416360 4096  nfsd <none>
1   427 W 22416368 4096  nfsd <none>
1   427 W 22416376 4096  nfsd <none>
1   427 W 22416384 4096  nfsd <none>
1   427 W 22416392 4096  nfsd <none>
1   427 W 22416400 4096  nfsd <none>
1   427 W 22416408 4096  nfsd <none>
1   427 W 22416416 4096  nfsd <none>
1   427 W 22416424 4096  nfsd <none>
1   427 W 22416432 4096  nfsd <none>
1   427 W 22416440 4096  nfsd <none>
1   427 W 22416448 4096  nfsd <none>
1   427 W 22416456 4096  nfsd <none>
1   427 W 22039040 8192  nfsd <none>
1   427 W 22039056 8192  nfsd <none>
1   427 W 22039072 8192  nfsd <none>
1   427 W 22416464 4096  nfsd <none>
1   427 W 22416472 4096  nfsd <none>
1   427 W 22416480 4096  nfsd <none>
1   427 W 22416488 4096  nfsd <none>
1   427 W 22416496 4096  nfsd <none>
1   427 W 22416504 4096  nfsd <none>
1   427 W 22416512 4096  nfsd <none>
1   427 W 22416520 4096  nfsd <none>
1   427 W 22416528 4096  nfsd <none>
1   427 W 22416536 4096  nfsd <none>
1   427 W 22416544 4096  nfsd <none>
1   427 W 22416552 4096  nfsd <none>
1   427 W 22416560 4096  nfsd <none>
1   427 W 22416568 4096  nfsd <none>
1   427 W 22416576 4096  nfsd <none>
1   427 W 22416584 4096  nfsd <none>
1   427 W 22416592 4096  nfsd <none>
1   427 W 22416600 4096  nfsd <none>
1   427 W 22416608 4096  nfsd <none>
1   427 W 22416616 4096  nfsd <none>
1   427 W 22416624 4096  nfsd <none>
1   427 W 22416632 4096  nfsd <none>
1   427 W 22416640 12288 nfsd <none>
1   427 W 22416664 12288 nfsd <none>
1   427 W 22416688 12288 nfsd <none>
1   427 W 22416712 4096  nfsd <none>
1   427 W 22416720 4096  nfsd <none>
1   427 W 22416728 4096  nfsd <none>
1   427 W 22416736 4096  nfsd <none>
1   427 W 22416744 4096  nfsd <none>
1   427 W 22416752 4096  nfsd <none>
1   427 W 22416760 36864 nfsd <none>
0   0   W 22416832 53248 sched <none>
1   427 W 22416936 53248 nfsd <none>
1   427 W 22417040 53248 nfsd <none>
1   427 W 22417144 4096  nfsd <none>
1   427 W 22417152 4096  nfsd <none>
1   427 W 22417160 4096  nfsd <none>
1   427 W 22417168 4096  nfsd <none>
1   427 W 22417176 4096  nfsd <none>
1   427 W 22417184 4096  nfsd <none>
1   427 W 22417192 12288 nfsd <none>
1   427 W 22417216 12288 nfsd <none>
1   427 W 22417240 12288 nfsd <none>
1   427 W 22417264 4096  nfsd <none>
1   427 W 22417272 4096  nfsd <none>
1   427 W 22417280 4096  nfsd <none>

In the iscsi-backed case, I get:

1   427 W 369124510 512  nfsd <none>
1   427 W 369124510 512  nfsd <none>
1   427 W 369124510 512  nfsd <none>
1   427 W 369124510 512  nfsd <none>
1   427 W 369124510 1024 nfsd <none>
1   427 W 369124564 1024 nfsd <none>
1   427 W 369124564 512  nfsd <none>
1   427 W 369124564 512  nfsd <none>
1   427 W 369124564 512  nfsd <none>
1   427 W 369124564 512  nfsd <none>
1   427 W 369124564 512  nfsd <none>
1   427 W 369124564 512  nfsd <none>
1   427 W 369124564 1024 nfsd <none>
1   427 W 369124565 1024 nfsd <none>
1   427 W 369124566 512  nfsd <none>
1   427 W 369124566 512  nfsd <none>
1   427 W 369124565 512  nfsd <none>
1   427 W 369124565 512  nfsd <none>
1   427 W 369124565 512  nfsd <none>
1   427 W 369124565 512  nfsd <none>
1   427 W 369124565 1024 nfsd <none>
1   427 W 369124566 1024 nfsd <none>
1   427 W 369124567 512  nfsd <none>
1   427 W 369124567 512  nfsd <none>
1   427 W 369124567 512  nfsd <none>
1   427 W 369124567 512  nfsd <none>
1   427 W 369124566 512  nfsd <none>
1   427 W 369124566 512  nfsd <none>
1   427 W 369124566 1024 nfsd <none>
1   427 W 369124567 1024 nfsd <none>
1   427 W 369124568 512  nfsd <none>
1   427 W 369124568 512  nfsd <none>
1   427 W 369124568 512  nfsd <none>
1   427 W 369124568 512  nfsd <none>
1   427 W 369124568 512  nfsd <none>
1   427 W 369124568 512  nfsd <none>
1   427 W 369124567 1024 nfsd <none>
1   427 W 369124569 1024 nfsd <none>
1   427 W 369124569 512  nfsd <none>

Looks to me like the bulk of my problem is poor block-size scheduling. Is this tunable for either ZFS or NFS, and if so, how can it be set?
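To make the difference concrete, the SIZE column of iosnoop-style output can be tallied with a small post-processing script. This is a hypothetical helper, not part of the thread; the sample traces below are a few records copied from the two outputs above.

```python
from collections import Counter

def size_histogram(trace):
    """Tally write sizes from iosnoop-style records:
    UID PID D BLOCK SIZE COMM PATHNAME (fields[4] is SIZE)."""
    sizes = Counter()
    for line in trace.strip().splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[2] == "W":
            sizes[int(fields[4])] += 1
    return sizes

# A few records from the NFS-to-ZFS (non-iSCSI) trace:
zfs_trace = """\
1 427 W 22416320 4096 nfsd <none>
1 427 W 22416328 4096 nfsd <none>
1 427 W 22474784 16384 nfsd <none>
1 427 W 22039040 8192 nfsd <none>
"""
# A few records from the iSCSI-backed trace:
iscsi_trace = """\
1 427 W 369124510 512 nfsd <none>
1 427 W 369124510 512 nfsd <none>
1 427 W 369124564 1024 nfsd <none>
"""
print(size_histogram(zfs_trace))    # sizes cluster at 4 KB and up
print(size_histogram(iscsi_trace))  # sizes cluster at 512 B / 1 KB
```

Run over the full traces, this makes the pattern obvious: the non-iSCSI case writes in 4 KB-and-larger chunks, while the iSCSI-backed case is dominated by 512-byte writes.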
On 5/5/06, Lisa Week <[EMAIL PROTECTED]> wrote:
These may help: http://opensolaris.org/os/community/dtrace/scripts/
Check out iosnoop.d

http://www.solarisinternals.com/si/dtrace/index.php
Check out iotrace.d

- Lisa

Joe Little wrote On 05/05/06 18:59,:
> Are there known i/o or iscsi dtrace scripts available?
>
> On 5/5/06, Spencer Shepler <[EMAIL PROTECTED]> wrote:
>> On Fri, Joe Little wrote:
>>> On 5/5/06, Eric Schrock <[EMAIL PROTECTED]> wrote:
>>>> On Fri, May 05, 2006 at 03:46:08PM -0700, Joe Little wrote:
>>>>> Thanks for the tip. In the local case, I could send to the
>>>>> iSCSI-backed ZFS RAIDZ at even faster rates, with a total elapsed
>>>>> time of 50 seconds (17 seconds better than UFS). However, I didn't
>>>>> even bother finishing the NFS client test, since it was taking a
>>>>> few seconds between multiple 27K files. So it didn't help NFS at
>>>>> all. I'm wondering if there is something on the NFS end that needs
>>>>> changing, no?
>>>>
>>>> Keep in mind that turning off this flag may corrupt on-disk state
>>>> in the event of power loss, etc. What was the delta in the local
>>>> case? 17 seconds better than UFS, but percentage-wise how much
>>>> faster than the original?
>>>
>>> I believe it was only about 5-10% faster. I don't have the time
>>> results off hand, just some dtrace latency reports.
>>>
>>>> NFS has the property that it does an enormous amount of synchronous
>>>> activity, which can tickle interesting pathologies. But it's
>>>> strange that it didn't help NFS that much.
>>>
>>> Should I also mount via async.. would this be honored on the Solaris
>>> end? The other option mentioned with similar caveats was nocto. I
>>> just tried with both, and the observed transfer rate was about
>>> 1.4k/s. It was painful deleting the 3G directory via NFS, with about
>>> a 100k/s deletion rate on these 1000 files. Of course, when I went
>>> locally the delete was instantaneous.
>>
>> I wouldn't change any of the options at the client.
>> The issue is at the server side, and none of the other combinations
>> that you originally pointed out have this problem, right? Mount
>> options at the client will just muddy the waters.
>>
>> We need to understand if/what the NFS/ZFS/iscsi interaction is and
>> why it is so much worse. As Eric mentioned, there may be some
>> interesting pathologies at play here, and we need to understand what
>> they are so they can be addressed.
>>
>> My suggestion is additional dtrace data collection, but I don't have
>> a specific suggestion as to how/what to track next. Because of the
>> significant additional latency, I would be looking for big increases
>> in the number of I/Os being generated to the iscsi backend as
>> compared to the local attached case. I would also look for some type
>> of serialization of I/Os that is occurring with iscsi vs. the local
>> attach.
>>
>> Spencer
>
> _______________________________________________
> nfs-discuss mailing list
> [EMAIL PROTECTED]
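Spencer's suggestion above (look for an increase in I/O count and for I/Os being serialized against the iSCSI backend) can be checked against the traces already captured. The iSCSI trace at the top of the thread shows the same BLOCK offset written repeatedly, which a small script can flag. This is a hypothetical post-processing sketch, not something from the thread; the sample records are copied from the iSCSI trace above.

```python
from collections import Counter

def rewrite_counts(trace):
    """Count writes per BLOCK offset (fields[3]) in iosnoop-style output.
    Offsets written more than once hint at rewrite/serialization behavior."""
    hits = Counter()
    for line in trace.strip().splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[2] == "W":
            hits[int(fields[3])] += 1
    return {blk: n for blk, n in hits.items() if n > 1}

# Records copied from the iSCSI-backed trace above:
iscsi_sample = """\
1 427 W 369124510 512 nfsd <none>
1 427 W 369124510 512 nfsd <none>
1 427 W 369124510 512 nfsd <none>
1 427 W 369124510 512 nfsd <none>
1 427 W 369124510 1024 nfsd <none>
1 427 W 369124564 1024 nfsd <none>
"""
print(rewrite_counts(iscsi_sample))  # {369124510: 5}
```

Comparing this count (and the total number of records) between the iSCSI-backed and locally attached runs would quantify how much extra I/O the iSCSI path is generating.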
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss