On Fri, Joe Little wrote:
> well, it was already an NFS-discuss list message. Someone else added
> dtrace-discuss to it. I have already noted this to a degree on
> zfs-discuss, but it seems to be mainly an NFS specific issue at this
> stage.
So I took the original data you posted and reformatted it to make it a little easier to compare.

The first table, the overall operation counts, shows that the client is generally sending the same number of requests. Ignore the FSINFO, NULL, and FSSTAT numbers, since they are a data collection anomaly. The counts match almost exactly, except for GETATTR in the NFS/ZFS combinations. That is a little odd and may be worth a look out of curiosity, but given the average response times for GETATTR in the second table, it shouldn't be an issue. So the client is sending the same number of ops; good to confirm that.

NFS3 op counts              UFS     ZFS (non-iscsi)  ZFS (iscsi)
===================================================================
RFS3_FSINFO                   -            -              1
RFS3_NULL                     -            -              1
RFS3_FSSTAT                   -            -              4
RFS3_SYMLINK                  5            5              5
RFS3_MKDIR                   73           73             73
RFS3_COMMIT                 885          885            885
RFS3_CREATE                 901          901            901
RFS3_RENAME                 901          901            901
RFS3_ACCESS                1863         1884           1958
RFS3_SETATTR               3792         3792           3792
RFS3_LOOKUP                3873         3945           3963
RFS3_GETATTR               7568         8479           8514
RFS3_WRITE                46844        46844          46844

This second table is the most interesting one. Comparing the ZFS non-iscsi and iscsi columns, it is easy to see that the metadata operations (CREATE, MKDIR, RENAME, SETATTR, SYMLINK) are on average an order of magnitude greater in response time. More at the end on why that is the source of the "throughput" issues you are observing.

NFS3 avg rsp time (usec)    UFS     ZFS (non-iscsi)  ZFS (iscsi)
===================================================================
RFS3_ACCESS                   9           10             13
RFS3_COMMIT               13521        72752         273027
RFS3_CREATE                 996        14806         156186
RFS3_FSINFO                   -            -          19043
RFS3_FSSTAT                   -            -             18
RFS3_GETATTR                  8            8             14
RFS3_LOOKUP                  10           13             24
RFS3_MKDIR                 4589        16591         223285
RFS3_NULL                     -            -            253
RFS3_RENAME                 222        14780         146412
RFS3_SETATTR                559        13785         138609
RFS3_SYMLINK                344        16018         118492
RFS3_WRITE                 9409           82           9931

And the system time measurements aren't too interesting, in that the extra CPU overhead is unlikely to be the biggest issue at this point.

NFS3 op avg sys time (usec) UFS     ZFS (non-iscsi)  ZFS (iscsi)
===================================================================
RFS3_ACCESS                   6            8             11
RFS3_COMMIT                 112         5069           6256
RFS3_CREATE                  72           63            105
RFS3_FSINFO                   -            -            118
RFS3_FSSTAT                   -            -             15
RFS3_GETATTR                  6            7             12
RFS3_LOOKUP                   8            9             13
RFS3_MKDIR                  105          195            387
RFS3_NULL                     -            -             14
RFS3_RENAME                  77           67            112
RFS3_SETATTR                 32           51             96
RFS3_SYMLINK                 96           53             84
RFS3_WRITE                  125           80             91

------

So the average response time table is the key. With the workload you described (creating directories and files of "smaller" size), the response times of operations like CREATE, MKDIR, RENAME, SETATTR, and SYMLINK become the limiting factor. The NFS protocol requires that for these operations (and COMMIT, for that matter) the associated file system metadata must be on stable storage before the server can respond that the operation is complete. In other words, each such request costs at least one synchronous write to the backend before the reply can go out, so the order-of-magnitude greater response times for those operations will slow the overall throughput to what you are observing.

With the data you provided for the UFS/iscsi config, there is additional response time being added by the iscsi backend, and that would be expected. However, I don't have a good explanation for why the ZFS/iscsi combination is so much more extreme than the UFS/iscsi combination. I do know why local access to the ZFS/iscsi config is faster: for local access, the stable-storage requirement for metadata updates is not present as it is for the NFS server, so a lot of the disk updates can be effectively aggregated and postponed.
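For reference, numbers like the three tables above can be gathered with a DTrace script along these lines. This is just a minimal sketch, assuming fbt probes on the kernel's rfs3_* NFSv3 dispatch routines are available on your build; it is not necessarily how the data above was collected. The elapsed (timestamp) delta corresponds to the response time table, the on-CPU (vtimestamp) delta to the sys time table:

#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Sketch: per-op NFSv3 server-side counts, average elapsed
 * (response) time, and average on-CPU (sys) time, via fbt
 * probes on the rfs3_* dispatch routines.
 */

fbt::rfs3_*:entry
{
        self->ts = timestamp;       /* wall-clock start */
        self->vts = vtimestamp;     /* on-CPU start */
}

fbt::rfs3_*:return
/self->ts/
{
        @cnt[probefunc] = count();
        @rsp[probefunc] = avg((timestamp - self->ts) / 1000);
        @sys[probefunc] = avg((vtimestamp - self->vts) / 1000);
        self->ts = 0;
        self->vts = 0;
}

dtrace:::END
{
        printf("%-16s %12s\n", "OP", "COUNT");
        printa("%-16s %@12d\n", @cnt);
        printf("\n%-16s %12s\n", "OP", "AVG RSP (usec)");
        printa("%-16s %@12d\n", @rsp);
        printf("\n%-16s %12s\n", "OP", "AVG SYS (usec)");
        printa("%-16s %@12d\n", @sys);
}

Note the wildcard will also match non-dispatch rfs3_* helper functions if any exist on your build, so the output may need a little filtering.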
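And since the stable-storage requirement means these synchronous operations end up waiting on the ZFS intent log, a rough way to see where the time goes on the ZFS/iscsi config would be to look at the elapsed time in zil_commit(). Again, just a sketch, assuming fbt can see that function on your build:

#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Sketch: distribution of elapsed time in zil_commit(), where
 * synchronous operations wait for the ZFS intent log (and,
 * in this config, the iscsi backend) to reach stable storage.
 */

fbt::zil_commit:entry
{
        self->zts = timestamp;
}

fbt::zil_commit:return
/self->zts/
{
        @["zil_commit elapsed (usec)"] =
            quantize((timestamp - self->zts) / 1000);
        self->zts = 0;
}

If the distribution there accounts for the bulk of the CREATE/MKDIR/RENAME response times, that would point squarely at the intent log's interaction with the iscsi backend.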
Spencer