On Fri, Joe Little wrote:
> well, it was already an NFS-discuss list message. Someone else added
> dtrace-discuss to it. I have already noted this to a degree on zfs-discuss,
> but it seems to be mainly a NFS specific issue at this stage.

So I took the original data you posted and reformatted it to
make it a little easier to compare.

The first table, the overall operation counts, shows that the client
is generally sending the same number of requests.  Ignore the
FSINFO, NULL, and FSSTAT numbers, since they are a data collection
anomaly.

The numbers are pretty much exact except for GETATTR when
the combination is NFS/ZFS.  That is a little odd and may be
something to look at out of curiosity, but given the average
response times in the second table for GETATTR, it shouldn't
be an issue.

So the client is sending the same number of ops.  Good
to confirm that.


NFS3 op counts           UFS            ZFS (non-iscsi) ZFS (iscsi)
===================================================================
RFS3_FSINFO              -              -               1
RFS3_NULL                -              -               1
RFS3_FSSTAT              -              -               4
RFS3_SYMLINK             5              5               5    
RFS3_MKDIR               73             73              73   
RFS3_COMMIT              885            885             885  
RFS3_CREATE              901            901             901  
RFS3_RENAME              901            901             901  
RFS3_ACCESS              1863           1884            1958 
RFS3_SETATTR             3792           3792            3792 
RFS3_LOOKUP              3873           3945            3963 
RFS3_GETATTR             7568           8479            8514 
RFS3_WRITE               46844          46844           46844


This second table is the most interesting one.

Comparing the ZFS non-iscsi and iscsi columns, it is easy to see
that the metadata operations (CREATE, MKDIR, RENAME, SETATTR,
SYMLINK) are on average an order of magnitude greater in response time.
More at the end on why that is the source of the "throughput" issues
you are observing.

NFS3 avg rsp time (usec)
                         UFS            ZFS (non-iscsi) ZFS (iscsi)
===================================================================
RFS3_ACCESS              9              10              13    
RFS3_COMMIT              13521          72752           273027
RFS3_CREATE              996            14806           156186
RFS3_FSINFO              -              -               19043
RFS3_FSSTAT              -              -               18
RFS3_GETATTR             8              8               14    
RFS3_LOOKUP              10             13              24    
RFS3_MKDIR               4589           16591           223285
RFS3_NULL                -              -               253   
RFS3_RENAME              222            14780           146412
RFS3_SETATTR             559            13785           138609
RFS3_SYMLINK             344            16018           118492
RFS3_WRITE               9409           82              9931  
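
To put a number on "order of magnitude", here is a quick sketch that
divides the ZFS (iscsi) average response time by the ZFS (non-iscsi)
one for the five metadata operations.  The values are copied from the
table above; the variable and function names are just mine for
illustration.

```python
# Average NFSv3 response times in microseconds, copied from the table above.
zfs_non_iscsi_usec = {
    "RFS3_CREATE": 14806,
    "RFS3_MKDIR": 16591,
    "RFS3_RENAME": 14780,
    "RFS3_SETATTR": 13785,
    "RFS3_SYMLINK": 16018,
}
zfs_iscsi_usec = {
    "RFS3_CREATE": 156186,
    "RFS3_MKDIR": 223285,
    "RFS3_RENAME": 146412,
    "RFS3_SETATTR": 138609,
    "RFS3_SYMLINK": 118492,
}


def slowdown(base, other):
    """Return other/base for every operation present in base."""
    return {op: other[op] / base[op] for op in base}


ratios = slowdown(zfs_non_iscsi_usec, zfs_iscsi_usec)
for op, ratio in sorted(ratios.items()):
    print(f"{op:<14} {ratio:5.1f}x slower over iscsi")
```

The ratios land between roughly 7x and 14x, which is what I mean by
an order of magnitude.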


And the system time measurements aren't too interesting 
in that the extra overhead is unlikely to be the biggest issue
at this point.

NFS3 op avg sys time (usec)
                         UFS            ZFS (non-iscsi) ZFS (iscsi)
===================================================================
RFS3_ACCESS              6              8               11  
RFS3_COMMIT              112            5069            6256
RFS3_CREATE              72             63              105 
RFS3_FSINFO              -              -               118
RFS3_FSSTAT              -              -               15
RFS3_GETATTR             6              7               12  
RFS3_LOOKUP              8              9               13 
RFS3_MKDIR               105            195             387
RFS3_NULL                -              -               14 
RFS3_RENAME              77             67              112
RFS3_SETATTR             32             51              96 
RFS3_SYMLINK             96             53              84 
RFS3_WRITE               125            80              91 
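
One way to see why the system time is not the interesting part is to
divide it by the response time for the same operation.  This sketch
does that for the ZFS (iscsi) column, assuming the "sys time" figures
are the CPU time the server spent on each op (that is my reading of
the data, not something the tables state).  Names are illustrative.

```python
# ZFS (iscsi) column: op -> (avg response usec, avg sys usec),
# copied from the two tables above.
zfs_iscsi = {
    "RFS3_COMMIT": (273027, 6256),
    "RFS3_CREATE": (156186, 105),
    "RFS3_MKDIR": (223285, 387),
    "RFS3_RENAME": (146412, 112),
    "RFS3_SETATTR": (138609, 96),
    "RFS3_SYMLINK": (118492, 84),
}

# Fraction of each op's response time accounted for by system time.
cpu_fraction = {op: sys_usec / rsp_usec
                for op, (rsp_usec, sys_usec) in zfs_iscsi.items()}

for op, frac in sorted(cpu_fraction.items()):
    print(f"{op:<14} {frac * 100:6.3f}% of response time is sys time")
```

Under that assumption the server is spending nearly all of each
operation's response time waiting rather than computing.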

------


So the average response time table is the key.  With the workload
you described (creating directories and files of "smaller" size)
the response time of the operations like CREATE, MKDIR, RENAME, SETATTR
and SYMLINK will become key.  The NFS protocol requires that for
these operations (and COMMIT for that matter), the associated
file system metadata must be on stable storage before the server
can respond that the operation is complete.

Therefore, the order of magnitude greater response times for those
operations will slow the overall throughput to what you are observing.
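
As a rough illustration, multiplying each op count by its average
response time shows where the time goes in the ZFS (iscsi) case.
This is only a sketch: the summed response time is not elapsed wall
clock time, since the client may have several requests outstanding
at once, and I have left out FSINFO, NULL and FSSTAT as noted
earlier.  The names are mine.

```python
# ZFS (iscsi) column: op -> (op count, avg response usec),
# copied from the first two tables.
zfs_iscsi = {
    "RFS3_ACCESS": (1958, 13),
    "RFS3_COMMIT": (885, 273027),
    "RFS3_CREATE": (901, 156186),
    "RFS3_GETATTR": (8514, 14),
    "RFS3_LOOKUP": (3963, 24),
    "RFS3_MKDIR": (73, 223285),
    "RFS3_RENAME": (901, 146412),
    "RFS3_SETATTR": (3792, 138609),
    "RFS3_SYMLINK": (5, 118492),
    "RFS3_WRITE": (46844, 9931),
}
METADATA_OPS = {"RFS3_CREATE", "RFS3_MKDIR", "RFS3_RENAME",
                "RFS3_SETATTR", "RFS3_SYMLINK"}


def total_seconds(table):
    """Summed response time per op, in seconds (count * average)."""
    return {op: count * usec / 1e6 for op, (count, usec) in table.items()}


totals = total_seconds(zfs_iscsi)
grand_total = sum(totals.values())
metadata_share = sum(totals[op] for op in METADATA_OPS) / grand_total

for op, secs in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{op:<14} {secs:9.1f} s")
print(f"metadata ops share of summed response time: {metadata_share:.0%}")
```

By this measure SETATTR alone accounts for more summed response time
than the 46844 WRITEs, and the five metadata operations together
are a bit over half of the total.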

With the data that you provided for the UFS/iscsi config there is
additional response time being added because of the iscsi backend
and that would be expected.  However, I don't have a good explanation
for why the ZFS/iscsi combination is so much more extreme than
the UFS/iscsi combination.

I do know why local access to the ZFS/iscsi config is faster:
for local access there is no requirement that metadata updates
reach stable storage before the operation returns, as there is for
the NFS server, so a lot of disk updates can be effectively
aggregated and postponed.

Spencer
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
