Hi,
after some testing with ZFS I noticed that read requests are not scheduled
evenly across the drives of a mirror; the first drive gets selected
predominantly. My pool is set up as follows:
NAME        STATE     READ WRITE CKSUM
tpc         ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c1t0d0  ONLINE       0     0     0
    c4t0d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c1t1d0  ONLINE       0     0     0
    c4t1d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c1t2d0  ONLINE       0     0     0
    c4t2d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c1t3d0  ONLINE       0     0     0
    c4t3d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c1t4d0  ONLINE       0     0     0
    c4t4d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c1t6d0  ONLINE       0     0     0
    c4t6d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c1t7d0  ONLINE       0     0     0
    c4t7d0  ONLINE       0     0     0
Disk I/O after doing some benchmarking ("zpool iostat -v"):
              capacity     operations    bandwidth
pool        used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tpc         7.70G  50.9G     85     21  10.5M  1.08M
  mirror    1.10G  7.28G     11      3  1.47M   159K
    c1t0d0      -      -     10      2  1.34M   159K
    c4t0d0      -      -      1      2   138K   159K
  mirror    1.10G  7.27G     11      3  1.48M   159K
    c1t1d0      -      -     10      2  1.34M   159K
    c4t1d0      -      -      1      2   140K   159K
  mirror    1.09G  7.28G     12      3  1.50M   159K
    c1t2d0      -      -     10      2  1.37M   159K
    c4t2d0      -      -      0      2   128K   159K
  mirror    1.10G  7.28G     12      3  1.53M   158K
    c1t3d0      -      -     11      2  1.42M   158K
    c4t3d0      -      -      0      2   110K   158K
  mirror    1.10G  7.28G     11      3  1.44M   158K
    c1t4d0      -      -     10      2  1.33M   158K
    c4t4d0      -      -      0      2   112K   158K
  mirror    1.10G  7.28G     12      3  1.53M   158K
    c1t6d0      -      -     11      2  1.42M   158K
    c4t6d0      -      -      0      2   106K   158K
  mirror    1.11G  7.26G     12      3  1.55M   158K
    c1t7d0      -      -     11      2  1.42M   158K
    c4t7d0      -      -      1      2   130K   158K
----------  -----  -----  -----  -----  -----  -----
or with "iostat"
   r/s    w/s    kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
  11.4    4.3  1451.1  157.1   0.0   0.3     0.4    19.6   0  17  c1t7d0
   0.0    0.0     0.0    0.0   0.0   0.0     0.0     0.0   0   0  c4t5d0
  10.7    4.3  1361.4  158.4   0.0   0.3     0.4    22.1   0  18  c1t0d0
  10.9    4.3  1395.7  157.9   0.0   0.3     0.4    18.6   0  16  c1t2d0
   1.0    4.3   129.0  157.1   0.0   0.0     0.8     8.9   0   2  c4t7d0
   0.9    4.3   112.0  156.9   0.0   0.0     0.9     9.4   0   2  c4t4d0
   1.1    4.4   139.5  158.3   0.0   0.0     0.9     8.8   0   3  c4t1d0
  10.6    4.3  1354.8  157.0   0.0   0.3     0.4    18.8   0  16  c1t4d0
   0.9    4.3   109.2  157.3   0.0   0.1     0.9     9.7   0   3  c4t3d0
  10.7    4.4  1363.4  158.3   0.0   0.3     0.4    21.9   0  18  c1t1d0
   0.0    0.0     0.0    0.0   0.0   0.0     0.0     0.0   0   0  c4t8d0
   1.0    4.3   127.0  157.8   0.0   0.0     0.9     9.0   0   2  c4t2d0
   0.0    0.0     0.0    0.0   0.0   0.0     0.0     0.0   0   0  c1t8d0
  11.4    4.3  1449.9  156.9   0.0   0.3     0.4    20.0   0  17  c1t6d0
   0.8    4.3   105.4  156.8   0.0   0.0     0.9     8.5   0   2  c4t6d0
  11.3    4.3  1447.4  157.4   0.0   0.3     0.4    18.9   0  17  c1t3d0
   1.1    4.4   137.7  158.4   0.0   0.0     0.9     8.8   0   2  c4t0d0
So you can see that the second disk of each mirror pair (c4tXd0) gets almost no
read I/O, while the writes are spread evenly as expected. How does ZFS decide
which mirror device to read from?
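From a quick look at usr/src/uts/common/fs/zfs/vdev_mirror.c it seems to me
that the preferred child for a read is derived from the I/O offset alone (in
2 MB chunks) rather than from the current load on each disk. Below is a
simplified sketch of what I think the selection boils down to; this is only my
reading of the code, not the actual source, so please correct me if I got it
wrong:

/*
 * My (possibly wrong) reading of the preferred-child selection in
 * vdev_mirror.c: the offset space is divided into 2^vdev_mirror_shift
 * byte (2 MB) regions and each region maps to one mirror child; the
 * per-disk queue depth does not seem to enter the decision.
 */
#include <stdint.h>

static int vdev_mirror_shift = 21;	/* 2 MB regions */

static int
preferred_read_child(uint64_t io_offset, int children)
{
	return ((int)((io_offset >> vdev_mirror_shift) % (uint64_t)children));
}

If that is roughly right I would have expected the reads to spread more evenly
over both halves of each mirror than they do above, so either I am misreading
the code or something else is going on.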
And one more observation:
SVM exposes kstat data of type KSTAT_TYPE_IO for its metadevices. Why doesn't
ZFS provide the same, at least at the zpool level?
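To illustrate what I mean: the small libkstat consumer below (quickly hacked
together for this mail, compiled with "cc kio.c -lkstat"; the file name is of
course arbitrary) lists every KSTAT_TYPE_IO provider on the system. The SVM
metadevices and the physical disks show up in that list, but as far as I can
tell there is nothing per pool or per vdev for ZFS:

#include <stdio.h>
#include <kstat.h>

int
main(void)
{
	kstat_ctl_t	*kc;
	kstat_t		*ksp;
	kstat_io_t	kio;

	if ((kc = kstat_open()) == NULL) {
		perror("kstat_open");
		return (1);
	}

	/* walk the kstat chain and print every I/O kstat */
	for (ksp = kc->kc_chain; ksp != NULL; ksp = ksp->ks_next) {
		if (ksp->ks_type != KSTAT_TYPE_IO)
			continue;
		if (kstat_read(kc, ksp, &kio) == -1)
			continue;
		(void) printf("%s:%d:%s reads=%u writes=%u "
		    "nread=%llu nwritten=%llu\n",
		    ksp->ks_module, ksp->ks_instance, ksp->ks_name,
		    kio.reads, kio.writes, kio.nread, kio.nwritten);
	}
	(void) kstat_close(kc);
	return (0);
}

Per-pool (or per-vdev) kstats of that type would make imbalances like the one
above much easier to spot with the standard tools.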
And BTW (not ZFS related, but SVM):
with the introduction of the SVM Bunnahabhain project (friendly names), the
"iostat -n" output is now completely useless for metadevices, even if you still
use the old naming scheme:
% iostat -n
                           extended device statistics
   r/s    w/s   kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     2.3   0   0  c0d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     2.4   0   0  c0d1
   0.0    5.0    0.7   21.8   0.0   0.0     0.0     1.5   0   1  c3d0
   0.0    4.1    0.6   20.9   0.0   0.0     0.0     2.8   0   1  c4d0
   1.6   37.3   16.6  164.3   0.1   0.1     2.5     1.6   1   5  c2d0
   1.6   37.5   16.5  164.5   0.1   0.1     3.2     1.7   1   5  c1d0
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  fd0
   2.9    1.9   19.3    4.8   0.0   0.2     0.3    37.2   0   1  md5
   0.0    0.0    0.0    0.0   0.0   0.0     0.0    19.9   0   0  md12
   0.0    0.0    0.0    0.0   0.0   0.0     0.0    12.4   0   0  md13
   0.0    0.0    0.0    0.0   0.0   0.0     3.9    17.7   0   0  md14
   1.5    1.9    9.6    4.8   0.0   0.1     0.0    35.7   0   0  md15
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  md16
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  md17
   0.0    0.0    0.0    0.0   0.0   0.0     0.0     0.0   0   0  md18
   1.5    1.9    9.6    4.8   0.0   0.1     0.0    27.7   0   0  md19
Instead of "mdXXX" is was expecting the following names:
% ls -lL /dev/md/dsk
total 0
brw-r-----   1 root     sys       85,  5 May 26 00:43 d1
brw-r-----   1 root     sys       85, 15 May 26 00:43 root-0
brw-r-----   1 root     sys       85, 19 May 26 00:43 root-1
brw-r-----   1 root     sys       85, 18 May 26 00:43 scratch
brw-r-----   1 root     sys       85, 16 May 26 00:43 scratch-0
brw-r-----   1 root     sys       85, 17 May 26 00:43 scratch-1
brw-r-----   1 root     sys       85, 14 May 25 17:51 swap
brw-r-----   1 root     sys       85, 12 May 26 00:43 swap-0
brw-r-----   1 root     sys       85, 13 May 26 00:43 swap-1
Daniel