On Tue, 17 Jul 2012, Michael Hase wrote:

>> If you were to add a second vdev (i.e. stripe) then you should see very close to 200% due to the default round-robin scheduling of the writes.
>
> My expectation would be > 200%, as 4 disks are involved. It may not be the perfect 4x scaling, but IMHO it should be (and is for a SCSI system) more than half of the theoretical throughput. This is Solaris or a Solaris derivative, not Linux ;-)

Here are some results from my own machine based on the 'virgin mount' test approach. They show less boost than a benchmark tool like 'iozone' reports, since iozone benefits from caching.

I get an initial sequential read speed of 657 MB/s on my new pool, which has 1200 MB/s of raw bandwidth (if the mirrors could produce a 100% read boost). Reading the file a second time reports 6.9 GB/s.

The transcript below uses a 2.6 GB test file; with a 26 GB test file (just add another zero to 'count' and wait longer) I see an initial read rate of 618 MB/s and a re-read rate of 8.2 GB/s. A single raw disk can transfer 150 MB/s.
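
For anyone who wants to repeat this, the whole 'virgin mount' sequence fits in a few commands. A minimal sketch (the dataset name and file size are just the ones used in the transcript below; adjust to taste):

  #!/bin/sh
  # Write a test file, then unmount/remount the dataset so that the
  # first read afterwards comes from disk rather than from the ARC.
  FS=tank/zfstest/defaults
  DIR=/$FS
  pfexec zfs create -p $FS
  pfexec dd if=/dev/urandom of=$DIR/random.dat bs=128k count=20000
  pfexec zfs umount $FS
  pfexec zfs mount $FS
  dd if=$DIR/random.dat of=/dev/null bs=128k count=20000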

% zpool status
   pool: tank
  state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
         still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
         pool will no longer be accessible on older software versions.
   scan: scrub repaired 0 in 0h10m with 0 errors on Mon Jul 16 04:30:48 2012
config:

         NAME                          STATE     READ WRITE CKSUM
         tank                          ONLINE       0     0     0
           mirror-0                    ONLINE       0     0     0
             c7t50000393E8CA21FAd0p0   ONLINE       0     0     0
             c11t50000393D8CA34B2d0p0  ONLINE       0     0     0
           mirror-1                    ONLINE       0     0     0
             c8t50000393E8CA2066d0p0   ONLINE       0     0     0
             c12t50000393E8CA2196d0p0  ONLINE       0     0     0
           mirror-2                    ONLINE       0     0     0
             c9t50000393D8CA82A2d0p0   ONLINE       0     0     0
             c13t50000393E8CA2116d0p0  ONLINE       0     0     0
           mirror-3                    ONLINE       0     0     0
             c10t50000393D8CA59C2d0p0  ONLINE       0     0     0
             c14t50000393D8CA828Ed0p0  ONLINE       0     0     0

errors: No known data errors
% pfexec zfs create tank/zfstest
% pfexec zfs create tank/zfstest/defaults
% cd /tank/zfstest/defaults
% pfexec dd if=/dev/urandom of=random.dat bs=128k count=20000
20000+0 records in
20000+0 records out
2621440000 bytes (2.6 GB) copied, 36.8133 s, 71.2 MB/s
% cd ..
% pfexec zfs umount tank/zfstest/defaults
% pfexec zfs mount tank/zfstest/defaults
% cd defaults
% dd if=random.dat of=/dev/null bs=128k count=20000
20000+0 records in
20000+0 records out
2621440000 bytes (2.6 GB) copied, 3.99229 s, 657 MB/s
% pfexec dd if=/dev/rdsk/c7t50000393E8CA21FAd0p0 of=/dev/null bs=128k count=2000
2000+0 records in
2000+0 records out
262144000 bytes (262 MB) copied, 1.74532 s, 150 MB/s
% bc
scale=8
657/150
4.38000000
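
Put another way, 657 MB/s is roughly 55% of the 1200 MB/s raw aggregate, and only about 10% more than the ~600 MB/s that four disks' worth of plain striping would give with no mirror read boost at all:

% bc
scale=8
657/(8*150)
.54750000
657/(4*150)
1.09500000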

It is very difficult to benchmark with a cache which works so well:

% dd if=random.dat of=/dev/null bs=128k count=20000
20000+0 records in
20000+0 records out
2621440000 bytes (2.6 GB) copied, 0.379147 s, 6.9 GB/s
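
One way to take the ARC mostly out of the picture for this sort of test (not used in the transcript above, just a possibility) is to stop ZFS from caching file data on the test dataset:

% pfexec zfs set primarycache=metadata tank/zfstest/defaults

With primarycache=metadata only metadata is kept in the ARC, so every read of random.dat has to go to the disks; set it back to 'all' (or 'zfs inherit primarycache tank/zfstest/defaults') when done.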

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
