2012-03-21 21:40, Marion Hakanson wrote:
> Small, random read performance does not scale with the number of drives
> in each raidz[123] vdev because of the dynamic striping. In order to read
> a single logical block, ZFS has to read all the segments of that logical
> block, which have been spread out across multiple drives, in order to
> validate the checksum before returning that logical block to the
> application. This is why a single vdev's random-read performance is
> equivalent to the random-read performance of a single drive.
True, but if the stars align so that all the sectors of a block are read in parallel from the several drives of the top-level vdev, with no (substantial) *latency* lost waiting between the first and the last drive completing its part of the request, then the *aggregate bandwidth* of the array should be similar to that of a plain stripe. This gain would probably be hidden by caches and averages, unless the stars keep aligning for many blocks in a row, such as a sequential, uninterrupted read of a file that was also written out sequentially, so that the component drives could stream it off the platter track by track. Ah, what a wonderful world that would be! ;)

Also, after a sector is read by the disk and passed to the OS, it is presumably cached somewhere until all the sectors of the block have arrived and the checksum matches; during this time the HDD is free to service other queued mechanical tasks. I am not sure which cache that would be: it is too early for the ARC, since there is no complete block yet, and the vdev caches now drop non-metadata sectors. Perhaps it is just a temporary buffer in the reading routine that gathers all the pieces of the block together before passing it to the reader (and into the ARC)...

//Jim
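P.S. Here is a throwaway toy model (Python; all the drive numbers are made up, not measured) of the two points above: the latency of one logical block read is set by the slowest drive holding a segment, so a raidz vdev's small-random-read IOPS stays near that of a single drive, while a sequential read of a sequentially written file can still approach the summed bandwidth of the data drives.

    import random

    # Toy model of a raidz top-level vdev, illustrating the points above.
    # All numbers (seek/rotate latency, per-drive streaming rate) are
    # assumptions for illustration, not measured ZFS or drive behaviour.

    DATA_DRIVES = 6            # data drives in the raidz vdev (hypothetical)
    DRIVE_STREAM_MB_S = 150.0  # per-drive sequential throughput (hypothetical)

    def block_read_latency_ms(n_drives: int) -> float:
        """One logical block: every segment must arrive before the checksum
        can be verified, so the read completes when the *slowest* drive does."""
        per_drive_ms = [random.uniform(4.0, 12.0) for _ in range(n_drives)]
        return max(per_drive_ms)

    def main() -> None:
        random.seed(1)
        lat = [block_read_latency_ms(DATA_DRIVES) for _ in range(10_000)]
        mean_ms = sum(lat) / len(lat)

        # Small random reads: the vdev finishes roughly 1000/mean_ms blocks
        # per second -- close to what a single drive would manage alone.
        print(f"mean block latency:  {mean_ms:5.1f} ms")
        print(f"random-read IOPS:    {1000.0 / mean_ms:5.0f} (per vdev, ~one drive)")

        # Sequential, well-laid-out file: segments stream from all data
        # drives in parallel, so aggregate bandwidth roughly sums.
        print(f"streaming bandwidth: {DATA_DRIVES * DRIVE_STREAM_MB_S:5.0f} MB/s")

    if __name__ == "__main__":
        main()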