[assuming we're talking about disks and not "hardware RAID arrays"...]

On Tue, 2006-05-30 at 11:43 -0500, Anton Rang wrote:
> > Sure, the block size may be 128KB, but ZFS can bundle more than one
> > per-file/transaction
> 
> But it doesn't right now, as far as I can tell.  

The protocol overhead is still orders of magnitude smaller than the
time a disk takes to complete one revolution (a rev).  Sure, there are
pathological cases such as FC-AL over 200 km with 100+ nodes, but most
folks won't hurt themselves like that.
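
To put rough numbers on that comparison, here is a minimal sketch.
The revolution time follows directly from the spindle speed; the
per-command overhead figure (ASSUMED_OVERHEAD_US) is a hypothetical
placeholder of mine, not a measurement, so substitute whatever your
interconnect actually costs.

```python
def revolution_time_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    return 60_000.0 / rpm


def rev_to_overhead_ratio(rpm: float, overhead_us: float) -> float:
    """How many per-command overheads fit into a single revolution."""
    return revolution_time_ms(rpm) * 1000.0 / overhead_us


# Hypothetical per-command protocol overhead in microseconds.
# This is an assumed value for illustration only.
ASSUMED_OVERHEAD_US = 20.0

for rpm in (7200, 10_000, 15_000):
    rev_ms = revolution_time_ms(rpm)
    ratio = rev_to_overhead_ratio(rpm, ASSUMED_OVERHEAD_US)
    print(f"{rpm:>6} rpm: one rev = {rev_ms:.2f} ms, "
          f"about {ratio:.0f}x the assumed overhead")
```

The ratio scales inversely with the overhead you plug in, so the
conclusion only holds as long as the real overhead stays in the
microsecond range.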

For modern disks, multiple 128kByte transfers will spend a long time
in the disk's buffer cache waiting to be written to media.

> I never see ZFS issuing
> a 16 MB write, for instance.  You simply can't get the same performance
> from a disk array issuing 128 KB writes that you can with 16 MB writes.
> It's physically impossible because of protocol overhead, even if the
> controller itself were infinitely fast.  (There's also the issue that at
> 128 KB, most disk arrays will choose to cache rather than stream the
> data, since it's less than a single RAID stripe, which slows you down.)

Very few disks have 16MByte write buffer caches, so if you want to send
such a large iop down the wire (DAS please, otherwise you kill the SAN),
then you'll be waiting on the media anyway.  The disk interconnect is
faster than the media speed.  I don't see how you could avoid blowing
a rev in that case.  Surely there is a more generally applicable
blocksize which is appropriate.  Since many disks today do support
queued commands, I don't see the 128kByte iop as a large, inherent
limitation.  OTOH, the jury is still out...
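
A back-of-the-envelope sketch of the argument above, under stated
assumptions: the media rate (ASSUMED_MEDIA_RATE) and spindle speed
(ASSUMED_RPM) are illustrative values I picked, not figures for any
particular drive.  It estimates how many revolutions a write occupies
on the media, and how many 128 KB commands would have to be queued to
cover the same 16 MB.

```python
KIB = 1024
MIB = 1024 * KIB


def commands_needed(total_bytes: int, iop_bytes: int) -> int:
    """Number of fixed-size commands needed to cover total_bytes."""
    return -(-total_bytes // iop_bytes)  # ceiling division


def revolutions_on_media(nbytes: int, media_rate: float, rpm: float) -> float:
    """Revolutions that pass while nbytes stream to the media.

    media_rate is in bytes per second; seek and rotational latency
    are ignored, so this is a lower bound.
    """
    seconds = nbytes / media_rate
    return seconds * rpm / 60.0


# Assumed values for illustration only.
ASSUMED_MEDIA_RATE = 64 * MIB  # bytes per second
ASSUMED_RPM = 7200

for label, size in (("128 KB", 128 * KIB), ("16 MB", 16 * MIB)):
    revs = revolutions_on_media(size, ASSUMED_MEDIA_RATE, ASSUMED_RPM)
    print(f"{label}: about {revs:.2f} revolutions at the media rate")

print("128 KB commands per 16 MB:", commands_needed(16 * MIB, 128 * KIB))
```

With these assumptions the large write is bound by the media for many
revolutions regardless of how it arrives, which is why queued smaller
commands need not be an inherent disadvantage.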
 -- richard

