> From: edmud...@mail.bounceswoosh.org
> [mailto:edmud...@mail.bounceswoosh.org] On Behalf Of Eric D. Mudama
> 
> >Unless your drive is able to queue up a request to read every single used
> >part of the drive...  Which is larger than the command queue for any
> >reasonable drive in the world...  The point is, in order to be "optimal"
> >you have to eliminate all those seeks, and perform sequential reads only.
> >The only seeks you should do are to skip over unused space.
> 
> I don't think you read my whole post.  I was saying this seek
> calculation pre-processing would have to be done by the host server,
> and while not impossible, is not trivial.  Present the next 32 seeks
> to each device while the pre-processor works on the complete list of
> future seeks, and the drive will do as well as possible.

I did read that, but now I think perhaps I misunderstood it, or you
misunderstood me.  If you're only queueing up a few reads at a time (far
less than the whole pool), I would not assume those 32 seeks are even
remotely sequential.  32 blocks out of a pool of presumably millions of
blocks...  I would assume they are essentially random, are they not?
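
A quick back-of-envelope, with round numbers I'm making up just to
illustrate: 32 uniformly random targets in a pool of ~10^8 blocks divide
the disk into 33 intervals, so even after the drive sorts its queue, the
expected gap between consecutive reads is

  pool_size / 33  ~=  3% of the disk per seek

which is still effectively a full-stroke random seek on every command.
NCQ can only reorder within those 32; it can't make them adjacent.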

In my mind, which is likely wrong or at least oversimplified, if you want
to order the list of blocks to read according to disk order (which should
at least be theoretically possible on mirrors, but perhaps not even
physically possible on raidz), you would have to first generate a list of
all the blocks to be read, and then sort it.  Rough estimate: for any pool
of a reasonable size, that sounds like some GB of RAM to me.
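
To put a hedged number on that (16 bytes per entry is my guess for an
offset plus a length, not ZFS's actual on-disk record):

  100 TB of data / 128 KB average block size  ~=  8 * 10^8 blocks
  8 * 10^8 entries * 16 bytes/entry           ~=  12.8 GB

so a full in-memory sort of the block list really does land in the
multi-GB range for a big pool.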

Maybe there's a less-than-perfect sort algorithm with a much lower memory
footprint?  Something like a simple bucketing (radix-style) scheme that
guarantees the next few thousand seeks are in disk order, even though it
skips over many blocks that will have to be read in a later pass.  An
algorithm which is not a perfect sort, but which, given multiple passes
over the disk, might strike an acceptable balance of performance versus
memory footprint...
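
For what it's worth, here is a minimal sketch of that multi-pass idea in
Python.  Every name in it is invented for illustration, and it assumes
you can cheaply re-enumerate the block list, which may not hold in
practice -- this is not how ZFS actually walks a pool:

def rough_sorted_reads(enumerate_blocks, read_block, disk_size, window):
    """Read every block roughly in disk order using bounded memory.

    enumerate_blocks() yields (offset, length) pairs in metadata-walk
    order; read_block() issues the actual I/O.  Each pass keeps only
    the entries whose offsets fall inside the current window, sorts
    just those, and issues them in offset order.
    """
    lo = 0
    while lo < disk_size:
        hi = lo + window
        # One full walk of the block list, retaining only this window.
        batch = [(off, length) for (off, length) in enumerate_blocks()
                 if lo <= off < hi]
        # Within the window, reads go out strictly in disk order.
        for off, length in sorted(batch):
            read_block(off, length)
        lo = hi

The trade-off is the one you'd expect: disk_size/window full walks over
the metadata, in exchange for a disk-ordered I/O stream and a sort buffer
bounded by the densest window rather than by the whole pool.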
