Re: [zfs-discuss] creating a fast ZIL device for $200

Tomas Ögren Wed, 26 May 2010 07:47:59 -0700

On 26 May, 2010 - sensille sent me these 4,5K bytes:

> Edward Ned Harvey wrote:
> >> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> >> boun...@opensolaris.org] On Behalf Of sensille
> >>
> >> The basic idea: the main problem when using a HDD as a ZIL device
> >> are the cache flushes in combination with the linear write pattern
> >> of the ZIL. This leads to a whole rotation of the platter after
> >> each write, because after the first write returns, the head is
> >> already past the sector that will be written next.
> >> My idea goes as follows: don't write linearly. Track the rotation
> >> and write to the position the head will hit next. This might be done
> >> by a re-mapping layer or integrated into ZFS. This works only because
> >> ZIL devices are basically write-only. Reads from this device will be
> >> horribly slow.
> > 
> > The reason why hard drives are less effective as ZIL dedicated log devices
> > compared to such things as SSD's, is because of the rotation of the hard
> > drives; the physical time to seek a random block.  There may be a
> > possibility to use hard drives as dedicated log devices, cheaper than SSD's
> > with possibly comparable latency, if you can intelligently eliminate the
> > random seek.  If you have a way to tell the hard drive "Write this data, to
> > whatever block happens to be available at minimum seek time."
> 
> Thanks for rephrasing my idea :) The only thing I'd like to point out is that
> ZFS doesn't do random writes on a slog, but nearly linear writes. This might
> even be hurting performance more than random writes, because you always hit
> the worst case of one full rotation.


A simple test would be to change "write block X", "write block X+1",
"write block X+2" into "write block X", "write block X+4", "write block
X+8" or similar, so that the next command has a chance of arriving before
the head has travelled as far as block X+4, and so on.
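
Something like the following could serve as a quick way to try it. This is
only a rough sketch, not anything ZFS does: the function name
timed_strided_writes and the defaults (200 writes, 512-byte blocks) are
made up for illustration, and it assumes a Unix-like system where
os.pwrite is available. To say anything about platter rotation it would
have to be pointed at a raw scratch disk whose contents you do not need,
and os.fsync may or may not translate into a real cache flush on a given
device, so treat the numbers with care.

```python
import os
import time


def timed_strided_writes(path, stride, count=200, block_size=512):
    """Write `count` blocks that are `stride` blocks apart, syncing after
    each write, and return the average wall-clock seconds per write.

    `path` must already exist; anything stored at the written offsets
    is overwritten.
    """
    buf = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY)
    try:
        start = time.perf_counter()
        for i in range(count):
            # Block i goes to X + i*stride, with X = 0 here.
            os.pwrite(fd, buf, i * stride * block_size)
            # Ask for the data to be pushed out before the next write,
            # roughly like the flush after each ZIL write.
            os.fsync(fd)
        return (time.perf_counter() - start) / count
    finally:
        os.close(fd)
```

Running it with stride 1, 2, 4, 8 and so on against the same device and
comparing the averages should show whether skipping blocks helps at all.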

I guess that basically you want to do something like TCQ/NCQ, but without
the queue: placing each write optimally.

> > So you believe you can know the drive geometry, the instantaneous head
> > position, and the next available physical block address in software?  No
> > need for special hardware?  That's cool.  I hope there aren't any "gotchas"
> > as-yet undiscovered.
> 
> Yes, I already did a mapping of several drives. I measured at least the track
> length, the interleave needed between two writes and the interleave if a
> track-to-track seek is involved. Of course you can always learn more about a
> disk, but that's a good starting point.

Since X, X+1, X+2 seems to be the worst possible case, try just
skipping over a few blocks. Doubling (or so) the performance with a
single software tweak would surely be welcome.
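
To show why X+1 is the bad case, here is a toy model. Everything in it is
an assumption for illustration: the name time_per_write_ms, the 7200 rpm
and 1000 sectors per track, and the idea that the head passes 12 sectors
while the next command is on its way. It also assumes single-sector writes
that all stay on one track, with the stride shorter than the track, and it
ignores track-to-track seeks entirely.

```python
def time_per_write_ms(stride, rpm=7200, sectors_per_track=1000,
                      turnaround_sectors=12):
    """Modelled time for one synchronous single-sector write, in ms.

    All default values are made-up example numbers, not measurements.
    """
    rotation_ms = 60_000 / rpm
    sector_ms = rotation_ms / sectors_per_track
    # After writing sector s the head is at s+1. While the next command
    # is being issued it moves on by `turnaround_sectors`. The target is
    # s+stride, so the head still has to cover this many sectors; if the
    # target has already gone by, the modulo turns that into a wait of
    # almost a full rotation.
    gap = (stride - 1 - turnaround_sectors) % sectors_per_track
    # Turnaround, rotational wait, then the write itself (one sector).
    return (turnaround_sectors + gap + 1) * sector_ms


if __name__ == "__main__":
    for stride in (1, 4, 8, 13, 16):
        print(stride, round(time_per_write_ms(stride), 3))
```

In this model any stride that is too small costs about one full rotation
per write, and the best stride is the smallest one that is still larger
than the turnaround. On a real disk that turnaround would have to be
measured.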

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
