On 26 May, 2010 - sensille sent me these 4,5K bytes:

> Edward Ned Harvey wrote:
> >> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> >> boun...@opensolaris.org] On Behalf Of sensille
> >>
> >> The basic idea: the main problem when using a HDD as a ZIL device
> >> are the cache flushes in combination with the linear write pattern
> >> of the ZIL. This leads to a whole rotation of the platter after
> >> each write, because after the first write returns, the head is
> >> already past the sector that will be written next.
> >> My idea goes as follows: don't write linearly. Track the rotation
> >> and write to the position the head will hit next. This might be done
> >> by a re-mapping layer or integrated into ZFS. This works only because
> >> ZIL device are basically write-only. Reads from this device will be
> >> horribly slow.
> >
> > The reason why hard drives are less effective as ZIL dedicated log devices
> > compared to such things as SSD's, is because of the rotation of the hard
> > drives; the physical time to seek a random block. There may be a
> > possibility to use hard drives as dedicated log devices, cheaper than SSD's
> > with possibly comparable latency, if you can intelligently eliminate the
> > random seek. If you have a way to tell the hard drive "Write this data, to
> > whatever block happens to be available at minimum seek time."
>
> Thanks for rephrasing my idea :) The only thing I'd like to point out is that
> ZFS doesn't do random writes on a slog, but nearly linear writes. This might
> even be hurting performance more than random writes, because you always hit
> the worst case of one full rotation.

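
To put rough numbers on that worst case, here is a toy model. It is a sketch
only, not measured on any drive. It assumes a single track with a fixed number
of sectors, one sector per write, and a fixed command turnaround during which
the platter keeps spinning. The names sectors_per_write and writes_per_second,
and the parameters stride, overhead and sectors_per_track, are made up for
illustration; stride 1 is the linear ZIL pattern.

```python
def sectors_per_write(stride, overhead, sectors_per_track):
    """Sector-times spent per write in a toy single-track model.

    stride            -- distance in blocks between consecutive writes (1 = linear)
    overhead          -- sectors that pass under the head between one write
                         completing and the next command arriving (assumed fixed)
    sectors_per_track -- track length in sectors (assumed constant, no zoning)
    """
    if stride < 1 or overhead < 0 or sectors_per_track < 1:
        raise ValueError("stride >= 1, overhead >= 0, sectors_per_track >= 1 required")
    # After writing block X the head sits at the start of X+1; after the
    # turnaround it sits at X+1+overhead. The next target starts at X+stride.
    wait = (stride - 1 - overhead) % sectors_per_track
    # Turnaround, rotational wait, then one sector-time for the write itself.
    return overhead + wait + 1


def writes_per_second(stride, overhead, sectors_per_track, rpm=7200):
    """Synchronous writes per second under the same toy model."""
    rotation_s = 60.0 / rpm
    sector_s = rotation_s / sectors_per_track
    return 1.0 / (sectors_per_write(stride, overhead, sectors_per_track) * sector_s)


if __name__ == "__main__":
    # Hypothetical geometry: 1000 sectors per track, 2 sectors of turnaround.
    for stride in (1, 2, 3, 4, 8):
        cost = sectors_per_write(stride, overhead=2, sectors_per_track=1000)
        rate = writes_per_second(stride, overhead=2, sectors_per_track=1000)
        print(f"stride {stride}: {cost} sector-times per write, {rate:.0f} writes/s")
```

In this model, with a turnaround of 2 sectors, stride 1 costs 1001 sector-times
per write (a full rotation plus the write itself) and stride 2 still misses at
1002, while stride 3 costs only 3. A real drive adds track switches, zoned
recording and variable command latency, so the right interleave has to be
measured per disk rather than computed like this.
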
A simple test would be to change "write block X", "write block X+1",
"write block X+2" into "write block X", "write block X+4", "write block X+8"
or something similar, so that the next command might reach the drive before
the head has travelled to block X+4, and so on. Basically, you want to do
something like TCQ/NCQ, but without the Q: placing writes optimally.

> > So you believe you can know the drive geometry, the instantaneous head
> > position, and the next available physical block address in software? No
> > need for special hardware? That's cool. I hope there aren't any "gotchas"
> > as-yet undiscovered.
>
> Yes, I already did a mapping of several drives. I measured at least the track
> length, the interleave needed between two writes and the interleave if a
> track-to-track seek is involved. Of course you can always learn more about a
> disk, but that's a good starting point.

Since X, X+1, X+2 seems to be the worst possible case, try just skipping over
a few blocks. Doubling (or so) the performance with a single software tweak
would surely be welcome.

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se