Re: [zfs-discuss] creating a fast ZIL device for $200

sensille Thu, 27 May 2010 10:45:40 -0700

Edward Ned Harvey wrote:
>> From: sensille [mailto:sensi...@gmx.net]
>>
>> The only thing I'd like to point out is that ZFS doesn't do random writes
>> on a slog, but nearly linear writes. This might even be hurting
>> performance more than random writes, because you always hit the worst
>> case of one full rotation.
> 
> Um ... I certainly have a doubt about this.  My understanding is that hard
> disks are already optimized for sustained sequential throughput.  I have a
> really hard time believing Seagate, WD, etc, designed their drives such that
> you read/write one track, then pause and wait for a full rotation, then
> read/write one track, and wait again, and so forth.  This would limit the
> drive to approx 50% duty cycle, and the market is very competitive.
> 
> Yes, I am really quite sure, without any knowledge at all, that the drive
> mfgrs are intelligent enough to map the logical blocks in such a way that
> sequential reads/writes which are larger than a single track will not suffer
> such a huge penalty.  Just a small penalty to jump up one track, and wait
> for a few degrees of rotation, not 360 degrees.


I'm afraid you got me wrong here. Of course the drives are optimized for
sequential reads/writes. If you give the drive a single read or write that
is larger than one track, the drive acts exactly as you described. The same
holds if you give the drive multiple smaller consecutive reads/writes in
advance (NCQ/TCQ) so that the drive can coalesce them into one big op.

But this is not what happens in the case of ZFS/ZIL with a single application.
The application requests a synchronous op. This request goes down into
ZFS, which in turn allocates a ZIL block, writes it to the disk and issues a
cache flush. Only after the cache flush completes can ZFS acknowledge the
op to the application. Now the application can issue the next op, for which
ZFS will again allocate a ZIL block, probably immediately after the previous
one. It writes the block and issues a flush. But in the meantime the head
has traveled some sectors down the track. To physically write the block, the
drive of course has to wait until the sector is under the head again, which
means waiting nearly one full rotation. If ZFS had chosen a block
appropriately further down the track, chances are good that the head would
not yet have passed it and could write without a big rotational delay.
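
To put rough numbers on this, here is a small back-of-the-envelope model in
Python. It is only a sketch of the argument above, not a description of what
ZFS or any real drive does. The names rotational_wait_ms, turnaround_ms and
gap_fraction are made up for illustration, and the model ignores the transfer
time of the block itself as well as any seek or track switch.

```python
def rotational_wait_ms(rpm, turnaround_ms, gap_fraction=0.0):
    """Time the head waits before it can start the next synchronous write.

    rpm           -- spindle speed of the (hypothetical) drive
    turnaround_ms -- assumed time between the end of one write and the
                     arrival of the next (flush ack, application, allocation)
    gap_fraction  -- assumed distance between the end of the previous block
                     and the start of the next, as a fraction of one track
                     (0.0 means the blocks are immediately adjacent)
    """
    rotation_ms = 60_000.0 / rpm
    # Fraction of a revolution the platter turned while nothing was written.
    passed = (turnaround_ms / rotation_ms) % 1.0
    # Angular distance left until the target sector is under the head again.
    remaining = (gap_fraction - passed) % 1.0
    return remaining * rotation_ms


def sync_ops_per_second(rpm, turnaround_ms, gap_fraction=0.0):
    """Synchronous writes per second in this simplified model."""
    if turnaround_ms <= 0:
        raise ValueError("turnaround_ms must be positive in this model")
    cycle_ms = turnaround_ms + rotational_wait_ms(rpm, turnaround_ms, gap_fraction)
    return 1000.0 / cycle_ms


if __name__ == "__main__":
    # Assumed example values: 7200 rpm drive, 0.1 ms turnaround.
    print(sync_ops_per_second(7200, 0.1))        # adjacent blocks
    print(sync_ops_per_second(7200, 0.1, 0.05))  # skip 5% of a track
```

With these assumed values the model gives about 120 synchronous writes per
second for adjacent blocks (one per rotation of a 7200 rpm disk) and about
2400 per second if each block is placed 5% of a track further down. Real
numbers will differ, but the order of magnitude shows why the placement of
the next ZIL block matters.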

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
