On 05/26/10 07:10, sensille wrote:
Recently, I've been reading through the ZIL/slog discussion and
have the impression that a lot of folks here are (like me)
interested in getting a viable solution for a cheap, fast and
reliable ZIL device.
I think I can provide such a solution for about $200, but it
involves a lot of development work.
The basic idea: the main problem with using an HDD as a ZIL device
is the cache flushes in combination with the linear write pattern
of the ZIL. This costs a whole rotation of the platter after
each write, because by the time the first write returns, the head is
already past the sector that will be written next.
My idea goes as follows: don't write linearly. Track the rotation
and write to the position the head will hit next. This might be done
by a re-mapping layer or integrated into ZFS. This works only because
ZIL devices are basically write-only. Reads from such a device will be
horribly slow.
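
A minimal sketch of the "write where the head will be next" idea, in
Python. It is not the driver described here, only an illustration
under loud assumptions: a single track, constant rotation speed, a
known timestamp at which sector 0 passed under the head, and a fixed
command overhead. The names next_sector, index_ns, rotation_ns and
overhead_ns are hypothetical. Real drives add zoning and track skew,
which such a model ignores.

```python
def next_sector(now_ns, index_ns, rotation_ns, sectors_per_track, overhead_ns):
    """Pick the first sector the head can still reach on this rotation.

    All times are integer nanoseconds (hypothetical model, see lead-in):
      index_ns    -- a time at which sector 0 was under the head
      rotation_ns -- duration of one full platter rotation
      overhead_ns -- delay between issuing the write and data hitting the platter
    """
    # Position within the current rotation at the moment the write can start.
    elapsed = (now_ns + overhead_ns - index_ns) % rotation_ns
    # Ceiling division: round up so we never target a sector that has
    # already begun passing under the head.
    sector = -(-elapsed * sectors_per_track // rotation_ns)
    # Wrap around to sector 0 at the end of the track.
    return sector % sectors_per_track


# A re-mapping layer would then record where each log block really landed.
remap = {}  # logical log block number -> physical sector (illustrative only)
remap[0] = next_sector(3_000_000, 0, 6_000_000, 1000, 600_000)
```

With a 6 ms rotation (10,000 rpm) and 1000 sectors per track, a write
issued half a rotation after the index mark, with 0.6 ms of overhead,
is steered to sector 600 instead of waiting for its linear successor
to come around again.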

I have done some testing and am quite enthusiastic. If I take a
decent SAS disk (like the Hitachi Ultrastar C10K300), I can raise
the synchronous write performance from 166 writes/s to about
2000 writes/s (!). 2000 IOPS is more than sufficient for our
production environment.
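
The 166 writes/s figure is consistent with one write per platter
rotation. A small Python check of that arithmetic, assuming the disk
spins at 10,000 rpm (an assumption inferred from the "C10K" model
name, not stated above); the function names are hypothetical:

```python
def writes_per_second_at_one_per_rotation(rpm):
    """Upper bound on sync writes/s if every write waits a full rotation."""
    return rpm / 60.0


def writes_per_rotation(writes_per_s, rpm):
    """How many writes fit into one rotation at a given write rate."""
    return writes_per_s / (rpm / 60.0)


RPM = 10_000  # assumed spindle speed
baseline = writes_per_second_at_one_per_rotation(RPM)  # about 166.7
per_rotation = writes_per_rotation(2000, RPM)          # about 12
```

Under that assumption, 2000 writes/s means roughly 12 writes land per
rotation rather than one.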

Currently I'm implementing a re-mapping driver for this. The
reason I'm writing to this list is that I'd like to find support
from the zfs team, find sparring partners to discuss implementation
details and algorithms and, most important, find testers!

If there is interest it would be great to build an official project
around it. I'd be willing to contribute most of the code, but any
help will be more than welcome.

So, anyone interested? :)

--
Arne Jansen


Yes, I agree this seems very appealing. I have investigated and
observed similar results. Just allocating larger intent log blocks but
only writing to, say, the first half of them has shown the same effect.
Despite the impressive results, we have not pursued this further, mainly
because of its maintainability. There is quite a variance between
drives so, as mentioned, feedback profiling of the device is needed
in the working system. The layering of the Solaris IO subsystem doesn't
provide the feedback necessary and the ZIL code is layered on the SPA/DMU.
Still it should be possible. Good luck!

Neil.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
