On Aug 23, 2009, at 12:11 AM, Tristan Ball <tristan.b...@leica-microsystems.com > wrote:



Ross Walker wrote:

[snip]



We turned up our X4540s, and this same tar unpack took over 17 minutes! We disabled the ZIL for testing, and we dropped this to under 1 minute. With the X25-E as a slog, we were able to run this test in 2-4 minutes, same as the old storage.

That's pretty impressive. So with an X25-E slog, ZFS is as fast synchronously as your previous hardware was asynchronously, but with no risk of data corruption. Of course the hardware is different, so it's not really apples to apples.

There was a thread not too long ago, either on the xfs mailing list or the mysql mailing list, that talked about the Intel X25-E and its onboard cache. The cache ignores flushes, but isn't persistent on power failure. Pulling the drive during a sync write caused data corruption. You can disable the write-back cache on these, but the performance is nowhere near as good with it disabled.

Here is the blog post:

http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/

-Ross

Hang on. Reading that, his initial results were 50 writes a second with the default xfs write barriers, which to me implies that the drive is honouring the cache flush. The fact that the write rate jumps so significantly when he turns off barriers, but continues with O_DIRECT and innodb_flush_log_at_trx_commit=1, to me just says that xfs is returning success on writes as soon as the data has been given to the drive, not when the drive has flushed its cache to make it persistent. Given that we told xfs to turn off write barriers, isn't it doing what it's told? Why are we expecting data to be consistent across power loss or device removal?

Couldn't this just be XFS only actually requesting cache flushes when barriers are enabled?

I think it's more an illustration that write barriers on Linux need a little work; even with flushes it should do a lot better than 50 IOPS.
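
For anyone who wants to see what their own setup does, here is a rough sketch in Python (not taken from the blog post) that times small writes with an fsync after each one. The function name, file name, write count and write size are all made up for illustration. Whether the fsync actually reaches the drive's cache depends on the filesystem, its barrier setting and the drive firmware, so treat the number as a hint rather than proof of persistence.

```python
import os
import tempfile
import time


def measure_sync_write_rate(path, count=200, size=4096):
    """Return fsync'd writes per second for `count` writes of `size` bytes.

    Hypothetical helper for illustration; the defaults are arbitrary.
    """
    payload = b"\0" * size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            # Ask the OS to push this write to stable storage before the
            # next one. Whether the drive's own write cache is flushed too
            # is up to the filesystem, its barrier setting and the drive.
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    # Guard against a zero interval on very coarse clocks.
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Runs against a temporary directory; point the path at the filesystem
    # under test to get a meaningful figure.
    with tempfile.TemporaryDirectory() as d:
        rate = measure_sync_write_rate(os.path.join(d, "synctest.bin"))
        print(f"{rate:.0f} fsync'd writes/sec")
```

A rate far above what the medium can physically commit suggests a flush is being acknowledged from a volatile cache somewhere along the path.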

O_DIRECT does just that: with or without barriers, it flushes on each write, with an ever so slight delay to allow the queue to coalesce writes.

A barrier is more to enforce order and persistence when IO is async.

-Ross
