>>>>> "ps" == Peter Schuller <peter.schul...@infidyne.com> writes:

    ps> A test I did was to write a minimalistic program that simply
    ps> appended one block (8k in this case), fsync():ing in between,
    ps> timing each fsync().

were you the one that suggested writing backwards to make the
difference bigger?  I guess you found that trick unnecessary---speeds
differed enough when writing forwards?
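
(For concreteness, something like the minimal sketch below is what I picture that test looking like; the 8k block size, the loop count, and the file name are just placeholders I made up.)

#!/usr/bin/env python
# Minimal sketch of the append-and-fsync latency test described above.
# Assumptions: 8 KiB blocks, a scratch file on the pool under test, and
# per-fsync wall-clock time as the number we care about.
import os, sys, time

path = sys.argv[1] if len(sys.argv) > 1 else "fsync-test.dat"
block = b"\0" * 8192                      # one 8k block per iteration
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
try:
    for i in range(100):
        os.write(fd, block)               # append one block
        t0 = time.time()
        os.fsync(fd)                      # force it to stable storage
        print("fsync %3d: %.3f ms" % (i, (time.time() - t0) * 1000.0))
finally:
    os.close(fd)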

    ps> * Write-back caching on the RAID controller (lowest latency).

Did you find a good way to disable write-back caching so you could
distinguish between the other two cases?

like, I thought there was some type of SYNCHRONIZE CACHE command with a
certain flag bit set that demands a flush to disk, not just to NVRAM,
and that years ago ZFS was mistakenly sending this overly aggressive
command instead of the normal ``just make it persistent'' sync, so
there was that stale best-practice advice to lobotomize the array by
ordering it to treat the two commands as equivalent.

Maybe it would be possible to send that old SYNC command on purpose.
Then you could start the tool by comparing speeds of the to-disk SYNC
and the normal NVRAM-allowed SYNC: if they're the same speed and oddly
fast, then you know the array controller is lobotomized, and the
second half of the test is thus invalid.  If they're different speeds,
then you can trust that the second half is actually testing the disks,
so long as you send the old SYNC.  If they're the same speed but slow,
then you don't have NVRAM.
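
The classification logic itself is trivial; the hard part is issuing the
to-disk flavour of SYNCHRONIZE CACHE at all, which I've left out of the
sketch below, and the thresholds are guesses, not measured values:

# Sketch of the proposed two-SYNC comparison.  The two latency inputs are
# assumed to come from timing loops like the fsync test above; actually
# issuing the to-disk flavour of SYNCHRONIZE CACHE (versus the normal
# flush) is platform/driver specific and not shown here.  The thresholds
# are placeholders.

ODDLY_FAST_MS = 1.0   # placeholder: anything this quick can't be platters
SAME_RATIO = 2.0      # placeholder: within 2x counts as "the same speed"

def classify(todisk_ms, normal_ms):
    same = max(todisk_ms, normal_ms) <= SAME_RATIO * min(todisk_ms, normal_ms)
    if same and todisk_ms < ODDLY_FAST_MS:
        return "lobotomized: controller ignores flushes; second half of test invalid"
    if not same:
        return "NVRAM present, flushes honoured: to-disk SYNC really tests the disks"
    return "no NVRAM: both kinds of flush go all the way to the platters"

# example numbers only:
print(classify(todisk_ms=0.2, normal_ms=0.2))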

    ps> you could write an ever increasing sequence of values to
    ps> deterministic but pseudo-random pages in some larger file,
    ps> such that you can, after a powerfail test, read them back in
    ps> and test the sequence of numbers (after sorting it) for the
    ps> existence of holes.

yeah, the perl script I linked to requires a ``server'' which is not
rebooted and a ``client'' which is rebooted during the test, and the
client checks its behavior against the server.  I think the server
should be unnecessary---the script should be able to work out for
itself, in the check phase, what it would have written.  I guess the
original script author is thinking more of the SYNC command and less of
the write barrier, but in terms of losing pools or _corrupting_
databases, it's really only barriers that matter, and SYNC matters only
because it's also an implicit barrier; it doesn't matter exactly when
it returns.

so....I guess you would need the listening server to test that SYNC is
not returning early: for example, if you want to detect that someone
has disabled the ZIL, or if you have an n-tier database system with
retries at higher tiers, or a system that's distributed or doing
replication, then you do care when SYNC returns and need the
not-rebooted listening server.  But you should be able to make a
serverless tool just to check write barriers, and thus
corruption-proofness.
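
A serverless version could look roughly like the sketch below.  The
file name, page count, page size, and seed are arbitrary; I do one pass
over a shuffled page order so that each sequence number lands in its
own page, which keeps the hole check simple:

#!/usr/bin/env python
# Rough sketch of a serverless barrier checker: run "write" until you cut
# the power, then run "check" after reboot.  One pass over a shuffled page
# order means every sequence number lands in its own page, so the numbers
# found afterwards should be 1..N with no holes; a hole below the maximum
# means a later write reached the platter while an earlier, already
# fsync()ed one did not.
import os, struct, random, sys

PATH, NPAGES, PAGESIZE, SEED = "barrier-test.dat", 4096, 8192, 1234

def write_phase():
    rng = random.Random(SEED)
    pages = list(range(NPAGES))
    rng.shuffle(pages)                    # deterministic pseudo-random page order
    fd = os.open(PATH, os.O_RDWR | os.O_CREAT, 0o600)
    os.ftruncate(fd, NPAGES * PAGESIZE)
    for seq, page in enumerate(pages, 1):
        os.lseek(fd, page * PAGESIZE, os.SEEK_SET)
        os.write(fd, struct.pack("<Q", seq).ljust(PAGESIZE, b"\0"))
        os.fsync(fd)                      # each write is supposedly durable before the next
        print("wrote seq %d" % seq)

def check_phase():
    seen = set()
    with open(PATH, "rb") as f:
        for _ in range(NPAGES):
            buf = f.read(PAGESIZE)
            if len(buf) < 8:
                break                     # file shorter than expected; stop
            val = struct.unpack("<Q", buf[:8])[0]
            if val:
                seen.add(val)
    highest = max(seen) if seen else 0
    holes = sorted(set(range(1, highest + 1)) - seen)
    print("highest seq: %d  holes: %s" % (highest, holes if holes else "none"))

if __name__ == "__main__":
    write_phase() if sys.argv[1:] == ["write"] else check_phase()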
