On Mon, Feb 7, 2011 at 3:47 PM, Nico Williams <n...@cryptonector.com> wrote:
> On Mon, Feb 7, 2011 at 2:39 PM, Yi Zhang <yizhan...@gmail.com> wrote:
>> On Mon, Feb 7, 2011 at 2:54 PM, Nico Williams <n...@cryptonector.com> wrote:
>>> ZFS cannot not buffer.  The reason is that ZFS likes to batch transactions
>>> into as large a contiguous write to disk as possible.  The ZIL exists to
>>> support fsync(2) operations that must commit before the rest of a ZFS
>>> transaction.  In other words: there's always some amount of buffering of
>>> writes in ZFS.
>> In that case, ZFS doesn't suit my needs.
>
> Maybe.  See below.
>
>>> As for read buffering, why would you want to disable it?
>> My application manages its own buffer and reads/writes go through that
>> buffer first. I don't want double buffering.
>
> So your concern is that you don't want to pay twice the memory cost
> for buffering?
>
> If so, set primarycache as described earlier and drop the O_DSYNC flag.
>
> ZFS will then buffer your writes, but only for a little while, and you
> should want it to, because ZFS will almost certainly do a better job of
> batching transactions than your application would.  With ZFS you'll
> benefit from: advanced volume management, snapshots/clones, dedup, Merkle
> hash trees (i.e., corruption detection), encryption, and so on.  You'll
> almost certainly not be implementing any of those in your application...
>
>>> You still haven't told us what your application does.  Or why you want
>>> to get close to the metal.  Simply telling us that you need "no
>>> buffering" doesn't really help us help you -- with that approach you'll
>>> simply end up believing that ZFS is not appropriate for your needs, even
>>> though it well might be.
>> At a high level it's like Berkeley DB, though it doesn't require
>> transaction support, durability, etc. I'm measuring its performance
>> and don't want FS buffering to pollute my results (hence directio).
>
> You're still mixing directio and O_DSYNC.
>
> You should do three things: a) set primarycache=metadata, b) set recordsize to
> whatever your application's page size is (e.g., 8KB), c) stop using O_DSYNC.
>
> Tell us how that goes.  I suspect the performance will be much better.
>
> Nico
> --
>

This is actually what I did for 2.a) in my original post. My concern
there is that ZFS' internal write buffering makes it hard to get a
grip on my application's behavior. I want to present my application's
"raw" I/O performance without too many outside factors... UFS plus
directio gives me exactly (or close to) that, but ZFS doesn't...

Of course, in the final deployment, it would be great to be able to
take advantage of ZFS' advanced features such as I/O optimization.
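
To make the O_DSYNC question above concrete, here is a minimal sketch of
the kind of comparison being discussed: timing page-sized writes with and
without a synchronous-write flag. It is an illustration only, not the
benchmark from this thread. The names timed_page_writes, write_all and
PAGE_SIZE are hypothetical, the 8KB page size is just the example value
mentioned earlier, and the script is Unix-only. It does not bypass any
filesystem cache by itself; it only shows how much of the measured time
comes from forcing each write to stable storage.

```python
import os
import tempfile
import time

PAGE_SIZE = 8 * 1024  # assumption: the 8KB example page size from the thread

# O_DSYNC is not defined on every Unix; fall back to O_SYNC where it is missing.
SYNC_FLAG = getattr(os, "O_DSYNC", os.O_SYNC)


def write_all(fd, data):
    """Write all of `data` to `fd`, retrying on short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def timed_page_writes(path, pages, sync_each_write):
    """Write `pages` page-sized blocks to `path`; return elapsed seconds."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if sync_each_write:
        # With the sync flag, each write() returns only after the data
        # has reached stable storage.
        flags |= SYNC_FLAG
    block = b"\0" * PAGE_SIZE
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(pages):
            write_all(fd, block)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pages.dat")
        for sync in (False, True):
            elapsed = timed_page_writes(path, 256, sync)
            print(f"sync_each_write={sync}: {elapsed:.4f}s for 256 pages")
```

Pointing the path at a file on the dataset under test, rather than a
temporary directory, would be the obvious next step; the absolute numbers
will depend heavily on the pool and its settings.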
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
