Well, I should weigh in here.

I have been using ZFS with an iSCSI backend and an NFS front end to my
clients. Until B41 (I'm not sure what fixed this) I was getting 20KB/sec
for RAIDZ and 200KB/sec for plain ZFS on large iSCSI LUNs
(non-RAIDZ) when receiving many small writes, such as untarring
a Linux or OpenSolaris tree, or, artificially, a copy of 6250 8k
files. It turned out that NFS would issue 3 fsyncs on each write, and
my performance degraded terribly from my normal 20MB+/sec writes to
the backend iSCSI storage. A parallel test using NetApp filers shows no
performance drop, but that's because of their NVRAM-backed storage,
and a test against the same iSCSI targets using Linux, XFS, and the
NFS server implementation there gave me 1.25MB/sec writes. I was about
to throw in the towel and deem ZFS/NFS unusable until B41 came
along and at least gave me 1.25MB/sec.
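For anyone who wants to reproduce the small-file workload, here is a rough
local sketch: write many 8K files, once buffered and once with an fsync per
file (GNU dd's conv=fsync). The path, file count, and dd syntax are my
choices, and this runs against local disk rather than NFS/iSCSI, so the
absolute numbers will differ wildly; the per-file fsync penalty should
still be visible.

```shell
#!/bin/sh
# Sketch of the small-file sync-write workload: many 8K files,
# buffered vs. fsync-per-file.  Local disk only; illustrative numbers.
mkdir -p /tmp/syncbench
cd /tmp/syncbench
N=500   # scaled down from the 6250 files mentioned above

# Pass 1: buffered writes (no sync requested per file)
time sh -c 'for i in $(seq 1 '$N'); do
    dd if=/dev/zero of=buffered.$i bs=8k count=1 2>/dev/null
done'

# Pass 2: one fsync per file (GNU dd conv=fsync), as NFS effectively forces
time sh -c 'for i in $(seq 1 '$N'); do
    dd if=/dev/zero of=synced.$i bs=8k count=1 conv=fsync 2>/dev/null
done'
```

The gap between the two timings is roughly the cost NFS's per-write syncs
impose on the backend.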

This option, with all its caveats, would be ideal on various
NFS-provided filesystems (large cache directories for cluster nodes,
tmp space from my pools, etc.) to get performance characteristics
similar to a NetApp filer. If I can provide that stable storage to
a high degree, or have only NVRAM-backed storage, this could be a big
win for ZFS, if for nothing else than the RFPs that require
benchmarks/bakeoffs against a NetApp showing it can perform just as
fast, caveats and all.


On 6/21/06, Olaf Manczak <[EMAIL PROTECTED]> wrote:
Neil,

I think it might be wise to look at this problem from the perspective
of an application (e.g. a simple database) designer taking into account
all the new things that Solaris ZFS provides.

In the case of ZFS the designer does not have to worry about consistency
of the on-disk file system format but only about "has my data been
committed to disk (or to NVRAM, if there is one)?". Depending on
the problem the designer is trying to address, the concern might be
either total write throughput, in which case the designer might love the
"deferred" option, or the ability to sync file data to stable storage
and the latency of that operation. Considering the flexibility of
file system creation in ZFS, I could imagine using multiple file
systems with different mount options for different types of files.
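That per-filesystem idea might look something like the following, using
the sync= option from RFE 6280630 quoted below. This is hypothetical
syntax, since the option was not integrated at the time of writing, and
the pool, device, and dataset names are placeholders:

```shell
# Hypothetical sketch: one pool, several filesystems, each tuned to its
# data via the proposed (not yet integrated) sync= option.
zpool create tank c0t0d0                    # placeholder device name
zfs create -o sync=standard tank/home       # normal POSIX sync semantics
zfs create -o sync=deferred tank/scratch    # throughput over durability
zfs create -o sync=forced   tank/db         # everything synchronous
```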

All in all, though, the question is whether a set of POSIX calls, with
the semantics defined through mount options, gives programmers
(or application designers) enough flexibility to address the most common
issues in high-level application scenarios in a simple and productive way.
If so, which of these different sync options are useful or needed?

-- Olaf

> It is similar in the sense that it speeds up the file system.
> Using fastfs can be much more dangerous, though, as it can lead
> to a badly corrupted file system: writing metadata is delayed
> and written out of order. Disabling the ZIL, by contrast, does not
> affect the integrity of the fs. The transaction group model of ZFS
> gives consistency in the event of a crash/power fail. However, any
> data that was promised to be on stable storage may not be unless the
> transaction group committed (an operation that is started every 5s).
>
> We once had plans to add a mount option to allow the admin
> to control the ZIL. Here's a brief section of the RFE (6280630):
>
>         sync={deferred,standard,forced}
>
>                 Controls synchronous semantics for the dataset.
>
>                 When set to 'standard' (the default), synchronous
>                 operations such as fsync(3C) behave precisely as
>                 defined in fcntl.h(3HEAD).
>
>                 When set to 'deferred', requests for synchronous
>                 semantics are ignored.  However, ZFS still guarantees
>                 that ordering is preserved -- that is, consecutive
>                 operations reach stable storage in order.  (If a
>                 thread performs operation A followed by operation B,
>                 then the moment that B reaches stable storage, A is
>                 guaranteed to be on stable storage as well.)  ZFS also
>                 guarantees that all operations will be scheduled for
>                 write to stable storage within a few seconds, so that
>                 an unexpected power loss only takes the last few
>                 seconds of change with it.
>
>                 When set to 'forced', all operations become
>                 synchronous.  No operation will return until all
>                 previous operations have been committed to stable
>                 storage.  This option can be useful if an application
>                 is found to depend on synchronous semantics without
>                 actually requesting them; otherwise, it will just make
>                 everything slow, and is not recommended.
>
> Of course we would need to stress the dangers of setting 'deferred'.
> What do you guys think?
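The throughput-versus-latency distinction in the RFE text can be felt with
plain dd (GNU syntax; the paths are illustrative): oflag=dsync pushes each
8K write to stable storage individually, roughly what 'forced' would do to
all writes, while conv=fsync syncs once at the end, which is close to the
cost profile 'deferred' leaves you with.

```shell
#!/bin/sh
# Per-write sync (like 'forced') vs. one sync at the end (close to what
# 'deferred' buys you).  GNU dd syntax; illustrative local-disk sketch.
time dd if=/dev/zero of=/tmp/dsync-test bs=8k count=200 oflag=dsync 2>/dev/null
time dd if=/dev/zero of=/tmp/fsync-test bs=8k count=200 conv=fsync 2>/dev/null
```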
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
