On Mon, 5 Feb 2007, Jim Dunham wrote:
Frank,
On Fri, 2 Feb 2007, Torrey McMahon wrote:
Jason J. W. Williams wrote:
Hi Jim,
Thank you very much for the heads up. Unfortunately, we need the
write-cache enabled for the application I was thinking of combining
this with. Sounds like SNDR and ZFS need some more soak time together
before you can use both to their full potential together?
Well... there is the fact that SNDR works with filesystems other than ZFS.
(Yes, I know this is the ZFS list.) Working around architectural issues
for ZFS and ZFS alone might cause issues for others.
SNDR has some issues with logging UFS as well. If you start a SNDR live
copy on an active logging UFS (not _writelocked_), the UFS log state may
not be copied consistently.
Treading "very" carefully, UFS logging may have issues with being replicated,
not the other way around. SNDR replication (after synchronizing) maintains a
write-order consistent volume, thus if there is an issue with UFS logging
being able to access an SNDR secondary, then UFS logging will also have
issues with accessing a volume after Solaris crashes. The end result of
Solaris crashing, or SNDR replication stopping, is a write-ordered,
crash-consistent volume.
Except that you're not getting user data consistency - because UFS logging
only does the write-ordered crash consistency for metadata.
In other words, it's possible with UFS logging to see metadata changes
(file growth/shrink, filling of holes in sparse files) that do not match
the file contents - AFTER crash recovery.
To get full consistency of data and metadata across crashes / replication
termination, with a replicator underneath, the filesystem needs a way of
telling the replicator "and now start/stop replicating please". In other
words, the filesystem needs a way to barrier the replicator.
I'm not saying SNDR isn't doing a good job. I'm just saying it could do a
perfect job if it integrated in this way with the filesystem on top. If
there were 'start/stop' hooks.
II is a different matter again. For some time it had (I don't know if
that's still true) a window where it would return EIO on writes while
enabling the image. Neither UFS logging nor ZFS very much likes being told
"this critical write of yours errored out".
FrankH.
Given that both UFS logging and SNDR are (near) perfect (or there would be a
flood of escalations), this issue, in all cases I've seen to date, is that the
SNDR primary volume being replicated is mounted with UFS logging enabled, but
the SNDR secondary is not mounted with UFS logging enabled. Once this
condition happens, the problem can be resolved by fixing /etc/vfstab to
correct the inconsistent mount options, and then performing an SNDR update
sync.
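To make the fix above concrete, here is a sketch of what it might look like.
The device names and mount point are hypothetical; the point is only that the
vfstab entries on the primary and secondary hosts must carry the same
"logging" mount option before the update sync is performed:

```shell
# /etc/vfstab -- make this line identical on BOTH primary and secondary
# hosts (device names here are placeholders for your actual SNDR volumes):
#
# /dev/dsk/c1t0d0s6  /dev/rdsk/c1t0d0s6  /export/data  ufs  2  yes  logging

# Then refresh the secondary with an SNDR update sync:
# (-u requests an update resync, -n skips the interactive confirmation)
sndradm -n -u
```

This only resynchronizes the blocks that changed while the sets were out of
sync, so it is much cheaper than a full sync.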
If you want a live remote replication facility, it _NEEDS_ to talk to the
filesystem somehow. There must be a callback mechanism that the filesystem
could use to tell the replicator "and from exactly now on you start
replicating". The only entity which can truly give this signal is the
filesystem itself.
There is an RFE against SNDR for something called "in-line PIT". I hope that
this work will get done soon.
And no, that's _not_ when the filesystem does a "flush write cache" ioctl. Or
when the user has just issued a "sync" command or similar.
For ZFS, it'd be when a ZIL transaction is closed (as I understand it), for
UFS it'd be when the UFS log is fully rolled. There's no notification to
external entities when these two events happen.
Because ZFS is always on-disk consistent, this is not an issue. So far in ALL
my testing with replicating ZFS with SNDR, I have not seen ZFS fail!
Of course, be careful not to confuse my stated position with another closely
related scenario: accessing ZFS on the remote node via a forced
import "zpool import -f <name>" while SNDR replication is still active, as ZFS
is sure to panic the system. ZFS, unlike other filesystems, has zero tolerance
for corrupted metadata.
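A safe way to access the secondary copy would look roughly like the sketch
below. The pool name "tank" is hypothetical; the key step is putting the SNDR
set into logging mode first, so the secondary volume stops changing underneath
ZFS before the import:

```shell
# Stop replication writes to the secondary by dropping the SNDR set into
# logging mode (-l); changes on the primary are tracked, not shipped:
sndradm -n -l

# Only now is it safe to force-import the pool on the secondary host:
zpool import -f tank

# When done, export the pool, then resume with an update sync -- which
# overwrites any changes made on the secondary from the primary's tracked
# bitmap:
zpool export tank
sndradm -n -u
</imports>
```

The forced import is still needed because the pool was last imported on the
primary host, but with replication quiesced the on-disk state is at least
crash-consistent and stable.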
Jim
SNDR tries its best to achieve this detection, but without actually
_stopping_ all I/O (on UFS: writelocking), there's a window of
vulnerability still open.
And SNDR/II don't stop filesystem I/O - by basic principle. That's how
they're sold/advertised/intended to be used.
I'm all willing to see SNDR/II go open - we could finally work these issues
out!
FrankH.
I think the best-of-both-worlds approach would be to let SNDR plug in to
ZFS along the same lines the crypto stuff will be able to plug in,
different compression types, etc. There once was a slide that showed how
that worked... or I'm hallucinating again.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss