On Mon, 5 Feb 2007, Jim Dunham wrote:

Frank,
On Fri, 2 Feb 2007, Torrey McMahon wrote:

Jason J. W. Williams wrote:
Hi Jim,

Thank you very much for the heads up. Unfortunately, we need the
write-cache enabled for the application I was thinking of combining
this with. Sounds like SNDR and ZFS need some more soak time together
before you can use both to their full potential together?

Well... there is the fact that SNDR works with filesystems other than ZFS. (Yes, I know this is the ZFS list.) Working around architectural issues for ZFS and ZFS alone might cause issues for the others.

SNDR has some issues with logging UFS as well. If you start an SNDR live copy on an active logging UFS (one that is not _writelocked_), the UFS log state may not be copied consistently.

Treading "very" carefully: UFS logging may have issues with being replicated, not the other way around. SNDR replication (after synchronizing) maintains a write-order-consistent volume. So if UFS logging has an issue accessing an SNDR secondary, then UFS logging will also have issues accessing a volume after Solaris crashes. The end result of Solaris crashing, or SNDR replication stopping, is the same: a write-ordered, crash-consistent volume.

Except that you're not getting user-data consistency, because UFS logging only provides write-ordered crash consistency for metadata.

In other words, with UFS logging it's possible, AFTER crash recovery, to see metadata changes (file growth/shrink, filling of holes in sparse files) that do not match the file contents.

To get full consistency of data and metadata across crashes / replication termination with a replicator underneath, the filesystem needs a way of telling the replicator "and now start/stop replicating, please" - a way for the filesystem to issue a barrier.

I'm not saying SNDR isn't doing a good job. I'm just saying it could do a perfect job if it integrated in this way with the filesystem on top - if there were 'start/stop' hooks.

II (Instant Image, the point-in-time copy facility) is a different matter again. It had, for some time - I don't know if that's still true - a window where it would return EIO on writes while the image was being enabled. Neither UFS logging nor ZFS takes kindly to being told "this critical write of yours errored out".

FrankH.


Given that both UFS logging and SNDR are (near) perfect (or there would be a flood of escalations), in all cases I've seen to date this issue is that the SNDR primary volume being replicated is mounted with UFS logging enabled, but the SNDR secondary is not. Once this condition happens, the problem can be resolved by fixing /etc/vfstab to correct the inconsistent mount options, and then performing an SNDR update sync.
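As an illustration of the mismatch described above (the device names and mount point here are hypothetical), the /etc/vfstab entry on the primary and the secondary must agree on the logging option:

```
# /etc/vfstab - this line must match on BOTH the primary and the
# secondary node; a secondary mounted without "logging" is the
# failure mode described above.
#device             device to fsck       mount    FS   fsck  mount    mount
#to mount                                point    type pass  at boot  options
/dev/dsk/c1t0d0s0   /dev/rdsk/c1t0d0s0   /export  ufs  2     yes      logging
```

After correcting the entry on the mismatched node, an SNDR update sync brings the secondary back in line.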


If you want a live remote replication facility, it _NEEDS_ to talk to the filesystem somehow. There must be a callback mechanism the filesystem can use to tell the replicator "from exactly now on, you start replicating". The only entity which can truly give this signal is the filesystem itself.

There is an RFE against SNDR for something called "in-line PIT". I hope that work gets done soon.


And no, that's _not_ when the filesystem does a "flush write cache" ioctl, or when the user has just issued a "sync" command or similar. For ZFS, it'd be when a ZIL transaction is closed (as I understand it); for UFS, it'd be when the UFS log is fully rolled. There's no notification to external entities when either of these events happens.

Because ZFS is always on-disk consistent, this is not an issue. So far, in ALL my testing of replicating ZFS with SNDR, I have not seen ZFS fail!

Of course, be careful not to confuse my stated position with another, closely related scenario: accessing ZFS on the remote node via a forced import ("zpool import -f <name>") while SNDR replication is active, as ZFS is then sure to panic the system. ZFS, unlike other filesystems, has zero tolerance for corrupted metadata.

Jim


SNDR tries its best to achieve this detection, but without actually _stopping_ all I/O (on UFS: writelocking), a window of vulnerability remains open. And SNDR/II don't stop filesystem I/O - as a basic principle; that's how they're sold, advertised, and intended to be used.

I'm more than willing to see SNDR/II go open source - then we could finally work out these issues!

FrankH.


I think the best-of-both-worlds approach would be to let SNDR plug in to ZFS along the same lines that the crypto stuff will be able to plug in, different compression types, etc. There once was a slide that showed how that worked... or I'm hallucinating again.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
