On Mon, 5 Feb 2007, Jim Dunham wrote:

Frank,
On Fri, 2 Feb 2007, Torrey McMahon wrote:

Jason J. W. Williams wrote:
Hi Jim,

Thank you very much for the heads up. Unfortunately, we need the
write-cache enabled for the application I was thinking of combining
this with. Sounds like SNDR and ZFS need some more soak time together
before you can use both to their full potential together?

Well... there is the fact that SNDR works with filesystems other than ZFS. (Yes, I know this is the ZFS list.) Working around architectural issues for ZFS and ZFS alone might cause issues for the others.

SNDR has some issues with logging UFS as well. If you start an SNDR live copy on an active logging UFS (one that is not _writelocked_), the UFS log state may not be copied consistently.

Treading "very" carefully: UFS logging may have issues with being replicated, not the other way around. SNDR replication (after synchronizing) maintains a write-order-consistent volume. So if UFS logging has an issue accessing an SNDR secondary, then UFS logging will also have issues accessing a volume after Solaris crashes. The end result of Solaris crashing, or SNDR replication stopping, is the same: a write-ordered, crash-consistent volume.

Except that you're not getting user-data consistency, because UFS logging only provides write-ordered crash consistency for metadata.

In other words, with UFS logging it's possible, AFTER crash recovery, to see metadata changes (file growth/shrink, filling of holes in sparse files) that do not match the file contents.

To get full consistency of data and metadata across crashes / replication termination with a replicator underneath, the filesystem needs a way of telling the replicator "and now start/stop replicating, please" - a way for the filesystem to issue a barrier.

I'm not saying SNDR isn't doing a good job. I'm just saying it could do a perfect job if it integrated in this way with the filesystem on top - if there were 'start/stop' hooks.

II (Instant Image, the point-in-time copy facility) is a different matter again. It had, for some time - I don't know if that's still true - a window where it would return EIO on writes while the image was being enabled. Neither UFS logging nor ZFS takes kindly to being told "this critical write of yours errored out".

FrankH.


Given that both UFS logging and SNDR are (near) perfect (or there would be a flood of escalations), in all cases I've seen to date this issue is that the SNDR primary volume being replicated is mounted with UFS logging enabled, but the SNDR secondary is not. Once this condition happens, the problem can be resolved by fixing /etc/vfstab to correct the inconsistent mount options, and then performing an SNDR update sync.
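As an illustration of the mismatch described above (the device names and mount point here are hypothetical), the /etc/vfstab entry on the primary and the secondary must agree on the logging option:

```
# /etc/vfstab - this line must match on BOTH the primary and the
# secondary node; a secondary mounted without "logging" is the
# failure mode described above.
#device             device to fsck       mount    FS   fsck  mount    mount
#to mount                                point    type pass  at boot  options
/dev/dsk/c1t0d0s0   /dev/rdsk/c1t0d0s0   /export  ufs  2     yes      logging
```

After correcting the entry on the mismatched node, an SNDR update sync brings the secondary back in line.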


If you want a live remote replication facility, it _NEEDS_ to talk to the filesystem somehow. There must be a callback mechanism the filesystem can use to tell the replicator "from exactly now on, you start replicating". The only entity which can truly give this signal is the filesystem itself.

There is an RFE against SNDR for something called "in-line PIT". I hope that work gets done soon.


And no, that's _not_ when the filesystem does a "flush write cache" ioctl, or when the user has just issued a "sync" command or similar. For ZFS, it'd be when a ZIL transaction is closed (as I understand it); for UFS, it'd be when the UFS log is fully rolled. There's no notification to external entities when either of these events happens.

Because ZFS is always on-disk consistent, this is not an issue. So far, in ALL my testing of replicating ZFS with SNDR, I have not seen ZFS fail!

Of course, be careful not to confuse my stated position with another, closely related scenario: accessing ZFS on the remote node via a forced import ("zpool import -f <name>") while SNDR replication is active, as ZFS is then sure to panic the system. ZFS, unlike other filesystems, has zero tolerance for corrupted metadata.

Jim


SNDR tries its best to achieve this detection, but without actually _stopping_ all I/O (on UFS: writelocking), a window of vulnerability remains open. And SNDR/II don't stop filesystem I/O - as a basic principle; that's how they're sold, advertised, and intended to be used.

I'm more than willing to see SNDR/II go open source - then we could finally work out these issues!

FrankH.


I think the best-of-both-worlds approach would be to let SNDR plug in to ZFS along the same lines that the crypto stuff will be able to plug in, different compression types, etc. There once was a slide that showed how that worked... or I'm hallucinating again.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
