>>>>> "jd" == Jim Dunham <[EMAIL PROTECTED]> writes:
jd> If at the time the SNDR replica is deleted the set was
jd> actively replicating, along with ZFS actively writing to the
jd> ZFS storage pool, I/O consistency will be lost, leaving ZFS
jd> storage pool in an indeterministic state on the remote node.
jd> To address this issue, prior to deleting the replicas, the
jd> replica should be placed into logging mode first.
What if you stop the replication by breaking the network connection
between primary and replica? Consistent or inconsistent?
It sounds fishy, like ``we're always-consistent-on-disk with ZFS, but
please use 'zpool offline' to avoid disastrous pool corruption.''
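For the record, the safe teardown Jim describes is roughly the
following; I'm going from memory of the sndradm man page and have left
the set names off, so check the syntax before trusting it:

    sndradm -l    # put the replica(s) into logging mode first
    sndradm -d    # only then delete the SNDR set(s)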
jd> ndr_ii. This is an automatic snapshot taken before
jd> resynchronization starts,
Yeah, that sounds fine, possibly better than DRBD in one way because
it might allow the resync to go faster.
From the PDFs it sounds like async replication isn't done the same
way as the resync: it's done safely, and it's even possible for async
replication to accumulate hours of backlog in a ``disk queue'' without
losing write ordering, so long as you use the ``blocking mode'' variant
of async.
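If I'm reading the guide right, that means enabling the set async with
a disk queue volume, something like the sketch below (hosts, devices,
and the queue volume are all invented, and the syntax is from memory,
so verify it; blocking is supposedly the queue's default mode anyway):

    sndradm -e pri /dev/rdsk/c1t0d0s0 /dev/rdsk/c1t0d0s1 \
               sec /dev/rdsk/c1t0d0s0 /dev/rdsk/c1t0d0s1 \
               ip async q /dev/rdsk/c2t0d0s0
    sndradm -D block    # make the disk queue blocking, if it isn't already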
ii might also be good for debugging a corrupt ZFS, so you can tinker
with it but still roll back to the original corrupt copy. I'll read
about it---I'm guessing I will need to prepare ahead of time if I want
ii available in the toolbox after a disaster.
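In case it helps anyone else: taking an independent point-in-time copy
is a single iiadm command, but the shadow and bitmap volumes have to
be sized and set aside beforehand (device names below are made up):

    # master -> shadow, tracked in bitmap; ``ind'' = independent copy
    iiadm -e ind /dev/rdsk/c1t0d0s0 /dev/rdsk/c4t0d0s0 /dev/rdsk/c4t0d0s1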
jd> AVS has the concept of I/O consistency groups, where all disks
jd> of a multi-volume filesystem (ZFS, QFS) or database (Oracle,
jd> Sybase) are kept write-order consistent when using either sync
jd> or async replication.
Awesome, so long as people know to use it. So I guess that's the
answer for the OP: use consistency groups!
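For a two-way mirror that would be two SNDR sets tagged with the same
group name, roughly like this (names invented, syntax from memory of
the admin guide, so double-check it):

    sndradm -e pri /dev/rdsk/c1t0d0s0 /dev/rdsk/c1t0d0s1 \
               sec /dev/rdsk/c1t0d0s0 /dev/rdsk/c1t0d0s1 ip async g zfsgroup
    sndradm -e pri /dev/rdsk/c1t1d0s0 /dev/rdsk/c1t1d0s1 \
               sec /dev/rdsk/c1t1d0s0 /dev/rdsk/c1t1d0s1 ip async g zfsgroup
    # group-wide operations then keep the pool write-order consistent, e.g.
    sndradm -g zfsgroup -l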
The one thing I worry about is that, before, AVS was used between the
RAID and the filesystem, which is impossible now because that
inter-layer area no longer exists. If you put the individual device
members of a redundant zpool vdev into an AVS consistency group, what
will AVS do when one of the devices fails?
Does it continue replicating the working devices and ignore the failed
one? This would sacrifice redundancy at the DR site. UFS-AVS-RAID
would not do that in the same situation.
Or does it hide the failed device from ZFS and slow things down by
sending all reads/writes of the failed device to the remote mirror?
This would slow down the primary site. UFS-AVS-RAID would not do that
in the same situation.
The latter ZFS-AVS behavior might be rescuable if ZFS had the
statistical read-preference feature, but writes would still be
massively slowed in this scenario, while in UFS-AVS-RAID they would
not be. To get back the level of control one used to have for writes,
you'd need a different zpool-level way to achieve the intent of the
AVS sync/async option. Maybe just a slog which is not AVS-replicated
would be enough, modulo other ZFS fixes for hiding slow devices.
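That is, something like the following, where only the main mirror
members sit on SNDR-replicated devices and the log device stays local
and unreplicated (device names are made up):

    zpool create tank mirror c1t0d0 c1t1d0 log c3t0d0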
