Re: [zfs-discuss] remote replication with huge data using zfs?

Jeff Bonwick Thu, 11 May 2006 17:01:47 -0700

> plan A. To mirror on iSCSI devices:
>         keep one server with a set of zfs file systems
>         with 2 (sub)mirrors each, one of the mirrors use 
>         devices physically on remote site accessed as 
>         iSCSI LUNs.
> 
>         How does ZFS handle remote replication?
>         If the Internet link is down for hours or days, 
>         can the file systems still be written? Will
>         the submirrors be resync'ed efficiently?


This would work.  If the link goes down, it's no different than
if someone trips over the cable for a local disk.  All writes
(and reads) will go to the local disk until the remote one returns.

When the remote disk returns, we'll resilver it.  The cool thing here
is that ZFS resilvering is logical, not physical, so it'll only copy
the blocks that changed during the outage (i.e., it'll be fast).
For a bit more detail on how ZFS mirroring works, see:

        http://blogs.sun.com/roller/page/bonwick?entry=smokin_mirrors
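
To make "logical, not physical" a bit more concrete, here is a toy Python model, not ZFS code. It assumes each block records the transaction group (txg) in which it was written and that the pool remembers which txgs the remote side missed. The names `Block` and `blocks_to_resilver`, and all the txg numbers, are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    block_id: int
    birth_txg: int  # transaction group in which this block was written


def blocks_to_resilver(blocks, missed_first_txg, missed_last_txg):
    """Return only the blocks written while the remote side was unreachable."""
    return [b for b in blocks if missed_first_txg <= b.birth_txg <= missed_last_txg]


# Hypothetical pool: 1000 blocks written before the outage (txg 1..100)
# and 5 blocks written during the outage (txg 101..103).
pool = [Block(i, 1 + i % 100) for i in range(1000)]
pool += [Block(1000 + i, 101 + i % 3) for i in range(5)]

to_copy = blocks_to_resilver(pool, missed_first_txg=101, missed_last_txg=103)
print(f"{len(to_copy)} of {len(pool)} blocks need to be copied")
```

A physical resync would copy every block on the device regardless of when it was written; in this model only the blocks born during the outage are selected.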

The one hesitation I'd have about Plan A is that ZFS doesn't yet
support the notion of two sides of a mirror being very different in
performance.  With a local/remote pair, you really want different
semantics than a pair of local disks.  You want to send all reads
to the local disk, and you want to consider a write complete when
the local disk is done (and let the remote write be asynchronous).
We're planning to do this soon, but it's not there yet.

> Plan B. To use ZFS incremental snapshot backup/restore on a 
>         pair of servers to sync 2 copies of the same data over
>         the Internet, once say, every 10 or 60 min.

This is a better approach, for several reasons.  It will generally
be faster than remote mirroring because most 'churn' (creation and
deletion of short-lived files) will never be sent over the wire. 
It allows you to have fault tolerance like RAID-Z on the local disks.
It allows you to have arbitrarily different hardware at the local
and remote sites (e.g. you could have a SPARC system with a pool of
RAID-Z disks locally, and an Opteron system with mirrored disks
at the remote site).
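
As a rough sketch of what one Plan B sync cycle could look like, the Python below only builds the command lines and does not execute them. It uses the `zfs snapshot`, `zfs send -i`, and `zfs receive` commands from current ZFS releases (the question above calls the operation backup/restore). The dataset names, snapshot names, and host name are hypothetical, and the helper functions are invented for this example.

```python
import shlex


def incremental_sync_commands(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Build the three command lines for one incremental sync (nothing is executed)."""
    # Take a new snapshot on the local side.
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    # Stream only the changes between the previous and the new snapshot.
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # Apply that stream to the copy on the remote server.
    receive = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    return snapshot, send, receive


def as_shell(snapshot, send, receive):
    """Render the commands as a single shell line, e.g. for a cron job."""
    return f"{shlex.join(snapshot)} && {shlex.join(send)} | {shlex.join(receive)}"


# Hypothetical names throughout.
cmds = incremental_sync_commands(
    "tank/home", "sync-1", "sync-2", "backup.example.com", "backup/home"
)
print(as_shell(*cmds))
```

This assumes the remote side already holds the `sync-1` snapshot from an earlier full send, since an incremental stream can only be applied on top of the snapshot it was generated from.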

Plan B is also more flexible because it's acting as a file server
rather than as a dumb LUN.  This means you can do things like have
several different sites pushing changes to a single remote server,
or arrange to have several different sites back each other up (e.g.
the LA office sends incrementals to NY, and the NY office sends
incrementals to LA).

Generating incrementals is *very* fast in ZFS.  The time it takes
to send an incremental is proportional to the amount of data changed,
no matter how much *unchanged* data there is.  Note that this is very
different than most incremental backup tools, which have to traverse
*all* of the metadata to find what's changed.  It can take hours to
discover that a single block changed.  For ZFS it's instant.
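
Here is a toy Python model of why finding the changes can cost so little. It is a sketch under a stated assumption, not the actual ZFS traversal code: because writes are copy-on-write, changing a block also rewrites every block above it in the tree, so a parent's birth txg is never older than its children's. Any subtree whose root was born at or before the previous snapshot can then be skipped without looking inside it. The names `Node`, `build_tree`, `modify_leaf`, and `changed_leaves` are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    birth_txg: int
    children: list = field(default_factory=list)


def build_tree(depth, fanout, birth_txg):
    """Build a complete tree whose blocks were all written in birth_txg."""
    if depth == 0:
        return Node(birth_txg)
    return Node(birth_txg, [build_tree(depth - 1, fanout, birth_txg) for _ in range(fanout)])


def modify_leaf(root, path, txg):
    """Model a copy-on-write update: the leaf and every ancestor get the new txg."""
    node = root
    node.birth_txg = txg
    for index in path:
        node = node.children[index]
        node.birth_txg = txg


def changed_leaves(node, since_txg, stats):
    """Collect leaves born after since_txg, skipping unchanged subtrees."""
    stats["visited"] += 1
    if node.birth_txg <= since_txg:
        return []  # nothing below this block is newer than the snapshot
    if not node.children:
        return [node]
    found = []
    for child in node.children:
        found.extend(changed_leaves(child, since_txg, stats))
    return found


root = build_tree(depth=4, fanout=8, birth_txg=10)  # 4096 leaves, 4681 blocks in total
modify_leaf(root, path=[3, 1, 4, 1], txg=11)        # one leaf changes after txg 10

stats = {"visited": 0}
changed = changed_leaves(root, since_txg=10, stats=stats)
print(len(changed), "changed leaf found after visiting", stats["visited"], "blocks")
```

In this model the search visits 33 of the 4681 blocks to find the single changed leaf. A tool that has to walk all of the metadata would visit every one of them.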

Jeff
