The short answer is "no". The longer answer is "it depends". The most
concise discussion I've seen is Inktank's Multi-site option whitepaper:
http://info.inktank.com/multisite_options_with_inktank_ceph_enterprise

That whitepaper only addresses RBD backups (using snapshots) and
RadosGW backups (using RadosGW replication). The first option in the
whitepaper, a single cluster in multiple locations, isn't a backup.
I'm not aware of any backup or offsite capability for raw RADOS pools.
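
For the RBD case, a snapshot-based backup could look roughly like the sketch below. This is only an illustration under assumptions, not the procedure from the whitepaper: it builds `rbd snap create` and `rbd export-diff` command lines without running them, and the function name, pool, image, snapshot, and file names are all hypothetical.

```python
from typing import List, Optional


def rbd_backup_commands(pool: str, image: str, new_snap: str,
                        prev_snap: Optional[str],
                        out_file: str) -> List[List[str]]:
    """Build (but do not run) the rbd commands for one snapshot backup.

    All names passed in are placeholders chosen by the caller.
    """
    spec = f"{pool}/{image}@{new_snap}"
    # Take a point-in-time (crash-consistent) snapshot of the image.
    cmds = [["rbd", "snap", "create", spec]]

    export = ["rbd", "export-diff"]
    if prev_snap is not None:
        # Incremental: only export data changed since the previous snapshot.
        export += ["--from-snap", prev_snap]
    export += [spec, out_file]
    cmds.append(export)
    return cmds


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in rbd_backup_commands("rbd", "vm-disk-1", "backup-2",
                                   "backup-1", "vm-disk-1.diff"):
        print(" ".join(cmd))
```

The exported diff can then be shipped offsite and applied to a copy of the image with `rbd import-diff`. The snapshot is only crash-consistent; anything cached inside a running guest still has to be flushed by the guest itself.
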

There really aren't any good options for backing up CephFS. You could
use rsync on CephFS, but it's not going to scale well. rsync to offsite
locations begins to have problems around the TB size, give or take an
order of magnitude. The exact point depends on your bandwidth, latency,
file count, average file size, average file churn, and disk I/O on both
sides. It takes a lot of time and disk I/O to enumerate all the files
on the filesystem and compare them to the offsite copy.

CephFS does have some nice features that could make for an efficient
backup. If rsync (or any backup client) were aware of the way CephFS
handles directory size and timestamp, it could prune the directory tree
enumeration much more efficiently. That should scale well to much
larger file systems, limited mostly by file churn and churn locality. I
don't know of anybody who's working on that. I'm interested in the
concept, but I have no plans (personal or professional) to use CephFS.
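
The pruning idea above can be sketched as follows. This is a minimal sketch under assumptions, not an existing tool: it assumes a Linux client with CephFS mounted, and that the recursive change time of a directory is readable as the virtual extended attribute `ceph.dir.rctime` in `seconds.nanoseconds` form. The names `read_rctime` and `changed_dirs` are hypothetical.

```python
import os
from typing import Callable, Iterator


def read_rctime(path: str) -> int:
    """Return a directory's recursive ctime in whole seconds.

    Assumption: the path is on a CephFS mount that exposes the
    'ceph.dir.rctime' virtual xattr as 'seconds.nanoseconds'.
    """
    raw = os.getxattr(path, "ceph.dir.rctime").decode()
    return int(raw.split(".", 1)[0])


def changed_dirs(root: str, last_backup: int,
                 rctime: Callable[[str], int] = read_rctime) -> Iterator[str]:
    """Yield directories whose subtree changed after last_backup.

    A subtree whose recursive ctime is not newer than the last backup
    is skipped without being enumerated at all.
    """
    if rctime(root) <= last_backup:
        return  # nothing below this directory changed; prune it
    yield root
    with os.scandir(root) as entries:
        subdirs = sorted(e.path for e in entries
                         if e.is_dir(follow_symlinks=False))
    for sub in subdirs:
        yield from changed_dirs(sub, last_backup, rctime)
```

A backup client would then compare files only inside the yielded directories, so the cost tracks how much changed and where, not the total file count. Treat it as a starting point; the `rctime` parameter is there so the traversal can be exercised without a CephFS mount.
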

I'm currently working on adding snapshot capabilities to RadosGW.
Combined with replication, snapshots can protect against disasters,
PEBKAC, and application errors. Replication alone protects only against
disasters, not against PEBKAC or application errors, just as RAID
protects against disk failure but not file deletion.

Replication + snapshots (for both RadosGW and RBD) don't protect against
a determined attacker. Even tape is vulnerable to a determined attacker
with a high level of access in your organization. The trick with both
offline backups and remote snapshots is to set up enough barriers and
checks that things get noticed before a determined attacker can finish
the job. That's easier to do with offline backups than online backups.

*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com

On 4/2/14 00:08, Robert Sander wrote:
Hi,
What are the options to consistently back up and restore
data from a Ceph cluster?
- RBDs can be snapshotted.
- Data on RBDs used inside VMs can be backed up using tools from the guest.
- CephFS data can be backed up using rsync or similar tools.
What about object data in other pools?
There are two scenarios where a backup is needed:
- disaster recovery, i.e. the whole cluster goes nuts
- single item restore, because PEBKAC or application error
Is there any work in progress to cover these?
Regards
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com