Hello Anton,

Friday, April 20, 2007, 3:54:52 PM, you wrote:

ABR> To clarify, there are at least two issues with remote
ABR> replication vs. backups in my mind. (Feel free to joke about the
ABR> state of my mind!  ;-)

ABR> The first, which as you point out can be alleviated with
ABR> snapshots, is the ability to "go back" in time. If an accident
ABR> wipes out a file, the missing file will shortly be deleted on the
ABR> remote end. Snapshots help you here ... as long as you can keep
ABR> sufficient space online. If your turnover is 1 TB/day and you
ABR> require the ability to go back to the end of any week in the past
ABR> year, that's 52 TB.

It really depends. With ZFS, for snapshots to consume 1TB you would
have to delete 1TB of files or make 1TB of modifications to existing
files (or both, adding up to 1TB in total). There certainly are such
workloads. But if you only add new data (append to files, or write new
files) then snapshots will consume practically no storage. In that
case it works perfectly.
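
To make the accounting above concrete, here is a rough sketch in
Python. It is only a model of the argument, not how ZFS itself reports
space: the function name and parameters are made up for illustration,
and it ignores metadata overhead and the partially filled last block
that an append may rewrite.

```python
TB = 10**12  # decimal terabyte, assumed here for simplicity


def snapshot_space_held(deleted_bytes, overwritten_bytes,
                        appended_bytes=0, new_file_bytes=0):
    """Approximate space kept alive only because a snapshot exists.

    Hypothetical model: a snapshot pins old blocks that the live file
    system has since deleted or overwritten. Appended data and new
    files are referenced by the live file system anyway, so they are
    accepted as arguments but add nothing to the result.
    """
    return deleted_bytes + overwritten_bytes


# Delete 0.5 TB and overwrite 0.5 TB: the snapshot holds about 1 TB.
churn_case = snapshot_space_held(TB // 2, TB // 2)

# Write 1 TB of purely new data: the snapshot holds practically nothing.
append_case = snapshot_space_held(0, 0, appended_bytes=TB)

print(churn_case / TB, append_case / TB)
```

So the yearly snapshot cost is driven by how much existing data is
deleted or rewritten, not by how much new data arrives.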


ABR> The second is protection against file system failures. If a bug
ABR> in file system code, or damage to the metadata structures on
ABR> disk, results in the master being unreadable, then it could
ABR> easily be replicated to the remote system. (Consider a bug which
ABR> manifests itself only when 10^9 files have been created; both
ABR> file systems will shortly fail.) Keeping backups in a file system
ABR> independent manner (e.g. tar format, netbackup format, etc.)
ABR> protects against this.

Let's say I agree. :)


ABR> If you're not concerned about the latter, and you can afford to
ABR> keep all of your backups on rotating rust (and have sufficient
ABR> CPU & I/O bandwidth at the remote site to scrub those backups),
ABR> and have sufficient bandwidth to actually move data between sites
ABR> (for 1 TB/day, assuming continuous modification, that's 11
ABR> MB/second if data is never rewritten during the day, or
ABR> potentially much more in a real environment) then remote
ABR> replication could work.

You need exactly the same bandwidth as with any other classical backup
solution. It doesn't matter how you do it: in the end you need to copy
all that (differential) data off the box, regardless of whether it
goes to tape or to disk.

However, instead of doing the backup during the night, which you want
to do in order to limit the impact on production performance, with
replication you can do it continuously, 24x7. The actual performance
impact will be minimal, as you should get most of the data from memory
without touching the disks much on the sending side. That also means
you actually need much less throughput available to the remote side.
Also, with frequent enough snapshotting you have a backup basically
every 30 minutes or every hour.
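
The throughput point can be checked with a little arithmetic. The
sketch below is in Python and assumes a decimal terabyte, data that is
never rewritten during the day, and an 8-hour nightly backup window;
the window length and the function name are assumptions picked for
illustration, not figures from this thread.

```python
SECONDS_PER_HOUR = 3600


def required_throughput_mb_s(bytes_per_day, window_hours):
    """Average MB/s needed to move one day's changes within the window."""
    return bytes_per_day / (window_hours * SECONDS_PER_HOUR) / 1e6


daily_change = 10**12  # 1 TB/day, decimal units (assumption)

# Continuous replication spreads the copy over the whole day.
continuous = required_throughput_mb_s(daily_change, 24)

# A nightly backup has to fit into the assumed 8-hour window.
nightly = required_throughput_mb_s(daily_change, 8)

print(f"continuous: {continuous:.1f} MB/s, nightly: {nightly:.1f} MB/s")
```

The total volume copied is the same in both cases. Only the rate
differs: under these assumptions the nightly window needs three times
the sustained throughput of continuous replication.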


-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
                                       http://milek.blogspot.com
