--On Tuesday, September 29, 2015 01:14:39 PM +0200 Benny Lofgren
<bl-li...@lofgren.biz> wrote:
However, even with mirrored drives, IT IS NOT A BACKUP. What if there is
a fire? What if someone burgles your house and steals the server? What
if someone accidentally knocks it over and all disks in it are damaged
by G-force overload? As Stuart says, mirroring is redundancy for
OPERATION, not for backup. In other words, if your system is mirrored
your server won't go offline if one disk dies on you, but will give you
time to replace the drive and re-mirror it before the other one goes too.
If you want to set up a combined home server/backup solution, it would
in fact be better *not* to mirror your two drives, but to use one for
your server needs and the other for backing up what's on the first.
To the OP, while most of the advice on this thread has been good, I'd
be careful of that one. *Keep* your drives in a mirrored configuration
and have *additional* disks for backup purposes.
Over the years, even with buying what appeared to be the most robust
commodity drives available at any given time, I've had far more instances
of failed drives than having to restore from backup due to accidental
deletion, etc. But because those disks have always been part of a
redundant RAID configuration, all a bad disk means is a few minutes
to replace the faulty disk and then the server is back in operation
without any loss. (Of course, it may take a large number of hours after
that for the resilvering to complete, depending on disk size and
RAID type, but the server is operational in the interim.)
Yes, the additional disks cost more but in the scheme of things disks
are cheap and this should be part of your initial budget.
THEN look at a backup solution, too. One with geographical redundancy,
which is absolutely crucial. That is, somewhere other than your home.
But don't just do one without the other, because it WILL make you sorry
in the end.
Absolutely correct.
Deciding how to do backups is always a question of balancing things,
including but not limited to:
- if the worst happens, how much data can you afford to lose? A day?
A week? A month? (You can take the answer to be less than a day,
right down to "none", but you're talking about progressively more
complex and expensive setups. Even large corporations don't use "none"
for most of their data, if any.)
- cost of disks
- availability / cost of network bandwidth
- level of automation (the less automation, the more disciplined
you need to be in keeping backups current)
Let's say you low-ball this. Assume that if something bad happens, you're
willing to live with losing everything you did in the last month, and
if there was something you deleted by accident more than two months ago
you're willing to say it's gone forever. Let's also assume that the amount
of data that you're backing up is not more than the size of the largest
hard drive you can currently buy. In that case, the ABSOLUTE MINIMUM
you're looking at for backups is four disks:
- Disks 1 and 2 are in a mirrored RAID in your file server
- Disks 3 and 4 are for backups
- Each month, take a snapshot of what is on your fileserver. The
first month it goes to disk 3 (bonus points for encrypting
disk 3). As soon as your backup is complete, take disk 3 off-site
(such as to your office, to a safety deposit box, etc. Note that
smaller safety deposit boxes may be too small for 3.5" drives).
Ideally your off-site is far enough away so that when the
tornado hits your house while everyone is away at work/school,
your offsite isn't destroyed as well. This becomes more
difficult if you're in an earthquake zone.
- The second month you repeat the process with disk 4, *before*
moving disk 3. When your backup to disk 4 is complete, take it
offsite and bring disk 3 back, ready to use for next month. If
you find that you have to recover something from disk 4 during
the next month, return disk 3 to the offsite location before you
bring back disk 4.
- Periodically do a test recovery of some of your files (into a
temporary directory) to ensure that the backups are actually
usable. The first time you do this should be after your first-ever
backup.
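
As a rough illustration of the two-disk rotation described above, here is
a small Python sketch. The function name backup_plan and the dictionary
keys are made up for this example; it only works out which disk is which
each month and does not touch any hardware.

```python
def backup_plan(month: int) -> dict:
    """Disk handling for backup number `month` (1 = the first-ever backup).

    Hypothetical helper: disks are numbered 3 and 4 as in the list above.
    """
    if month < 1:
        raise ValueError("month counts from 1")
    # Odd months write to disk 3, even months to disk 4.
    write_to = 3 if month % 2 == 1 else 4
    other = 4 if write_to == 3 else 3
    return {
        # Disk that is at home and receives this month's snapshot.
        "write_to": write_to,
        # The same disk goes offsite once the backup has completed.
        "take_offsite": write_to,
        # The previous month's disk comes home only AFTER the new one is
        # offsite. In the first month nothing is offsite yet.
        "bring_home": other if month > 1 else None,
    }


if __name__ == "__main__":
    for m in range(1, 5):
        print(m, backup_plan(m))
```

The point of ordering it this way is that at least one backup disk is
always offsite, even while a new backup is being written.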
This takes discipline; you need to remember to do this on a regular
basis, or at some point you'll find that your only available backup
is three years old and you've lost precious pictures of your kids'
early years.
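
One way to prop up that discipline is a small check that nags you when
the last backup is older than your schedule allows. The sketch below
assumes the monthly schedule from this email, so it uses a 31-day limit;
the function name and the way you record the date of the last backup are
up to you and are only placeholders here.

```python
from datetime import date, timedelta


def backup_is_stale(last_backup: date, today: date, max_age_days: int = 31) -> bool:
    """True if the last backup is older than max_age_days.

    max_age_days=31 is an assumption that matches a monthly rotation.
    """
    return (today - last_backup) > timedelta(days=max_age_days)


if __name__ == "__main__":
    # Example dates only; in practice you would read the real date of the
    # last backup from wherever you record it.
    if backup_is_stale(date(2015, 8, 1), date(2015, 9, 29)):
        print("Backup is overdue: do this month's backup now.")
```

Run from cron and mailed to yourself, something like this at least makes
it harder to forget for three years.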
Probably your best tools for doing this basic level are dump(8) and
restore(8), using a level zero dump. Read the man pages. Bonus points
for scripting them so that you get the correct invocation every time.
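
For what it's worth, here is a minimal sketch of what such a script could
look like, written in Python. It only builds the command lines and prints
them, so nothing is dumped or restored by running it. The paths (/home,
/mnt/backup/home.dump0) and the function names are placeholders, and the
flags are the common BSD ones as I understand them; check dump(8) and
restore(8) on your own system before trusting them.

```python
import shlex


def dump_command(filesystem: str, dump_file: str) -> list[str]:
    """Build a level zero dump of `filesystem` written to `dump_file`."""
    # -0: level zero (full) dump
    # -a: auto-size, i.e. skip the tape length calculations
    # -u: record the dump in /etc/dumpdates on success
    # -f: write the dump to the named file
    return ["dump", "-0au", "-f", dump_file, filesystem]


def restore_command(dump_file: str, path: str) -> list[str]:
    """Build a command to extract `path` from `dump_file`.

    restore -x extracts into the current directory, so for a test
    recovery run it from inside a temporary directory.
    """
    # -x: extract the named files; -f: read from the named dump file
    return ["restore", "-xf", dump_file, path]


if __name__ == "__main__":
    # Placeholder paths; substitute your own filesystem and backup mount.
    print(shlex.join(dump_command("/home", "/mnt/backup/home.dump0")))
    print(shlex.join(restore_command("/mnt/backup/home.dump0", "./devin/notes.txt")))
```

Once you are happy with the printed commands, you could pass the same
lists to subprocess.run() as root to actually execute them.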
Remember, that's the MINIMUM strategy. Doing a web search for "data backup
strategies" will give you more background information. Some links
that (with a quick skim) seem to provide a reasonable background are:
<https://en.wikipedia.org/wiki/Backup>
<http://katiefloyd.com/blog/creating-a-comprehensive-backup-strategy>
<http://www.hanselman.com/blog/TheComputerBackupRuleOfThree.aspx>
Slightly better than the above minimum is to add additional disks
to the rotation schedule to give you six months or a year of history.
Or to take one disk out of rotation periodically for a long term
archival copy.
An alternate minimum strategy, if you don't have too much data, it
doesn't change by much over a given period of time, and you have
sufficient upstream network bandwidth, is to use a network backup
service, but make sure that the data gets encrypted *before* you
send it upstream, and make sure you know how to do a recovery and
how long it will take. (Some providers will courier you a disk
during a disaster recovery scenario.) See
<https://en.wikipedia.org/wiki/Remote_backup_service>
With extra resources other options open up: incremental backups
so that you might lose only a day or an hour of data. Complete
automation so that you don't have to remember to do the backup
and take things offsite. Backing up more machines. Backing up
more data than will fit on a single disk. Doing hybrid systems
that give you most things automated and online, with the occasional
archival offline snapshots. However, that's far too much detail for
this email.
For the record, my automated backup strategy uses Bacula at its
core. Check it out if you get past the "minimum" stage.
Devin