On 19/10/15 22:08, Dimitri Maziuk wrote:
> On 10/19/2015 03:53 PM, Thing wrote:
>> Hi,
>>
>> Is anyone backing total volumes of this order?  and if so, what sort of
>> scaling, design, hardware?
> I take it, that's the size of your filesystems? Not the estimated size
> of the backup set (i.e. all cycles in retention period)?
>
>

Assuming it is,

Yes, about 700TB and still growing.

Keeping the individual filesets to 1TB so that no single tape run is excessive.
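
For a feel of what a "tape run" means at that size, here's a back-of-envelope sketch in Python. The 160MB/s figure is LTO6's native (uncompressed) streaming rate; real jobs rarely sustain it, so treat the result as a lower bound. The function name is just mine for illustration.

```python
def tape_run_hours(fileset_tb: float, drive_mb_per_s: float = 160.0) -> float:
    """Hours to stream a fileset to tape at a sustained rate (decimal units)."""
    fileset_mb = fileset_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return fileset_mb / drive_mb_per_s / 3600


print(f"{tape_run_hours(1.0):.1f} h")   # 1TB fileset  -> 1.7 h
print(f"{tape_run_hours(10.0):.1f} h")  # 10TB fileset -> 17.4 h
```

A 10TB fileset would tie up a drive for the best part of a day even at full streaming speed, which is why the filesets are kept small.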

Largish changer - I'm about to retire a 500-slot NEO 8000 with 7 LTO5
drives in favour of a 120-slot Scalar i500 with 6 LTO6s.

If you don't have enough slots you'll be feeding it multiple times
during long weekends (we can easily peak at 20 tapes/day if multiple
fulls get kicked off).
If you don't have enough drives you won't keep up, let alone cope with
the inevitable drive failures and 2 day turnaround for a replacement.
You absolutely must have at least 1 more drive than you think you need
to cope with the backup load. Apart from anything else it means you can
run urgent restores without interrupting backups in progress.
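
The slot and drive arithmetic can be sketched roughly as below. The 120 slots and 20 tapes/day come from the figures above; the nightly volume (15TB), the 10-hour window and the 60% streaming efficiency are made-up inputs purely for illustration, so plug in your own. Function names are hypothetical.

```python
import math


def unattended_days(free_slots: int, tapes_per_day: float) -> float:
    """Days the changer can run before someone has to feed it."""
    return free_slots / tapes_per_day


def drives_needed(nightly_tb: float, window_hours: float,
                  drive_mb_per_s: float = 160.0,  # LTO6 native rate
                  efficiency: float = 0.6,        # assumed, not measured
                  spares: int = 1) -> int:
    """Drives required to finish within the window, plus spare(s)."""
    effective_mb_per_s = drive_mb_per_s * efficiency
    tb_per_drive = effective_mb_per_s * 3600 * window_hours / 1_000_000
    return math.ceil(nightly_tb / tb_per_drive) + spares


print(unattended_days(120, 20))  # 6.0 days, if every slot holds a usable tape
print(drives_needed(15, 10))     # 6 with the assumed inputs (5 working + 1 spare)
```

The spare is added after rounding up on purpose: it's there for failures and restores, not to absorb normal load.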

Large data safes. You'll need something like a Phoenix FS1903, probably
a couple (these hold about 800 LTOs apiece) and a strong floor for them
to sit on.

The tapes, safes and changer should all sit in close proximity in a
temperature-controlled _clean_ environment, preferably in their own
room, which is accessed as infrequently as possible. Dust kills drives,
and human skin is one of the worst contaminants because it's greasy,
whereas most other dust types are abrasive. Consider an air scrubber and
clean-room "flypaper" sticky sheets on the door threshold.

Large (200GB+), high-performance SSD for spool. Consumer drives become a
bottleneck.

Something similar (RAID1) for the database, 500GB or so.

PostgreSQL - just works. MySQL doesn't scale this large very well - it
will work but you'll be constantly fighting with it.

LOTS of RAM for the DB box. I have 48GB in a 5-year-old machine. It's due
for an upgrade, but just about anything newer than 5 years with an E5 CPU
or better will do the job nicely.

10Gb/s connectivity. You can fudge it with LACP on 1Gb/s but it becomes
a bottleneck. Ditto on the fileservers themselves.
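
To put rough numbers on that, a small Python sketch comparing link payload rates with what an LTO6 drive wants (160MB/s native). The 90% efficiency factor is an assumed allowance for protocol overhead, not a measurement, and the function names are hypothetical.

```python
LTO6_NATIVE_MB_S = 160.0  # LTO6 native (uncompressed) streaming rate


def usable_mb_per_s(link_gbit_per_s: float, efficiency: float = 0.9) -> float:
    """Approximate payload rate of a link in MB/s (efficiency is assumed)."""
    return link_gbit_per_s * 1000 / 8 * efficiency


def drives_fed(link_gbit_per_s: float,
               drive_mb_per_s: float = LTO6_NATIVE_MB_S) -> int:
    """How many drives the link can keep streaming at native speed."""
    return int(usable_mb_per_s(link_gbit_per_s) // drive_mb_per_s)


for gbit in (1, 10):
    print(f"{gbit}Gb/s: ~{usable_mb_per_s(gbit):.1f} MB/s, "
          f"feeds {drives_fed(gbit)} LTO6 drive(s)")
```

Under those assumptions a single 1Gb/s link can't keep even one LTO6 drive streaming. LACP typically balances per flow, so one backup stream generally still rides a single member link, which is why bonding only fudges it.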

A decent network switch. Huawei 6800 series are nicely specced (1Tb/s
throughput) and run rings around equivalently priced Cisco/Juniper kit -
nearly all of which uses the same Broadcom Trident2/2+/3 chipset anyway.


We run 14 month retention on the backup cycle, with a full every 3
months, nightly incrementals and 4-weekly differentials. Rapidly
changing data in smaller sets gets monthly full backups. Thankfully this
is science data, as financial stuff may need to be retained up to 7 years.
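
For sizing the tape pool and safes against a schedule like this, a rough Python sketch follows. The cycle lengths are the ones above and the 800 tapes per safe is the figure quoted earlier; 2.5TB is LTO6 native capacity. The incremental and differential sizes (as fractions of a full) are pure assumptions and will differ a lot between sites, and the helper names are hypothetical.

```python
import math


def retained_jobs(retention_months: int = 14, full_every_months: int = 3,
                  diff_every_weeks: int = 4) -> tuple[int, int, int]:
    """Approximate (fulls, differentials, incrementals) held in retention."""
    days = round(retention_months * 365 / 12)
    fulls = math.ceil(retention_months / full_every_months)
    diffs = days // (diff_every_weeks * 7)
    incrs = days - fulls - diffs  # every other night is an incremental
    return fulls, diffs, incrs


def tapes_and_safes(full_tb: float, incr_frac: float, diff_frac: float,
                    tape_tb: float = 2.5, tapes_per_safe: int = 800):
    """Tapes and safes needed; incr_frac/diff_frac are assumed job sizes."""
    fulls, diffs, incrs = retained_jobs()
    total_tb = full_tb * (fulls + diffs * diff_frac + incrs * incr_frac)
    tapes = math.ceil(total_tb / tape_tb)
    return tapes, math.ceil(tapes / tapes_per_safe)


print(retained_jobs())  # (5, 15, 406)
print(tapes_and_safes(700, incr_frac=0.002, diff_frac=0.03))
```

This ignores compression, partially filled tapes and the faster-cycling small filesets, so it's only good for an order-of-magnitude check when arguing for budget.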

The most common restore is for accidental deletions but we've had to
pull a few fileset restores over the years - usually because someone
cheaped out and didn't RAID their box on the basis that "it's easy to rebuild".
It never is unless it's a cookie cutter - which they never are after a
week of operation - and it's less disruptive to change a dead drive in a
raidset anyway (this can be done hot on Linux systems using mdraid).

There's only ever been one major central store restore and that was a
runaway rm -rf. Unfortunately one group has a 200TB system which is
beyond warranty but not being replaced because of budgets. It's being
driven hard and sooner or later it's going to drop its bundle. I'm not
looking forward to that day.

Regarding the data safes: People say "Iron Mountain", but backups are
not archives. You're going to cycle the tapes and retrieving them is
much easier if they're local. A good fire safe will survive an intense
fire for 60 minutes and a 10 metre drop (simulating building collapse)
with the insides not going above 50C, but it's best to site your safes
and tape library where they're least likely to get that kind of
experience, and pipe the data to them.

Your single biggest hurdle is getting enough budget for the job.
Management usually won't spend enough on decent storage systems and
they'll heavily resist spending on backup systems. "RAID is not backup"
usually doesn't sink in unless they've been burned a few times.



------------------------------------------------------------------------------
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
