I agree with that strategy for non-NetApp NAS devices - divide and conquer by backing up one filespace at a time, using many proxy machines. (Your backup domain controllers got nuthin' else to do at backup time, do they?)
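To make the divide-and-conquer concrete, here is a sketch of the TSM pieces involved. The node names (NAS_NODE, PROXY1) and the mount path are illustrative, not from this thread; the commands themselves are the standard TSM proxy-node mechanism (server-side GRANT PROXYNODE, client-side asnodename):

```
/* On the TSM server: allow the agent node to operate on behalf of the NAS node */
GRANT PROXYNODE TARGET=NAS_NODE AGENT=PROXY1

# On proxy machine PROXY1: back up one filespace under the NAS node's identity
dsmc incremental /mnt/nas/images -asnodename=NAS_NODE
```

Repeat with PROXY2, PROXY3, etc., each pointed at a different filespace, and the proxies all feed backups into the single NAS_NODE filespace inventory on the server.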
Unless it's mostly non-changing data that you can replicate. Some of my customers with massive NAS devices are using them for image apps. Images get created and never change, so just one replicated copy is usually sufficient. No need to trudge through the filetree every day. And Skylar is right that you need to partition the data into manageable filespace sizes to start with. App folks don't like to hear that, but there are ways (even with Windows!) to mask drive architecture from applications and still have a design that makes it reasonable to back up in a timely fashion.

NDMP is the horrible reptile/crocodile brain of the backup world. It hasn't improved any: you are constantly dumping stuff that hasn't changed, it's massive amounts of data, and you have to keep the tapes a lot longer than you want to, to make sure that you can recover that one file that has extra-long retention requirements. Better to back up just the files that have changed. Once you get a NAS in the 10 TB range, you'll find you need a dead goat every full moon and an unreasonable amount of your time and tape to get those NDMP dumps completed.

The reason NAS vendors don't improve on NDMP is not because they can't (to wit, the NetApp snapdiff API). Let's just say those boxes aren't designed to help the customer in any way. I wish that when looking at a NAS device, customers would calculate the cost of backing it up into the TCO. Then we'd get better stuff on the market. (soapbox off)

W

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Skylar Thompson
Sent: Monday, December 17, 2012 3:02 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Isilon Backup

We only back up via NFS (we have no ACLs, so CIFS would be an unnecessary and huge performance hit). Here's what we've done:

1. Partitioned our data such that we can have a large number of filespaces that we back up separately (in hindsight, we might have wanted to go with separate nodes, but oh well).

2. Set up a pool of schedules that are each responsible for some number of filespaces. Each schedule will proxy using asnode to the underlying storage node.

3. Set up one node definition per schedule that will hold one schedule, and that the dsmcad process will connect as.

4. Farm out those schedules to a pool of 5 pizza-box Dell systems with 10GbE adapters running RHEL5. These systems mount each filespace as a separate NFS mount point to keep TSM happy.

Using this strategy we're backing up a pair of Isilon clusters with about 2.5PB of aggregate disk space. Presently TSM is tracking 130 million file versions and 1.5PB in backups, and we rarely exceed our 24-hour per-filespace backup window.

There are a number of concerns I would have running NDMP backups:

1. It's unclear to me how to avoid doing a full backup. It seems that token-based backups would be an option, but the documentation is fairly sparse.

2. It's one more thing to learn, and one more thing that can break in mysterious ways.

3. When restoring the data you're tied to the specific platform that performed the backup. This means that if you no longer have an Isilon system, you can no longer access the backed-up data. This for us was the scariest aspect of NDMP.

-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine

On 12/17/12 11:42 AM, Huebner, Andy wrote:
> Just general questions on Isilon backup.
> How do you backup your Isilon, NDMP or CIFS/NFS?
> NDMP FC to tape or Ethernet through the TSM server?
>
> We are to the point in our project to figure out the backup bit.
>
>
> Andy Huebner
>
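Skylar's steps amount to a static assignment of filespaces to a pool of schedules, one schedule per proxy box. A minimal sketch of how such an assignment might be generated, with made-up filespace names and a simple round-robin split (the real schedule-to-filespace mapping would be hand-tuned to balance filespace sizes):

```python
# Sketch: round-robin assignment of NAS filespaces to backup schedules,
# one schedule per proxy machine. All names are illustrative.

def assign(filespaces, num_schedules):
    """Map each schedule index to the list of filespaces it backs up."""
    plan = {i: [] for i in range(num_schedules)}
    for idx, fs in enumerate(filespaces):
        plan[idx % num_schedules].append(fs)
    return plan

# Hypothetical filespaces, spread across 5 pizza-box proxies as in the thread.
filespaces = [f"/nas/fs{n:02d}" for n in range(12)]
plan = assign(filespaces, num_schedules=5)
for sched, fs_list in sorted(plan.items()):
    print(f"SCHED_{sched}: {fs_list}")
```

Each schedule's filespaces would then be mounted as separate NFS mount points on its proxy host and backed up with -asnodename against the shared storage node.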