@All

possibly the biggest issue when backing up massive file systems in parallel 
with multiple dsmc processes is expiration. Once you back up a directory with 
“subdir no”, a directory object on that level that no longer exists is expired 
properly and becomes inactive. However, everything underneath it remains 
active and never expires unless you run a “full” incremental on the level 
above (with “subdir yes”) - and that rather defeats the purpose of 
parallelisation.

Other pitfalls include:

- avoiding swapping
- keeping log files consistent (dsmc isn’t thread-aware when logging - it 
  assumes it is alone)
- handling the local dedup cache
- updating backup timestamps for a file space on the server
- distributing load evenly across multiple nodes on a scale-out filer
- backing up from snapshots
- chunking file systems into even parts automatically, so you don’t end up 
  with lots of small jobs and one big one
- dynamically distributing load across multiple “proxies” if one isn’t enough
- handling exceptions
- handling directories with characters you can’t pass to dsmc on the command 
  line
- consolidating results into a single, comprehensible overview similar to the 
  summary of a regular incremental
- being able to do it all in reverse for a massively parallel restore

… the list is quite long.
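To illustrate just one of those points - chunking a file system into even 
parts - here is a minimal sketch of a greedy “largest first” partitioning, 
assuming you already know the per-directory sizes (all names and numbers 
below are hypothetical, not part of any product):

```python
# Sketch: split top-level directories into n roughly even backup jobs by
# total size, so one dsmc process doesn't end up with all the big trees.
# Greedy LPT (longest processing time first) heuristic; paths/sizes are
# made up for illustration.
import heapq

def chunk_dirs(sizes, n_jobs):
    """sizes: {path: bytes}. Returns n_jobs lists of paths whose total
    sizes are roughly balanced."""
    # heap of (total_bytes_assigned, job_index, assigned_paths)
    heap = [(0, i, []) for i in range(n_jobs)]
    heapq.heapify(heap)
    for path, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
        total, i, paths = heapq.heappop(heap)  # least-loaded job so far
        paths.append(path)
        heapq.heappush(heap, (total + size, i, paths))
    return [paths for _, _, paths in sorted(heap, key=lambda t: t[1])]

jobs = chunk_dirs({"/fs/a": 900, "/fs/b": 500, "/fs/c": 400, "/fs/d": 100}, 2)
# each resulting job list could then be fed to its own dsmc process
```

Of course this only balances by size, not file count, and it assumes a size 
scan is cheap - on a real scale-out filer neither is a given.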

We developed MAGS (as mentioned by Del) to cope with all of that - and more. 
I can only recommend trying it out; it’s free to test.

Regards

Lars Henningsen
General Storage
