On Tue, Aug 11, 2009 at 9:39 AM, Ed Spencer<ed_spen...@umanitoba.ca> wrote:
> We backup 2 filesystems on tuesday, 2 filesystems on thursday, and 2 on
> saturday. We backup to disk and then clone to tape. Our backup people
> can only handle doing 2 filesystems per night.
>
> Creating more filesystems to increase the parallelism of our backup is
> one solution but it's a major redesign of the mail system.

What is magical about a 1:1 mapping of backup job to file system?
According to the Networker manual[1], a save set in Networker can be
configured to back up certain directories.  According to some random
documentation about Cyrus[2], mailboxes fall under a pretty
predictable hierarchy.

1. http://oregonstate.edu/net/services/backups/clients/7_4/admin7_4.pdf
2. http://nakedape.cc/info/Cyrus-IMAP-HOWTO/components.html

Assuming that the way your mailboxes get hashed falls into a
structure like $fs/b/bigbird and $fs/g/grover (and not just
$fs/bigbird and $fs/grover), you should be able to set a save set per
top level directory or per group of a few directories.  That is,
create a save set for $fs/a, $fs/b, etc. or $fs/a - $fs/d, $fs/e -
$fs/h, etc.  If you are able to create many smaller save sets and turn
the parallelism up you should be able to drive more throughput.
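
As a rough sketch of the grouping idea, something like the following
would generate the directory lists for each save set.  The function
name, the /mail1 path, and the group size of 4 are made up for
illustration, and the single-letter a-z layout is the assumption
stated above; this only prints path lists and does not talk to
Networker in any way.

```python
import string


def make_save_sets(fs_root, group_size, names=string.ascii_lowercase):
    """Split the top-level hash directories under fs_root into groups.

    Assumes one directory per name directly under fs_root (for example
    fs_root/a ... fs_root/z). Each returned group is a candidate list
    of paths for one save set.
    """
    dirs = [f"{fs_root}/{name}" for name in names]
    # Slice the directory list into consecutive chunks of group_size;
    # the last chunk may be shorter.
    return [dirs[i:i + group_size] for i in range(0, len(dirs), group_size)]


if __name__ == "__main__":
    # "/mail1" is a hypothetical mount point, not taken from the thread.
    for number, paths in enumerate(make_save_sets("/mail1", 4), start=1):
        print(f"save set {number}: {' '.join(paths)}")
```

A group size of 1 gives one save set per top level directory, and
larger values give the $fs/a - $fs/d style of grouping.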

I wouldn't get too worried about ensuring that they all start at the
same time[3], but it would probably make sense to prioritize the
larger ones so that they start early and the smaller ones can fill in
the parallelism gaps as the longer-running ones finish.

3. That is, there is sometimes benefit in having many more jobs to run
than you have concurrent streams.  This avoids having one save set
that finishes long after all the others because of poorly balanced
save sets.
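
To illustrate why starting the larger save sets first helps, here is a
small simulation.  It is only a toy model under my own assumptions:
each save set has a known duration, each stream runs one save set at a
time, and the next save set in the list starts as soon as any stream
frees up.  The durations are invented, not measurements.

```python
import heapq


def finish_time(durations, streams):
    """Return when the last job ends if jobs start in list order.

    Each stream is represented by the time at which it next becomes
    free; the next job always goes to the stream that frees up first.
    """
    busy_until = [0] * streams
    heapq.heapify(busy_until)
    for duration in durations:
        start = heapq.heappop(busy_until)
        heapq.heappush(busy_until, start + duration)
    return max(busy_until)


if __name__ == "__main__":
    hours = [2, 2, 2, 2, 8]  # hypothetical save set durations
    print("big one last: ", finish_time(hours, streams=2))
    print("big one first:", finish_time(sorted(hours, reverse=True), streams=2))
```

With these made-up numbers and two streams, running the 8 hour save
set last leaves one stream idle while it finishes, and running it
first lets the four short ones fill the other stream in the meantime.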

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
