Roland Arendes wrote:
Hi

This will speed up your dbcheck (and tree building before a restore)
drastically:
MySQL:


use bacula;
ALTER TABLE File ADD INDEX (JobId, PathId, FilenameId);


Wait for the index creation to finish (takes some time on a huge db).
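
As a rough illustration of why a composite index on (JobId, PathId, FilenameId) helps, the sketch below asks a query planner for its plan before and after creating such an index. It rests on several assumptions: it uses Python's built-in SQLite rather than MySQL so that it is self-contained, the File table is a cut-down hypothetical schema rather than the real Bacula one, and the index name file_jpf_idx is made up. MySQL's planner and timings will differ; on MySQL, EXPLAIN plays the corresponding role.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL catalog (assumption).
conn = sqlite3.connect(":memory:")

# Simplified, hypothetical File table: only the columns named in the index.
conn.execute(
    "CREATE TABLE File ("
    "FileId INTEGER PRIMARY KEY, JobId INTEGER, PathId INTEGER, FilenameId INTEGER)"
)
conn.executemany(
    "INSERT INTO File (JobId, PathId, FilenameId) VALUES (?, ?, ?)",
    [(j, p, f) for j in range(5) for p in range(20) for f in range(20)],
)

# A lookup on exactly the three indexed columns.
QUERY = "SELECT FileId FROM File WHERE JobId = 1 AND PathId = 2 AND FilenameId = 3"


def plan(connection):
    """Return the planner's description of how it would run QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)  # no usable index yet: full table scan

# SQLite spelling of the composite index; the index name is hypothetical.
conn.execute("CREATE INDEX file_jpf_idx ON File (JobId, PathId, FilenameId)")

plan_after = plan(conn)  # planner can now search via the composite index

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index the planner has to scan every row of File for each lookup; with it, the lookup becomes an index search. That is the effect the ALTER TABLE above is aiming for, although how much it helps a given dbcheck run depends on the queries dbcheck actually issues.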


Index creation took 20 minutes, and dbcheck -f ran for 27 hours before I killed it and wiped the entire machine to upgrade to RHEL4.


This brings me to an interesting question regarding bacula and scalability. I've seen multiple people comment that 1.5 TB of data isn't a problem for bacula, but I have my doubts after this episode.

Basically, the dbcheck program was operating as it should. However, the operation I asked it to perform ran for over *three days* without any sign of completing. After performing the proposed optimization, the operation still ran for 26+ hours. This effectively makes the operation useless. An operation that runs on a locked catalog for over 26 hours is not acceptable in a production backup system.

Currently, a Full backup of my /home volume is 480 GB in just over 4 million files. That's just /home. In addition, I have 15 other servers doing special things and about 40 workstations to back up. I'll estimate a full backup of our site at 2500 to 3000 GB of data in 10 *million* files.

We're testing our new NFS server which will replace our current 480 GB volume with a 1.5 TB volume. I can reasonably expect the number of files I back up to at least double in the next few months, and eventually triple.

If I want to keep two fulls in the catalog at any given time, I should expect *at least* 50 million file records.
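
A back-of-the-envelope sketch of that estimate, using only the figures quoted in this message: 10 million files per site-wide full today, growth of two to three times, and two fulls retained. It assumes one file record per file per retained full and ignores incremental and differential jobs, so it is a rough lower bound rather than a prediction. The function and variable names are illustrative only.

```python
# Figures quoted in the message above.
FILES_PER_FULL_TODAY = 10_000_000  # estimated files in one site-wide full backup
FULLS_RETAINED = 2                 # fulls kept in the catalog at any given time


def catalog_file_records(files_per_full, growth_factor, fulls_retained):
    """Rough record count: one record per file per retained full (assumption)."""
    return files_per_full * growth_factor * fulls_retained


# Growth factors 2 and 3 are the "double" and "triple" cases from the message.
estimates = {
    factor: catalog_file_records(FILES_PER_FULL_TODAY, factor, FULLS_RETAINED)
    for factor in (2, 3)
}

for factor, records in estimates.items():
    print(f"file count x{factor}: about {records:,} file records")
```

Under these assumptions the range works out to 40 million records if the file count doubles and 60 million if it triples, which brackets the 50 million figure; records from incrementals and differentials would push the real number higher.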

Am I naive trying to cram this much information into one MySQL database? Should I be splitting this up across multiple catalogs? Should I investigate other optimizations to deal with this volume of information inside the catalog?

It may be that bacula isn't yet ready to manage a catalog of this volume, which is perfectly fine. I'm hoping to get some feedback on the topic of scalability, as I haven't really seen it mentioned much on this mailing list or in the documentation. It seems like a big issue as bacula matures and becomes a viable enterprise solution.

What are bacula's limitations in terms of long-running operations versus catalog size? Does the running time grow linearly with catalog size, exponentially, or in some other way?

Regards,
--
Jeff McCune
OSU Department of Mathematics System Support
(614) 292-4962
gpg --keyserver pgp.mit.edu --recv-key BAF3211A

