On Thu, 2007-09-27 at 09:19 +0200, Arno Lehmann wrote:
> Hi,
>
> 27.09.2007 01:17,, Ross Boylan wrote::
> > I've been having really slow backups (13 hours) when I back up a large
> > mail spool. I've attached a run report. There are about 1.4M files
> > with a compressed size of 4G. I get much better throughput (e.g.,
> > 2,000KB/s vs 86KB/s for this job!) with other jobs.
>
> 2MB/s is still not especially fast for a backup to disk, I think. So
> your storage disk might also be a factor here.
>
> > First, does it sound as if something is wrong? I suspect the number of
> > files is the key thing, and the mail spool has lots of little files
> > (it's used by Cyrus). Is this just life when you have lots of little
> > files?
> >
> > Second, how can I figure out what the problem is? I do have some
> > suspicions, but first some basics:
> > ------------------------------------------------
> > everything is running on the same box
> > 3GHz P4 with one SATA drive as the main drive and 4 older drives, one of
> > which is the backup target.
> > No noticeable CPU load or disk activity during the backup. I was
> > compressing, but that doesn't show up noticeably for CPU use.
>
> How much memory, and how is the memory usage during backups?

2G of RAM. I'll have to watch it to determine how much is in use.

>
> > Debian GNU/Linux 2.6.18 with postgresql 8.1, bacula 2.2.13.
> > Disk is managed by evms, using LVM.
> > The partition being backed up is ext3, and the backup is going to disk (a
> > different physical disk, IDE) using Reiser.
>
> That's definitely a good thing.
>
> > I am not using snapshotting because that feature is broken right now
> > (nothing to do with bacula). I shut down the cyrus server during the
> > backup (despite some errors in the log around my attempted shutdown, it
> > seemed to have worked).
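
For a sense of scale, the figures quoted above can be turned into
per-file numbers. This is only a back-of-the-envelope sketch: it assumes
"4G" means 4e9 bytes and that the job ran for exactly 13 hours, and the
function name backup_rates is made up for illustration.

```python
def backup_rates(files, total_bytes, hours):
    """Derive rough per-file and per-second rates for one backup job."""
    seconds = hours * 3600
    return {
        "files_per_sec": files / seconds,
        "ms_per_file": 1000 * seconds / files,
        "kb_per_sec": total_bytes / 1000 / seconds,
        "avg_file_bytes": total_bytes / files,
    }


# Figures from the run described above (assumed exact for this estimate).
rates = backup_rates(files=1_400_000, total_bytes=4e9, hours=13)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

If the time per file comes out in the tens of milliseconds, that is far
longer than copying a few KB should take, which would point at per-file
overhead rather than raw disk throughput.
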
> >
> > My suspicion is that the TCP/IP transactions are all getting delayed
> > (maybe to batch for sending) in a way that usually isn't noticeable, but
> > is noticeable when doing lots of quick exchanges locally.
>
> I don't know anything about issues with TCP delays, and I know of Bacula
> installations running smoothly on all sorts of hardware and different
> OSes.
>
> I rather suspect the catalog to be the bottleneck.
>
> Verifying this might be as easy as running vmstat while the job is
> running and seeing if there is lots of iowait happening - this does
> not necessarily show as hard disk activity.

Would TCP-induced delays also show up as iowait?

>
> Are your database and the mail spool on the same disk? This might
> explain the slowness you encounter.

Yes.

>
> In this case, I'd suggest upgrading to Bacula 2.2.4. For two reasons,
> actually: First, there is a serious bug that will hit you one day, and
> which is fixed in the current version. Second, the new batch inserts
> feature would gain lots of speed if the database throughput really is
> the bottleneck for you.

I see 2.2.4 is in Debian unstable, so I should be able to pull it in.
That would be great if it speeds things up.

>
> > Not only are
> > my bacula components using TCP (I think), but I'm communicating with
> > postgres by TCP (I couldn't get authentication working properly with
> > unix domain sockets).
> >
> > While populating the cyrus server I also encountered very slow
> > transaction speeds. I think the TCP problem was the cause, though I
> > don't have definite confirmation. I ran multiple jobs in parallel to
> > populate the cyrus server to get around the slowness of the individual
> > parts (I think that at least rules out things like db contention or disk
> > contention as culprits in that case).
>
> As I don't know about the TCP delay I can't comment on this...
>
> > Unfortunately, AFAIK the tcp delay is not tuneable on Linux; it is with
> > BSD.
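
On the TCP side: whatever the state of system-wide tuning, Linux does
expose a per-socket TCP_NODELAY option that turns off the batching of
small writes (Nagle's algorithm), so the application that owns the socket
can switch it off. Whether that has anything to do with the slowness here
is unconfirmed. A minimal sketch, assuming Python 3 and a throwaway
loopback echo server (echo_server and time_round_trips are hypothetical
names, nothing taken from Bacula or PostgreSQL), for timing many tiny
local exchanges with and without the option:

```python
import socket
import threading
import time


def echo_server(listener, count, nodelay):
    """Accept one connection and echo back up to `count` small messages."""
    conn, _ = listener.accept()
    with conn:
        if nodelay:
            conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def time_round_trips(count=1000, nodelay=True):
    """Return mean seconds per 1-byte request/reply over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(
        target=echo_server, args=(listener, count, nodelay), daemon=True
    )
    server.start()

    client = socket.create_connection(("127.0.0.1", port))
    if nodelay:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    start = time.perf_counter()
    for _ in range(count):
        client.sendall(b"x")  # one tiny request ...
        client.recv(64)       # ... then wait for the echoed reply
    elapsed = time.perf_counter() - start

    client.close()
    server.join(timeout=5)
    listener.close()
    return elapsed / count
```

Comparing time_round_trips(nodelay=False) with time_round_trips(nodelay=True)
on the backup box would show whether small local exchanges are being held
back. A strict request/reply pattern like this one may show little
difference, since Nagle's algorithm mainly bites when several small writes
go out before a reply is read.
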
> >
> > Here are some relevant parts of bacula-dir.conf:
> >
> > JobDefs {
> >   Name = "CornDefaults"
> >   Type = Backup
> >   Level = Incremental
> >   Client = corn-fd
> >   Storage = File2Storage
> >   Messages = Standard
> >   Pool = Default
> >   Full Backup Pool = Full
> >   Differential Backup Pool = Differential
> >   Incremental Backup Pool = Incremental
> >   Priority = 10
> >   Write Bootstrap = "/usr/local/var/spool/bacula/%n.bsr"
> > }
> >
> > ######## Cyrus
> > ## really this needs more care: use snapshot, dump db to ascii
>
> As far as I know, it's sufficient to dump cyrus' database. Given that
> dump and a backup of your mail files, a correct cyrus database can be
> easily regenerated. Snapshots would be a good thing, perhaps, but
> you'd still have to explicitly dump the database as there is no
> guarantee that the disk files of the database are always in a
> consistent state.

Cyrus recommends the ascii dump to guard against version changes that
would render the binary unusable.
http://cyrusimap.web.cmu.edu/twiki/bin/view/Cyrus/Backup has more.
You're right: snapshots alone will not assure integrity.

.....
>
> I'm really unsure about TCP problems, but the situation more or less
> looks like the catalog backend would be your problem. Could you try to
> have the catalog db on another machine?

I've only got the one for now.
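
To follow up on the vmstat suggestion, the iowait share can also be
computed from two readings of the aggregate "cpu" line in /proc/stat,
which is where that figure comes from on Linux. A minimal sketch, assuming
Python 3 and the usual field order on that line (user, nice, system, idle,
iowait, irq, softirq, steal); the helper names are made up for
illustration:

```python
import time


def parse_cpu_line(line):
    """Return the numeric counters from the aggregate 'cpu' line."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' lines."""
    old = parse_cpu_line(before)
    new = parse_cpu_line(after)
    # Only the first eight counters are summed; later guest counters are
    # already included in user/nice and would be double counted.
    deltas = [n - o for o, n in zip(old[:8], new[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()


def sample_iowait(interval=5.0):
    """Take two readings `interval` seconds apart and report iowait."""
    before = read_cpu_line()
    time.sleep(interval)
    return iowait_percent(before, read_cpu_line())
```

Calling sample_iowait() repeatedly while the backup job runs would give
roughly the same picture as watching the "wa" column in vmstat: a
consistently high value with little CPU use elsewhere would fit the
catalog-on-the-same-disk explanation.
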

Thanks.
Ross

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users