On Tue, Apr 06, 2010 at 09:01:13AM -0700, ebollengier wrote: > Hello Graham,
Hello, thanks for your reply. > Graham Keeling wrote: > > > > Hello, > > I'm using bacula-5.0.1. > > I have a 2.33GHz CPU with 2G of RAM. > > I am using MySQL. > > I had a VirtualFull scheduled for my client. > > > > My log says the following: > > > > Apr 4 18:56:02 Start Virtual Backup JobId 56, > > Job=Linux:cvs.2010-04-04_18.56.00_03 > > Apr 4 18:56:02 This Job is not an Accurate backup so is not equivalent > > to a Full backup. > > Apr 6 03:56:12 Bootstrap records written to > > /var/lib/bacula/backup.dev.equiinet.com-.restore.1.bsr > > Apr 6 10:23:20 Ready to read from volume "backup-0002" on device "Tower > > 2.0" (/write/mnt/Tower 2). > > Apr 6 10:23:20 Labeled new Volume "backup-0046" on device "Tower 2.1" > > (/write/mnt/Tower 2). > > ...and now, backup-0046 is being written to. > > > > At about Apr 6 10:00, I was logged on to the director, and attempting to > > run > > a 'status, director' command on bconsole. > > bconsole printed a few bits of information, then paused. Presumably > > because JobId 56 had locked the database. After 10:23:20, the command > > printed > > the rest of its information and gave me the prompt back. > > > > At about Apr 6 10:00, I logged into mysql, and ran 'show full > > processlist;', > > which gave the following. > > > > | Id | User | Host | db | Command | Time | State | Info > > | 3032 | root | localhost | bacula | Query | 22464 | Locked | UPDATE > > Job SE > > T JobStatus='R',Level='F',StartTime='2010-04-06 > > 03:56:14',ClientId=2,JobTDate=12 > > 70522574,PoolId=7,FileSetId=2 WHERE JobId=56 > > > > | 3033 | root | localhost | bacula | Query | 22464 | executing | SELECT > > Path.P > > ath, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( SELECT > > FileId, > > Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId AS > > Filena > > meId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate, > > PathId, Fil > > enameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job > > USING (Job > > Id) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT JobTDate, > > PathId > > , FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job ON > > (BaseJobId > > = Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp > > GROUP BY P > > athId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId > > FROM B > > aseFiles WHERE JobId IN (22,23,31,34,42,48,52)) OR Job.JobId IN > > (22,23,31,34,42, > > 48,52)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND > > T1.PathId > > = File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN Filename > > ON (Fi > > lename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId = > > Temp.PathId) WH > > ERE FileIndex > 0 ORDER BY Temp.JobId, FileIndex ASC | > > > > 'Time 22464' is equivalent to about 6 hours, so I presume that after > > 'Bootstrap > > records written', bacula has been waiting for this sql command, during > > which > > time it could not do anything else. Backups of other clients were failing > > because I had Max Wait Time set to something less than 6 hours. > > > > The following is a listing of the number of files that each previous > > backup > > said it had backed up, which doesn't seem extraordinarily large to my > > mind. > > > > Full: 810,400 > > Incr: 52,487 > > Incr: 52,485 > > Incr: 52,485 > > Incr: 332,649 > > Incr: 52,492 > > Incr: 52,593 > > Incr: 332,677 > > Incr: 52,497 > > Incr: 52,497 > > Incr: 52,497 > > > > > > So, my questions are: > > > > Does it really need to take so long? > > Is it expected that this Job blocks everything else for 6 hours? > > > > I think that this part of the code doesn't use a dedicated connection, it > would be nice to > improve this area. (other part of the code are using a specific connection > to do this work) > > > Graham Keeling wrote: > > > > Is there anything I can do that would speed it up? > > Perhaps even more importantly, what was it doing for the 33 hour period > > between Apr 4 18:56:02 and Apr 6 03:56:12? Was it just writing the > > bootstrap > > records? > > > > On the Path table, I have these: > > PRIMARY KEY(PathId), > > INDEX (Path(255)) > > > > On the File table, I have these: > > PRIMARY KEY(FileId), > > INDEX (JobId, PathId, FilenameId), > > INDEX (PathId), > > INDEX (FilenameId) > > > > On the Job table, I have these: > > PRIMARY KEY(JobId), > > INDEX (Name(128)), > > INDEX (ClientId) > > > > Did you run tests with the original schema? It looks like you removed > important index such as the one > on JobId and you added others, and i'm not sure that MySQL can play properly > with composed indexes (it doesn't looks like) in this treatment. In future, to avoid generating this sort of complaint, I will not fiddle with the default bacula indexes. However, I think that the problem is not to do with this. I shall explain below... > I've made tests with 400,000,000 files and 40,000,000 filenames over 600 > jobs, and it doesn't run fast as PostgreSQL (factor 6 between both engines), > but it should handle the job rather quickly. (on my pc, for 2M files, it's > between 1min and 1min30 (15seconds with PostgreSQL)) Here is the command that mysql takes hours over, and hence the one that I am using to test. In all of my tests, the result is the same whether I am using the default bacula indexes or my additional ones. SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId AS FilenameId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate, PathId, FilenameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job USING (JobId) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT JobTDate, PathId, FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job ON (BaseJobId = Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp GROUP BY PathId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId FROM BaseFiles WHERE JobId IN (22,23,31,34,42,48,52) ) ) OR Job.JobId IN (22,23,31,34,42,48,52) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND T1.PathId = File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN Filename ON (Filename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId = Temp.PathId) WHERE FileIndex > 0 ORDER BY Temp.JobId, FileIndex ASC; Since I am not using base jobs, I did the experiment of removing the lines from the query to do with base jobs. When I did that, the query changed from taking hours, to taking under a second. I then tracked it down to this specific part (i.e, remove this, and get a truly magnificent speed-up): (Job.JobId IN ( SELECT DISTINCT BaseJobId FROM BaseFiles WHERE JobId IN (22,23,31,34,42,48,52) ) ) OR I think that mysql is repeatedly calling this section for each row that the SELECT returns. Now the question is what to do about it? ------------------------------------------------------------------------------ Download Intel® Parallel Studio Eval Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta. http://p.sf.net/sfu/intel-sw-dev _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users