Hi Ana,

Thanks again for the help!

Yes, the database is large. The File table has more than 370 million records, and Bacula is pruning. I'll see what we can do with mysqltuner. I suspect the occasional timeout errors are caused by the sheer number of records.

If I can't find a cure, I'm thinking of splitting our backups across two catalog servers, one for production and one for test/dev. Do you think this is wise? Our DBAs don't currently support Postgres, so migrating may not be an option. Would Oracle be an option instead?

I'll have to check on the "--enable-batch-insert" option. Is there a Bacula command that shows which options Bacula was built with? I did not set up our installation and am new to Bacula.

Would you have any idea why a new duplicate job starts when one has been rescheduled? I looked over the configs I know about but could not find an option that starts a new job under any circumstances. If I can prevent the new duplicate job from starting when the lock is detected, the original job will at least have two more chances to run. I know that won't cure the database problem, but I suspect the jobs might complete successfully on the rescheduled tries (the lock should be gone by then) until the DB can be tuned.

Warmest regards,
-craig

On Mon, Sep 14, 2015 at 4:54 PM, Ana Emília M. Arruda <emiliaarr...@gmail.com> wrote:

> Hello Craig,
>
> You're welcome! I hope I can give you some tips here. It seems you have a
> database tuning issue because of "Lock wait timeout exceeded; try
> restarting transaction". I'm not sure about your database size, but based
> on your JobIds, it seems large. You can try http://mysqltuner.com/ if
> you're using MySQL. If you have a really large database, you can think
> about migrating to PostgreSQL if tuning MySQL does not solve this problem.
>
> Have you built your Bacula with the "--enable-batch-insert" option? This is a
> good idea when dealing with a large number of files.
>
> Best regards,
> Ana
>
> On Mon, Sep 14, 2015 at 12:40 AM, Craig Shiroma <shiroma.crai...@gmail.com> wrote:
>
>> Hi Ana,
>>
>> I'm using 7.0.5. Thanks for the help!
>>
>> -craig
>>
>> On Sat, Sep 12, 2015 at 2:10 PM, Ana Emília M. Arruda <emiliaarr...@gmail.com> wrote:
>>
>>> Hello Craig,
>>>
>>> Which Bacula version are you using?
>>>
>>> Best regards,
>>> Ana
>>>
>>> On Fri, Sep 11, 2015 at 4:14 PM, Craig Shiroma <shiroma.crai...@gmail.com> wrote:
>>>
>>>> My apologies. I hit the send button before entering a subject.
>>>>
>>>> On Fri, Sep 11, 2015 at 9:13 AM, Craig Shiroma <shiroma.crai...@gmail.com> wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>> I'm getting the following problem occasionally:
>>>>> 2015-09-10 23:47:24bacula-dir JobId 140080: Fatal error: JobId 139901
>>>>> already running. Duplicate job not allowed.
>>>>>
>>>>> Due to this type of error:
>>>>> 2015-09-10 23:47:22bacula-dir JobId 139901: Fatal error:
>>>>> sql_create.c:870 Fill File table Query failed: INSERT INTO File
>>>>> (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq) SELECT
>>>>> batch.FileIndex, batch.JobId, Path.PathId,
>>>>> Filename.FilenameId,batch.LStat, batch.MD5, batch.DeltaSeq FROM batch
>>>>> JOIN Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name =
>>>>> Filename.Name): ERR=Lock wait timeout exceeded; try restarting
>>>>> transaction
>>>>>
>>>>> This happens when a database lock has timed out on a backup and the
>>>>> job is rescheduled. For some reason, it seems a new job is starting up
>>>>> as soon as the error is detected. I posted about this issue earlier and
>>>>> someone mentioned it is happening because I configured Bacula to do
>>>>> that (or at least that's the impression I got from the post). Would
>>>>> anyone know which config would have the setting to start up a new job
>>>>> for the client backup when an error like a lock is detected? So far,
>>>>> I've only found settings for rescheduling, not restarting, such as
>>>>> those below:
>>>>>
>>>>> Reschedule Interval = 1 hour
>>>>> Reschedule Times = 3
>>>>> Cancel Lower Level Duplicates = yes
>>>>> Allow Duplicate Jobs = no
>>>>>
>>>>> Obviously, the backup is getting canceled because of the last two
>>>>> settings above. But what setting is causing a new job to be created
>>>>> when a lock timeout error is detected and the log says the job has
>>>>> been rescheduled for 3600 seconds later?
>>>>>
>>>>> I realize I may need to do some database fixing/tuning. But my
>>>>> immediate question is why a new job is being created when one has
>>>>> already been rescheduled.
>>>>>
>>>>> Regards,
>>>>> -craig
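
For reference, "Lock wait timeout exceeded; try restarting transaction" in the quoted log is the message MySQL's InnoDB engine returns when a statement waits for a row lock longer than innodb_lock_wait_timeout (50 seconds by default). Below is a minimal sketch of how the current timeout and the size of the File table could be inspected from a mysql client. It assumes the catalog tables use InnoDB and that the Bacula catalog is the currently selected database; neither is confirmed in the thread.

```sql
-- Show the current InnoDB lock wait timeout, in seconds (MySQL default: 50).
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';

-- Approximate row count and on-disk size of the File table.
-- For InnoDB the Rows value is an estimate, but reading it avoids a
-- full COUNT(*) scan over hundreds of millions of records.
SHOW TABLE STATUS LIKE 'File';

-- List the statements currently running, to see what else is active
-- (for example pruning) around the time the timeout occurs.
SHOW FULL PROCESSLIST;
```

These statements only report state. Whether to raise the timeout or tune other settings is a separate decision for the DBAs.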
------------------------------------------------------------------------------
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users