Hi Kern,

Thank you for the info! We're using MySQL 5.6 Percona Server, Release 68.0, Revision 656.
Would this setting cause the problem?

innodb_lock_wait_timeout = 100

Is it too high, too low, or does it have no bearing on the problem?

Thanks again,
-craig

On Thu, Aug 6, 2015 at 9:26 AM, Kern Sibbald <k...@sibbald.com> wrote:

> On 06.08.2015 18:46, Bryn Hughes wrote:
>
> I think what Kern is getting at is that your database is what threw the
> error, not Bacula. Whatever DB you are using is what is having the issue.
>
>
> Yes. That is exactly what I was implying.
>
> The rest of this is directed to Craig:
> If you are using MariaDB (I have no indication that you are), please be
> aware that it may be a very good database, maybe even better than MySQL,
> but Bacula is built and tested against MySQL, and if you use binaries that
> were built for MySQL, you could run into problems by using MariaDB. Even
> if your binaries were explicitly built with MariaDB, it may not be
> compatible with the way Bacula works. Bacula has a tendency to push
> databases to the extreme, and it works well with MySQL and PostgreSQL, but
> possibly not with other databases. I bring up MariaDB because it has been
> mentioned in another posting to this list.
>
> I would be very surprised if your problem has anything to do with Accurate
> -- the database routines know nothing about accurate and none of the data
> is different. It is more likely due to the VM environment or to some build
> or version problem with MySQL (or MariaDB).
>
> Best regards,
> Kern
>
>
> Bryn
>
> On 2015-08-06 09:11 AM, Craig Shiroma wrote:
>
> Hi Kern,
>
> Thank you very much for the reply! Would you have any suggestions on what
> may be causing this problem or how I can debug it? Obviously, I'm
> encountering deadlocks when accurate backup runs on some of our hosts and
> we want to use accurate backup on all of our hosts if possible.
>
> Warmest regards,
> -craig
>
> On Thu, Aug 6, 2015 at 12:11 AM, Kern Sibbald <k...@sibbald.com> wrote:
>
>> On 06.08.2015 10:15, Craig Shiroma wrote:
>>
>> Hello again,
>>
>> I just thought I'd update this post with more information in hopes of
>> getting some explanation for the deadlocks.
>>
>> I ran with Accurate backup on our test VMs (RHEL) for a couple of days
>> and got the same errors on some VMs that were running accurate and some
>> that were not. These hosts were running concurrently. I would say 90% of
>> the hosts that were configured to use Accurate finished successfully.
>> However, there were a few that failed with the deadlock error -- some that
>> were configured to use accurate and some that were not configured to use
>> accurate. Also, on all of these, a second job started for each of the
>> affected hosts right after Bacula detected the deadlock even though it said
>> a reschedule would happen 3600 seconds later (the 3600 seconds is correct).
>>
>> Tonight, I disabled accurate on all hosts and the deadlocks did not
>> happen. No errors were detected and all the backups finished successfully.
>>
>> Some questions...
>> 1. Can I back up multiple hosts concurrently with some hosts configured
>> to use accurate and some configured not to use accurate? Or, is it an all
>> or none thing, meaning all hosts that run concurrently must either be using
>> accurate backup or not using accurate backup (cannot mix the two)?
>>
>> 2. It seems like the hosts that get out of the starting gate first are
>> the ones affected. I am configured to run 50 jobs concurrently. Again, no
>> problems with accurate turned off on all hosts for months now.
>>
>> 3. Why is Bacula spinning off a new job right away after it detects the
>> deadlock for each affected job instead of waiting until the rescheduled job
>> runs?
>> I verified that there were no duplicate jobs in the queue before the
>> backups started running, no jobs were running before the start of the
>> backups, and I did not start any of these backups manually to cause a
>> second job to appear.
>>
>>
>> Bacula is not aware of any SQL internal deadlocks.
>>
>>
>> From the INNODB Monitor output:
>>
>> TRANSACTION:
>> TRANSACTION 208788977, ACTIVE 1 sec setting auto-inc lock
>> mysql tables in use 4, locked 4
>> 9 lock struct(s), heap size 1184, 5 row lock(s)
>> MySQL thread id 50808, OS thread handle 0x7f8f2c3b4700, query id 29558637
>> <host> 192.168.10.99 bacula Sending data
>> INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5,
>> DeltaSeq) SELECT batch.FileIndex, batch.JobId, Path.PathId,
>> Filename.FilenameId,batch.LStat, batch.MD5, batch.DeltaSeq FROM batch JOIN
>> Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name =
>> Filename.Name)
>> WAITING FOR THIS LOCK TO BE GRANTED:
>> TABLE LOCK table `bacula`.`File` trx id 208788977 lock mode AUTO-INC
>> waiting
>> WE ROLL BACK TRANSACTION (2)
>>
>> I am running Bacula 7.0.5 on RHEL 6.6 x64 with Director, Storage and
>> Catalog running on separate RHEL 6.6 hosts. Our clients are RHEL 6's, 5's
>> and Windows Servers 2008 and 2012R2.
>>
>> Any help would be much appreciated.
>>
>> Warmest regards,
>> -craig
>>
>> On Tue, Aug 4, 2015 at 1:56 PM, Craig Shiroma <shiroma.crai...@gmail.com>
>> wrote:
>>
>>> BTW, I suppose there could've been two jobs for the host(s) in
>>> scheduling queue. If this was the case, is there a way to find out after
>>> the fact? If this did actually happen, what could cause duplicate jobs to
>>> be scheduled on the same day at the same time? I know no one manually ran
>>> the jobs in question. Again, this only was a problem for a few of the jobs
>>> that ran last night, not all of them and some to do accurate backup and
>>> some not.
>>>
>>> Regards,
>>> -craig
>>>
>>> On Tue, Aug 4, 2015 at 9:27 AM, Craig Shiroma <shiroma.crai...@gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I had a few backups fail last night with the following error:
>>>>
>>>> 2015-08-03 18:02:46 bacula-dir JobId 123984: b INTO File (FileIndex,
>>>> JobId, PathId, FilenameId, LStat, MD5, DeltaSeq) SELECT batch.FileIndex,
>>>> batch.JobId, Path.PathId, Filename.FilenameId,batch.LStat, batch.MD5,
>>>> batch.DeltaSeq FROM batch JOIN Path ON (batch.Path = Path.Path) JOIN
>>>> Filename ON (batch.Name = Filename.Name): ERR=Deadlock found when trying to
>>>> get lock; try restarting transaction
>>>>
>>>> The only thing I did yesterday was switch a bunch of backups to use
>>>> Accurate backup and restart bacula-dir and bacula-sd after that. However,
>>>> the above problem also occurred on some hosts that were not set to use
>>>> Accurate backup. From the log, it seems like two jobs for this host were
>>>> scheduled to run at 18:00 because the second job started and found a
>>>> duplicate job (job 123984) and canceled the backup. I know there were no
>>>> jobs running before 18:00 so 123984 was not an old job still running. Same
>>>> with the other jobs that were canceled because of the above situation.
>>>>
>>>> Anyway, does anyone have an idea what would cause this, especially how
>>>> the second job got shot into the system? After the deadlock error, Bacula
>>>> said it would reschedule the job. However, the second job started right
>>>> after the deadlock error instead of one hour later, which makes me think
>>>> that there were two jobs for this host scheduled to run at 18:00.
>>>>
>>>> Thank you in advance,
>>>> -craig
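
A note on the error text quoted in this thread: "Deadlock found when trying to get lock; try restarting transaction" is MySQL error 1213, which InnoDB raises when it detects a deadlock and rolls one of the transactions back. As far as the MySQL documentation describes it, innodb_lock_wait_timeout covers a different case: how long a transaction waits for a row lock before failing with error 1205 ("Lock wait timeout exceeded; try restarting transaction"). The sketch below only illustrates what "try restarting transaction" can mean for client code. It says nothing about how Bacula itself reacts to the error, and DatabaseError, run_with_retry and the retry and back-off parameters are hypothetical names made up for this example, not part of Bacula or of any MySQL driver.

```python
import random
import time

# MySQL server error codes (values from the MySQL error reference).
ER_LOCK_WAIT_TIMEOUT = 1205  # "Lock wait timeout exceeded; try restarting transaction"
ER_LOCK_DEADLOCK = 1213      # "Deadlock found when trying to get lock; try restarting transaction"


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a MySQL error code."""

    def __init__(self, errno, message):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call transaction() and restart it if it fails with error 1213 or 1205.

    transaction is assumed to open, run and commit one complete transaction,
    so that calling it again is safe after the server has rolled it back.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            retryable = exc.errno in (ER_LOCK_DEADLOCK, ER_LOCK_WAIT_TIMEOUT)
            if not retryable or attempt == max_attempts:
                raise  # unrelated error, or no attempts left
            # Back off with jitter so competing writers do not collide in lockstep.
            sleep(base_delay * attempt * (1 + random.random()))


if __name__ == "__main__":
    attempts = []

    def flaky_insert():
        # Simulated transaction: picked as the deadlock victim twice, then succeeds.
        attempts.append(1)
        if len(attempts) < 3:
            raise DatabaseError(ER_LOCK_DEADLOCK, "Deadlock found when trying to get lock")
        return "committed"

    print(run_with_retry(flaky_insert, sleep=lambda seconds: None))
```

The back-off is there because several writers that deadlocked once (for example, many concurrent jobs inserting into the same table) are likely to collide again if they all restart at the same instant.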
------------------------------------------------------------------------------
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users