Hi Ana,

Thanks again for the help!

Yes, the database is large.  The File table has 370+ million records and
Bacula is pruning.  I'll see what we can do with mysqltuner.  I have a
feeling I'm occasionally getting the timeout error because there are so
many records.  If I can't find a cure for the situation, I'm thinking of
splitting up our backups using two catalog servers -- one for production
and one for test/dev.  Do you think this is wise?  Currently, our DBAs
don't support Postgres so it may not be an option.  Would Oracle be an
option instead?
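Before going that far, I'll also check how InnoDB is sized for the lock waits.  These are the statements I plan to run in the mysql client (the 120-second value is just an illustrative number I picked, not a recommendation):

```sql
-- Show the current lock wait timeout (the default is 50 seconds) and the
-- InnoDB buffer pool size, which mysqltuner also reports on.
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Raise the timeout on the running server (illustrative value only;
-- make it permanent in my.cnf if it turns out to help).
SET GLOBAL innodb_lock_wait_timeout = 120;
```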

I'll have to check on the "--enable-batch-insert" option.  Is there a
Bacula command to see what options Bacula was built with?  I did not set up
our installation and am new to Bacula.
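In the meantime, one rough check I may try is grepping the Director binary's strings for batch-insert symbols (a sketch; the fallback path below is a guess for our install, so adjust as needed):

```shell
# Rough check for batch-insert support when the source tree is gone.
# Assumption: bacula-dir is in PATH or at the fallback path below.
DIR_BIN="$(command -v bacula-dir || echo /opt/bacula/bin/bacula-dir)"
if [ -x "$DIR_BIN" ]; then
  # Batch inserts go through a temporary "batch" table, so related
  # symbols and queries tend to show up in the binary's strings.
  strings "$DIR_BIN" | grep -ci 'batch'
else
  echo "bacula-dir not found at $DIR_BIN; adjust the path"
fi
```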

Would you have any idea why a new duplicate job is starting when one was
rescheduled?  I looked over the configs I know about, but could not find an
option that starts a new job under any circumstances.  At least the job will
have two more chances to run if I can prevent the new duplicate job from
starting when the lock is detected.  I know that won't cure the database
problem, but I have a feeling the jobs might complete successfully on the
rescheduled tries (thinking the lock will be gone by then) until the DB can
be tuned.
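For reference, the duplicate-control directives I've found all live in the Job resource.  My current understanding looks like this (a sketch with a made-up job name, plus my guess that the reschedule directive itself is what spawns the second JobId):

```
Job {
  Name = "prod-backup"                 # hypothetical name
  ...
  Reschedule On Error = yes            # my guess: this creates the rescheduled copy
  Reschedule Interval = 1 hour
  Reschedule Times = 3
  Allow Duplicate Jobs = no            # makes the clash fatal for the new job
  Cancel Lower Level Duplicates = yes
  Cancel Queued Duplicates = yes       # drop a queued duplicate instead of failing it
}
```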

Warmest regards,
-craig

On Mon, Sep 14, 2015 at 4:54 PM, Ana Emília M. Arruda <
emiliaarr...@gmail.com> wrote:

> Hello Craig,
>
> You're welcome! I hope to give you some tips here. It seems you have a
> database tuning issue because of "Lock wait timeout exceeded; try
> restarting transaction". I'm not sure about your database size, but based
> on your JobIds, it seems large. You can try http://mysqltuner.com/ if
> you're using MySQL. If you have a really large database, you can think
> about migrating to PostgreSQL if tuning MySQL does not solve the problem.
>
> Have you built your Bacula with the "--enable-batch-insert" option? This
> is a good idea when dealing with a large number of files.
>
> Best regards,
> Ana
>
> On Mon, Sep 14, 2015 at 12:40 AM, Craig Shiroma <shiroma.crai...@gmail.com
> > wrote:
>
>> Hi Ana,
>>
>> I'm using 7.0.5.  Thanks for the help!
>>
>> -craig
>>
>> On Sat, Sep 12, 2015 at 2:10 PM, Ana Emília M. Arruda <
>> emiliaarr...@gmail.com> wrote:
>>
>>> Hello Craig,
>>>
>>> Which Bacula version are you using?
>>>
>>> Best regards,
>>> Ana
>>>
>>> On Fri, Sep 11, 2015 at 4:14 PM, Craig Shiroma <
>>> shiroma.crai...@gmail.com> wrote:
>>>
>>>> My apologies.  I hit the send button before entering a subject.
>>>>
>>>> On Fri, Sep 11, 2015 at 9:13 AM, Craig Shiroma <
>>>> shiroma.crai...@gmail.com> wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>> I'm getting the following problem occasionally:
>>>>> 2015-09-10 23:47:24 bacula-dir JobId 140080: Fatal error: JobId 139901
>>>>> already running. Duplicate job not allowed.
>>>>>
>>>>> Due to this type of error:
>>>>> 2015-09-10 23:47:22 bacula-dir JobId 139901: Fatal error:
>>>>> sql_create.c:870 Fill File table Query failed: INSERT INTO File
>>>>> (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq) SELECT
>>>>> batch.FileIndex, batch.JobId, Path.PathId, Filename.FilenameId,
>>>>> batch.LStat, batch.MD5, batch.DeltaSeq FROM batch JOIN Path ON
>>>>> (batch.Path = Path.Path) JOIN Filename ON (batch.Name = Filename.Name):
>>>>> ERR=Lock wait timeout exceeded; try restarting transaction
>>>>>
>>>>> This happens when a database lock has timed out on a backup and the
>>>>> job is rescheduled.  For some reason, it seems a new job is starting up as
>>>>> soon as the error is detected.  I posted about this issue earlier and
>>>>> someone mentioned it is happening because I configured Bacula to do that
>>>>> (or at least that's the impression I got from the post).  Would anyone 
>>>>> know
>>>>> which config would have the setting to start up a new job for the client
>>>>> backup when an error like a lock is detected?  So far, I've only found
>>>>> settings for rescheduling, not restarting, such as those below:
>>>>>
>>>>>     Reschedule Interval = 1 hour
>>>>>     Reschedule Times = 3
>>>>>     Cancel Lower Level Duplicates = yes
>>>>>     Allow Duplicate Jobs = no
>>>>>
>>>>> Obviously, the backup is getting canceled because of the last two
>>>>> settings above.  But what setting is causing a new job to be created
>>>>> when a lock timeout error is detected, given the message says the job
>>>>> has been rescheduled for 3600 seconds later?
>>>>>
>>>>> I realize it appears I may need to do some database fixing/tuning.
>>>>> But my immediate question is why a new job is being created when one
>>>>> has already been rescheduled.
>>>>>
>>>>> Regards,
>>>>> -craig
>>>>>
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> Bacula-users mailing list
>>>> Bacula-users@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>>>>
>>>>
>>>
>>
>