Does that include a job that requests regular intervention? If so, I guess I 
need to recompile Bacula before attempting to back up 50 terabytes of data onto 
LTO6 media. I’m fairly new to Bacula and I was planning to just change tapes 
every day while in the office, but if it’s just going to crash for no good 
reason after a week then it hardly seems worth bothering with. There should 
really be something in the documentation about this, since it’s kind of an ugly 
surprise to come across just when you think you’ve got everything working.

Any way to disable this misfeature entirely? I am quite certain that I’ll know 
when my backups aren’t working anymore (because I’ll stop getting emails 
telling me to change tapes). At a minimum, this really seems like the sort of 
“gotcha” that should be mentioned in the FAQ…

Yale
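
For anyone else unsure how to read the JCR dump quoted further down, here is a
minimal Python sketch (not part of Bacula) that pulls the job name, status and
start_time out of each record and works out how long each job had been running
when the Director crashed. The field names come from the dump itself. The
function names, the two-record SAMPLE and the 6-day threshold (taken from
Josh's description, not checked against the Bacula source) are assumptions
made purely for illustration.

```python
import re
from datetime import datetime, timedelta

QUOTE_PREFIX = re.compile(r"^(?:>\s?)+")   # e-mail quote markers such as ">> "
TIME_FORMAT = "%d-%b-%Y %H:%M"             # e.g. "11-Feb-2020 17:38" (English month names)
STAMP = r"(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2})"


def parse_jcr_dump(text):
    """Return one dict per JCR record found in a 'dump current JCRs' listing."""
    # Strip quote markers, then undo the mail client's hard line wrapping.
    lines = [QUOTE_PREFIX.sub("", line) for line in text.splitlines()]
    flat = " ".join(" ".join(lines).split())
    records = []
    # Each record in the dump begins with "threadid=".
    for chunk in flat.split("threadid=")[1:]:
        job_id = re.search(r"\bJobId=(\d+)", chunk)
        status = re.search(r"\bJobStatus=(\S+)", chunk)
        name = re.search(r"\bname=(\S+)", chunk)   # \b skips "db_name="
        start = re.search(r"\bstart_time=" + STAMP, chunk)
        if not (job_id and status and name and start):
            continue  # incomplete record, ignore it
        records.append({
            "job_id": int(job_id.group(1)),
            "status": status.group(1),
            "name": name.group(1),
            "start": datetime.strptime(start.group(1), TIME_FORMAT),
        })
    return records


def jobs_over_limit(records, as_of, limit=timedelta(days=6)):
    """Names of real jobs (JobId > 0) in status R that started more than `limit` before `as_of`."""
    return [r["name"] for r in records
            if r["job_id"] > 0 and r["status"] == "R" and as_of - r["start"] > limit]


# Two records copied (and shortened) from the dump in this thread.
SAMPLE = """\
>> threadid=0x7fb495897700 JobId=104686 JobStatus=R jcr=0x7fb48806aea8
>> name=job1.2020-02-11_17.38.56_13
>>         use_count=2 killable=1
>>         JobType=B JobLevel=F
>>         sched_time=11-Feb-2020 17:38 start_time=11-Feb-2020 17:38
>> threadid=0x7fb43e7fc700 JobId=105616 JobStatus=R jcr=0x55980a005fe8
>> name=job4.2020-02-21_21.30.01_16
>>         use_count=2 killable=1
>>         JobType=B JobLevel=F
>>         sched_time=21-Feb-2020 21:30 start_time=24-Feb-2020 23:36
"""

CRASH_TIME = datetime(2020, 2, 28, 9, 56)  # from the syslog lines in the thread

for rec in parse_jcr_dump(SAMPLE):
    print(rec["name"], rec["status"], CRASH_TIME - rec["start"])
```

Going by the start_time values in the full dump, job1, job2 and job3 had each
been running for well over six days at the time of the crash, while job4 had
been running for less than four. The dump only lists the jobs that were active;
it does not by itself say which thread received the signal.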

> On Mar 2, 2020, at 03:20, Josh Fisher <jfis...@pvct.com> wrote:
> 
> Bacula has a built-in watchdog that kills a job that runs for more than 6 
> days. That period can be extended at compile time, so you have to compile 
> your own binaries after a change to the source. I don't remember where in the 
> source, but this has come up before and should be searchable.
> 
> If you have already extended the watchdog timeout, a signal 11 is almost 
> always a software bug and the devs should be able to tell where in the code 
> this happened from the traceback. That said, Bacula running for a long time 
> using lots of pointers is also a decent test of memory, as well as I/O. 
> Hardware errors, anything that causes a bit flip in RAM, usually result in a 
> signal 11. But it is far more likely to be a software issue and you should 
> file a bug report.
> 
> 
>> On 3/1/2020 6:22 PM, Chaz Vidal wrote:
>> Greetings all,
>> Our Bacula system crashed on Friday with a segmentation violation.
>> 
>> The system has been attempting to do a full backup of over 130TB of data 
>> over the past few weeks, which we appear to have lost because of the 
>> crash.
>> 
>> Feb 28 09:56:31 <<servername>> bacula-dir[4211]: Bacula interrupted by 
>> signal 11: Segmentation violation
>> Feb 28 09:56:31 <<servername>> bacula-dir[4211]: Kaboom! bacula-dir, 
>> bacula-dir got signal 11 - Segmentation violation at 28-Feb-2020 09:56:31. 
>> Attempting traceback.
>> Feb 28 09:56:31 <<servername>> bacula-dir[4211]: Kaboom! exepath=/usr/sbin/
>> Feb 28 09:56:31 <<servername>> bacula-dir: Bacula interrupted by signal 11: 
>> Segmentation violation
>> Feb 28 09:56:31 <<servername>> bacula-dir[4211]: Calling: 
>> /usr/sbin/btraceback /usr/sbin/bacula-dir 4211 /var/lib/bacula
>> Feb 28 09:56:31 <<servername>> postfix/smtpd[59719]: connect from 
>> localhost[127.0.0.1]
>> Feb 28 09:56:31 <<servername>> postfix/smtpd[59719]: 71CC36008A: 
>> client=localhost[127.0.0.1]
>> Feb 28 09:56:31 <<servername>> postfix/cleanup[59722]: 71CC36008A: 
>> message-id=<20200227232631.71CC36008A@<<servername>>.company.com>
>> Feb 28 09:56:31 <<servername>> postfix/qmgr[14399]: 71CC36008A: 
>> from=<root@<<servername>>.company.com>, size=593, nrcpt=1 (queue active)
>> Feb 28 09:56:31 <<servername>> postfix/smtpd[59719]: disconnect from 
>> localhost[127.0.0.1] helo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
>> Feb 28 09:56:31 <<servername>> bacula-dir[4211]: It looks like the traceback 
>> worked...
>> Feb 28 09:56:31 <<servername>> bacula-dir[4211]: LockDump: 
>> /var/lib/bacula/bacula.4211.traceback
>> Feb 28 09:56:31 <<servername>> bacula-dir[4211]: bacula-dir: 
>> lockmgr.c:1221-0 lockmgr disabled
>> 
>> I do not know how to read a traceback file to understand what may have been 
>> going on. We are attempting to restart the backup, but unless we understand 
>> what happened, the crash may occur again.
>> 
>> We are running Bacula Version: 9.4.2
>> 
>> I would appreciate it if anyone can share any insight.
>> 
>> Attempt to dump current JCRs. njcrs=7
>> threadid=0x7fb497491f40 JobId=0 JobStatus=R jcr=0x55980a04a4f8 
>> name=*JobMonitor*.2020-02-11_15.29.48_01
>>         use_count=1 killable=0
>>         JobType=I JobLevel=
>>         sched_time=11-Feb-2020 15:29 start_time=11-Feb-2020 15:29
>>         end_time=01-Jan-1970 09:30 wait_time=01-Jan-1970 09:30
>>         db=(nil) db_batch=(nil) batch_started=0
>>         wstore=0x55980a01ff28 rstore=0x55980a01ff28 wjcr=(nil) 
>> client=0x55980a026128 reschedule_count=0 SD_msg_chan_started=0
>> threadid=0x7fb495897700 JobId=104686 JobStatus=R jcr=0x7fb48806aea8 
>> name=job1.2020-02-11_17.38.56_13
>>         use_count=2 killable=1
>>         JobType=B JobLevel=F
>>         sched_time=11-Feb-2020 17:38 start_time=11-Feb-2020 17:38
>>         end_time=01-Jan-1970 09:30 wait_time=21-Feb-2020 16:50
>>         db=0x7fb4880059a8 db_batch=(nil) batch_started=0
>>         wstore=0x7fb48803fc18 rstore=(nil) wjcr=(nil) client=0x7fb4880481a8 
>> reschedule_count=0 SD_msg_chan_started=1
>> BDB=0x7fb4880059a8 db_name=bacula db_user=bacula connected=true
>>         cmd="UPDATE Media SET InChanger=0, Slot=0 WHERE Slot=25 AND 
>> StorageId IN (10) AND MediaId!=794" changes=1814
>>         RWLOCK=0x7fb4880059c0 w_active=0 w_wait=0
>> threadid=0x7fb47e7fc700 JobId=104687 JobStatus=R jcr=0x7fb488068978 
>> name=job2.2020-02-11_17.40.43_14
>>         use_count=2 killable=1
>>         JobType=B JobLevel=F
>>         sched_time=11-Feb-2020 17:40 start_time=11-Feb-2020 17:40
>>         end_time=01-Jan-1970 09:30 wait_time=01-Jan-1970 09:30
>>         db=0x7fb4880059a8 db_batch=(nil) batch_started=0
>>         wstore=0x7fb48803fc18 rstore=(nil) wjcr=(nil) client=0x7fb4880481a8 
>> reschedule_count=0 SD_msg_chan_started=1
>> BDB=0x7fb4880059a8 db_name=bacula db_user=bacula connected=true
>>         cmd="UPDATE Media SET InChanger=0, Slot=0 WHERE Slot=25 AND 
>> StorageId IN (10) AND MediaId!=794" changes=1814
>>         RWLOCK=0x7fb4880059c0 w_active=0 w_wait=0
>> threadid=0x7fb43f7fe700 JobId=104928 JobStatus=R jcr=0x7fb44805fa88 
>> name=job3.2020-02-14_15.47.06_47
>>         use_count=2 killable=1
>>         JobType=B JobLevel=F
>>         sched_time=14-Feb-2020 15:46 start_time=14-Feb-2020 15:47
>>         end_time=01-Jan-1970 09:30 wait_time=27-Feb-2020 22:21
>>         db=0x7fb4880059a8 db_batch=(nil) batch_started=0
>>         wstore=0x7fb448034678 rstore=(nil) wjcr=(nil) client=0x7fb44803c148 
>> reschedule_count=0 SD_msg_chan_started=1
>> BDB=0x7fb4880059a8 db_name=bacula db_user=bacula connected=true
>>         cmd="UPDATE Media SET InChanger=0, Slot=0 WHERE Slot=25 AND 
>> StorageId IN (10) AND MediaId!=794" changes=1814
>>         RWLOCK=0x7fb4880059c0 w_active=0 w_wait=0
>> threadid=0x7fb43e7fc700 JobId=105616 JobStatus=R jcr=0x55980a005fe8 
>> name=job4.2020-02-21_21.30.01_16
>>         use_count=2 killable=1
>>         JobType=B JobLevel=F
>>         sched_time=21-Feb-2020 21:30 start_time=24-Feb-2020 23:36
>>         end_time=01-Jan-1970 09:30 wait_time=01-Jan-1970 09:30
>>         db=0x7fb4880059a8 db_batch=(nil) batch_started=0
>>         wstore=0x7fb448033b78 rstore=(nil) wjcr=(nil) client=0x7fb44803e9e8 
>> reschedule_count=0 SD_msg_chan_started=1
>> BDB=0x7fb4880059a8 db_name=bacula db_user=bacula connected=true
>>         cmd="UPDATE Media SET InChanger=0, Slot=0 WHERE Slot=25 AND 
>> StorageId IN (10) AND MediaId!=794" changes=1814
>>         RWLOCK=0x7fb4880059c0 w_active=0 w_wait=0
>> threadid=0x7fb43effd700 JobId=0 JobStatus=C jcr=0x7fb47800b2e8 
>> name=-Console-.2020-02-27_08.39.19_09
>>         use_count=1 killable=0
>>         JobType=U JobLevel=F
>>         sched_time=27-Feb-2020 08:39 start_time=27-Feb-2020 08:39
>>         end_time=01-Jan-1970 09:30 wait_time=01-Jan-1970 09:30
>>         db=0x7fb4880059a8 db_batch=(nil) batch_started=0
>>         wstore=0x7fb448035cd8 rstore=0x7fb448034c18 wjcr=(nil) 
>> client=0x7fb44803ae18 reschedule_count=0 SD_msg_chan_started=0
>> BDB=0x7fb4880059a8 db_name=bacula db_user=bacula connected=true
>>         cmd="UPDATE Media SET InChanger=0, Slot=0 WHERE Slot=25 AND 
>> StorageId IN (10) AND MediaId!=794" changes=1814
>>         RWLOCK=0x7fb4880059c0 w_active=0 w_wait=0
>> threadid=0x7fb45f7fe700 JobId=0 JobStatus=C jcr=0x7fb3f400e148 
>> name=-Console-.2020-02-28_09.00.13_35
>>         use_count=1 killable=0
>>         JobType=U JobLevel=F
>>         sched_time=28-Feb-2020 09:00 start_time=28-Feb-2020 09:00
>>         end_time=01-Jan-1970 09:30 wait_time=01-Jan-1970 09:30
>>         db=(nil) db_batch=(nil) batch_started=0
>>         wstore=0x7fb448035cd8 rstore=0x7fb448034c18 wjcr=(nil) 
>> client=0x7fb44803ae18 reschedule_count=0 SD_msg_chan_started=0
>> List plugins. Hook count=0
>> 
>> 
>> 
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
> 
> 


