> On Jan 18, 2019, at 11:24 AM, Josh Fisher <jfis...@pvct.com> wrote:
> 
> On 1/17/2019 12:17 PM, Dan Langille wrote:
> 
>> I was noticing this in my /var/log/messages:
>> 
>> Jan 17 17:04:00 slocum kernel: pid 52623 (check_bacula), uid 181: exited on 
>> signal 11
>> Jan 17 17:04:21 slocum kernel: pid 53805 (check_bacula), uid 181: exited on 
>> signal 11
>> 
>> I tracked it down to a host which was not up:
>> 
>> $ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted]'
>> Segmentation fault
>> 
>> real 1m15.101s
>> user 0m0.006s
>> sys  0m0.000s
>> 
>> 
>> Could someone else please try to replicate this situation for me?
>> 
>> This check is being run on FreeBSD 12.0-RELEASE-p2 with check_bacula from 
>> Bacula 9.2.2
>> 
>> Thank you.
>> 
> Have you tried with debug level increased, say using -d7 in the command line? 
> Maybe that would give a clue as to what it does prior to the segfault.
> 
This was interesting:


$ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted]' -d7
Segmentation fault

real    1m15.137s
user    0m0.000s
sys     0m0.004s

Note the time, similar to the first test.

Let's up the debug level:

$ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted]' -d77
check_bacula: bsockcore.c:384-0 Could not connect to server File daemon 
tape02:9102. ERR=Operation timed out
check_bacula: bsockcore.c:197-0 Unable to connect to File daemon on 
tape02:9102. ERR=Operation timed out
Segmentation fault

real    1m15.024s
user    0m0.000s
sys     0m0.004s


OK, that produced some debugging info.

Let's bump up:

$ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted]' -d777
check_bacula: bsockcore.c:299-0 Current 10.55.0.110:9102 All 10.55.0.110:9102
check_bacula: bsockcore.c:384-0 Could not connect to server File daemon 
tape02:9102. ERR=Host is down
check_bacula: bsockcore.c:197-0 Unable to connect to File daemon on 
tape02:9102. ERR=Host is down
Segmentation fault

real    0m0.013s
user    0m0.000s
sys     0m0.004s


What? Near-zero time?

Further tests with -d777, -d77, and -d7 all finished in near-zero time.  I 
tried again without -d, and it failed in 75 seconds, like the earlier runs.

I tried again with -d7777 (a larger value than previously tried): near-zero 
time again.
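
To see whether the 75 second versus near-zero difference comes from the TCP 
connect itself rather than from check_bacula, a small stand-alone probe might 
help. This is only a sketch, not part of Bacula: probe_connect is a 
hypothetical helper name, and the 80 second default timeout is an assumption 
chosen to sit just above the 75 seconds seen here. The host and port used in 
the note below are the ones from the debug output.

```python
import errno
import socket
import time


def probe_connect(host, port, timeout=80.0):
    """Try one TCP connect; return (elapsed_seconds, error_name_or_None).

    Hypothetical helper for illustration, not part of Bacula.
    """
    start = time.monotonic()
    err = None
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:
        # Map the errno to its symbolic name (e.g. ETIMEDOUT, EHOSTDOWN).
        # Some errors carry no usable errno; fall back to the exception type.
        err = errno.errorcode.get(exc.errno, type(exc).__name__)
    return time.monotonic() - start, err
```

Calling probe_connect("tape02", 9102) a few times in a row, and again after a 
pause, should show whether the operating system alone alternates between a 
slow timeout and an immediate "Host is down" style failure.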

Then I waited and tried again.

[dan@webserver:/usr/local/libexec/nagios] $ time ./check_bacula -H tape02 -D fd 
-M nagios-mon -K '[redacted]' -d7777
check_bacula: bsockcore.c:299-0 Current 10.55.0.110:9102 All 10.55.0.110:9102
check_bacula: bsockcore.c:384-0 Could not connect to server File daemon 
tape02:9102. ERR=Operation timed out
check_bacula: bsockcore.c:197-0 Unable to connect to File daemon on 
tape02:9102. ERR=Operation timed out
check_bacula: watchdog.c:82-0 Initialising NicB-hacked watchdog thread
check_bacula: watchdog.c:197-0 Registered watchdog 800c38098, interval 300 one 
shot
check_bacula: watchdog.c:254-0 NicB-reworked watchdog thread entered
check_bacula: watchdog.c:296-0 pthread_cond_timedwait 60
check_bacula: btimers.c:177-0 Start bsock timer 800c135e8 tid=800c15000 for 300 
secs at 1547849285
check_bacula: btimers.c:212-0 Stop bsock timer 800c135e8 tid=800c15000 at 
1547849285.
check_bacula: watchdog.c:217-0 Unregistered watchdog 800c38098
check_bacula: watchdog.c:296-0 pthread_cond_timedwait 60
Segmentation fault

real    1m15.035s
user    0m0.000s
sys     0m0.004s
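
Until the segfault itself is tracked down, a wrapper could at least turn the 
crash into an explicit result instead of a bare "exited on signal 11" in the 
kernel log. A minimal sketch, assuming Python 3 on the monitoring host: 
run_check is a hypothetical helper, not part of Bacula or Nagios. It relies on 
subprocess reporting death by signal as a negative return code on POSIX 
systems, and on the usual Nagios plugin convention that exit status 3 means 
UNKNOWN. The 120 second timeout is an assumption.

```python
import signal
import subprocess

NAGIOS_UNKNOWN = 3  # conventional Nagios plugin exit status for UNKNOWN


def run_check(argv, timeout=120):
    """Run a plugin command; return (exit_status, message).

    Hypothetical wrapper for illustration, not part of Bacula or Nagios.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        return NAGIOS_UNKNOWN, f"UNKNOWN: {argv[0]} did not finish in {timeout}s"
    if proc.returncode < 0:
        # On POSIX, return code -N means the child was killed by signal N.
        signum = -proc.returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return NAGIOS_UNKNOWN, f"UNKNOWN: {argv[0]} killed by {name}"
    # Normal exit: pass the plugin's own status and output through.
    return proc.returncode, proc.stdout.strip()
```

For the failing case above, this would be called as 
run_check(["./check_bacula", "-H", "tape02", "-D", "fd", "-M", "nagios-mon", 
"-K", "[redacted]"]).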


--
Dan Langille - BSDCan / PGCon
d...@langille.org




_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
