> On Jan 18, 2019, at 11:24 AM, Josh Fisher <jfis...@pvct.com> wrote:
>
> On 1/17/2019 12:17 PM, Dan Langille wrote:
>
>> I was noticing this in my /var/log/messages:
>>
>> Jan 17 17:04:00 slocum kernel: pid 52623 (check_bacula), uid 181: exited on signal 11
>> Jan 17 17:04:21 slocum kernel: pid 53805 (check_bacula), uid 181: exited on signal 11
>>
>> I tracked it down to a host which was not up:
>>
>> $ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted]'
>> Segmentation fault
>>
>> real 1m15.101s
>> user 0m0.006s
>> sys 0m0.000s
>>
>>
>> Could someone else please try to replicate this situation for me please?
>>
>> This check is being run on FreeBSD 12.0-RELEASE-p2 with check_bacula from
>> Bacula 9.2.2
>>
>> Thank you.
>>
> Have you tried with debug level increased, say using -d7 in the command line?
> Maybe that would give a clue as to what it does prior to the segfault.
>

This was interesting:

$ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted]' -d7
Segmentation fault

real 1m15.137s
user 0m0.000s
sys 0m0.004s

Note the time, similar to the first test. Let's up the debug level

$ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted]' -d77
check_bacula: bsockcore.c:384-0 Could not connect to server File daemon tape02:9102. ERR=Operation timed out
check_bacula: bsockcore.c:197-0 Unable to connect to File daemon on tape02:9102. ERR=Operation timed out
Segmentation fault

real 1m15.024s
user 0m0.000s
sys 0m0.004s

Ok debugging info. Let's bump up:

$ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted]' -d777
check_bacula: bsockcore.c:299-0 Current 10.55.0.110:9102 All 10.55.0.110:9102
check_bacula: bsockcore.c:384-0 Could not connect to server File daemon tape02:9102. ERR=Host is down
check_bacula: bsockcore.c:197-0 Unable to connect to File daemon on tape02:9102. ERR=Host is down
Segmentation fault

real 0m0.013s
user 0m0.000s
sys 0m0.004s

What: zero time?

Further tests with -d777, -d77, and -d7 all finished in near-zero time.

I tried again without -d, and it failed in 75 seconds, like the others.

Try again with -d7777 (a larger value than previously tried), near-zero time again.

Then I waited and tried again.

[dan@webserver:/usr/local/libexec/nagios] $ time ./check_bacula -H tape02 -D fd -M nagios-mon -K '[redacted' -d7777
check_bacula: bsockcore.c:299-0 Current 10.55.0.110:9102 All 10.55.0.110:9102
check_bacula: bsockcore.c:384-0 Could not connect to server File daemon tape02:9102. ERR=Operation timed out
check_bacula: bsockcore.c:197-0 Unable to connect to File daemon on tape02:9102. ERR=Operation timed out
check_bacula: watchdog.c:82-0 Initialising NicB-hacked watchdog thread
check_bacula: watchdog.c:197-0 Registered watchdog 800c38098, interval 300 one shot
check_bacula: watchdog.c:254-0 NicB-reworked watchdog thread entered
check_bacula: watchdog.c:296-0 pthread_cond_timedwait 60
check_bacula: btimers.c:177-0 Start bsock timer 800c135e8 tid=800c15000 for 300 secs at 1547849285
check_bacula: btimers.c:212-0 Stop bsock timer 800c135e8 tid=800c15000 at 1547849285.
check_bacula: watchdog.c:217-0 Unregistered watchdog 800c38098
check_bacula: watchdog.c:296-0 pthread_cond_timedwait 60
Segmentation fault

real 1m15.035s
user 0m0.000s
sys 0m0.004s

-- 
Dan Langille - BSDCan / PGCon
d...@langille.org
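
The runs above differ mainly in elapsed time (about 75 seconds versus near zero) while always ending in signal 11. A minimal sketch for recording both facts per run, assuming a POSIX shell; time_check is a hypothetical helper, not part of Bacula or the Nagios plugins:

```shell
#!/bin/sh
# time_check (hypothetical helper): run the given command with its output
# discarded, then print its exit status and wall-clock time in whole seconds.
# In sh, bash, and dash a status above 128 means the command was killed by
# a signal, and the signal number is status - 128 (139 -> 11, SIGSEGV).
time_check() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    if [ "$status" -gt 128 ]; then
        echo "status=$status signal=$((status - 128)) elapsed=$((end - start))s"
    else
        echo "status=$status elapsed=$((end - start))s"
    fi
}
```

For example, after sourcing the function, time_check ./check_bacula -H tape02 -D fd -M nagios-mon -K "$KEY" -d77 would print one summary line for that run, where KEY is an assumed variable holding the redacted password. Repeating it at intervals makes the fast and slow failures easy to line up.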
_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users