I upgraded our production Icinga2 box from 2.4.8 to 2.5.4 a few days
ago. Since then, whether because of the upgrade or something else, the
service has been getting stuck after a "service icinga2 reload". When
it gets stuck, the status shows:

[root@ec2- icinga101 conf.d]$ service icinga2 status
Icinga 2 status: Not running

When this happens, the process list (ps) looks like this:

root       643 32328  0 10:48 ?        00:00:00 /bin/sh /sbin/service icinga2 reload
root       650   643  0 10:48 ?        00:00:00 /bin/sh /media/ephemeral0/icinga2/lib/icinga2/safe-reload /media/ephemeral0/icinga2/etc/sysconfig/icinga2
icinga     701   650 99 10:48 ?        00:02:20 /media/ephemeral0/icinga2/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon --validate --color

None of the service checks show up in the process list. The Icinga
console shows "Backend Icinga2 is not running" under
"System > Monitoring Health".

The error.log is empty.
There are no logs in var/log/icinga2/crash/

The last entries in icinga2.log are:
[2016-11-11 08:53:59 -0500] information/IdoMysqlConnection: Query queue items: 0, query rate: 58.8833/s (3533/min 12360/5min 47871/15min);
[2016-11-11 08:54:11 -0500] information/Application: Received request to shut down.
[2016-11-11 08:54:11 -0500] information/Application: Shutting down...
[2016-11-11 08:54:11 -0500] information/CheckerComponent: Checker stopped.

***
We have a Salt state that runs "service icinga2 reload" every 10
minutes. When the reload succeeds, icinga2.log shows this:
[2016-11-11 05:07:48 -0500] information/Application: Received request to shut down.
[2016-11-11 05:07:48 -0500] information/Application: Shutting down...
[2016-11-11 05:07:48 -0500] information/CheckerComponent: Checker stopped.
[2016-11-11 05:07:53 -0500] information/ConfigItem: Committing config item(s).
[2016-11-11 05:07:53 -0500] information/ConfigItem: Activated all objects.
[2016-11-11 05:07:53 -0500] information/DbConnection: Resuming IDO connection: ido-mysql

But as shown above, a handful of times in the past few days the reload
has gotten stuck: the log stops after "Checker stopped." and the config
is never committed.
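
To make the difference between the two log patterns concrete, here is a
rough sketch of how a stuck reload could be spotted from icinga2.log.
It only assumes the log line format shown above. The function name
find_stuck_reloads is made up for illustration and is not part of
Icinga2.

```python
import re
from datetime import datetime

# Matches the timestamp and message of a log line in the format shown above,
# e.g. "[2016-11-11 05:07:48 -0500] information/Application: Shutting down..."
LOG_LINE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) [+-]\d{4}\] (.*)$"
)


def find_stuck_reloads(lines):
    """Return the timestamps of "Shutting down..." entries that are never
    followed by "Activated all objects." (hypothetical helper).

    Note: a reload that is still in progress when the log is read is
    reported as stuck too, so the caller should allow some grace time.
    """
    pending = None  # timestamp of a shutdown that has not completed yet
    stuck = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip wrapped continuation lines and anything else
        timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(2)
        if "Shutting down..." in message:
            if pending is not None:
                stuck.append(pending)  # previous shutdown never completed
            pending = timestamp
        elif "Activated all objects." in message:
            pending = None  # the reload finished normally
    if pending is not None:
        stuck.append(pending)
    return stuck
```

Fed the two excerpts from this mail, the 05:07:48 reload would count as
complete and the 08:54:11 one as stuck.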

I definitely need help getting this fixed. This is our production
monitoring box, and we have already migrated most of our important
checks from Nagios to Icinga2.

-- 
---
Michael Martinez
http://www.michael--martinez.com
_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users
