Hi,

My /opt/omd/sites/.../var/rrdcached directory is growing very fast.
At the moment it contains 151 files with a total of ~9GB.
Currently I am running version 0.56.
The problem appears to have existed since the upgrade to 0.52.
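
For anyone who wants to compare numbers, the file count and total size can be gathered with something like the following. This is only a minimal sketch: the function name summarize_dir is made up for illustration, and the directory to inspect has to be supplied by the caller.

```python
import os


def summarize_dir(path):
    """Return (file_count, total_bytes, entries) for all files below path.

    entries is a list of (size_in_bytes, full_path), largest file first.
    """
    entries = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                entries.append((os.path.getsize(full), full))
            except OSError:
                # The file disappeared between listing and stat; skip it.
                continue
    entries.sort(reverse=True)
    total = sum(size for size, _ in entries)
    return len(entries), total, entries
```

Calling summarize_dir on the rrdcached directory and printing the first few entries shows whether a handful of large files or many small ones account for the growth.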

Last week I tried to find the source of the problem. I found that there were 
problems because RRD_STORAGE_TYPE had been changed to MULTIPLE. After spending 
some hours trying to convert the old RRD files I gave up and deleted everything 
inside var/pnp4nagios/perfdata/, losing the whole performance data history.

Now disk space is critical again and I have no idea what the problem could 
be!

We are monitoring about 4000 services.

The var/pnp4nagios/log/perfdata.log shows nothing but timeouts:

#####
...
2012-10-22 16:25:29 [20877] [1] process_perfdata.pl-0.6.19 starting in BULK Mode called by NPCD
2012-10-22 16:25:29 [20877] [1] Found Performance Data for server1 / _HOST_ (rta=0.241ms;200.000;500.000;0; pl=0%;40;80;; rtmax=0.298ms;;;; rtmin=0.198ms;;;;)
2012-10-22 16:25:29 [20879] [1] process_perfdata.pl-0.6.19 starting in BULK Mode called by NPCD
2012-10-22 16:25:29 [20879] [1] Found Performance Data for server2 / CPU_load (load1=8.13;20;40;0; load5=8.8;20;40;0; load15=9.12;20;40;0;)
2012-10-22 16:25:44 [20877] [0] *** TIMEOUT: Timeout after 15 secs. ***
2012-10-22 16:25:44 [20877] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-10-22 16:25:44 [20877] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2012-10-22 16:25:44 [20877] [0] *** TIMEOUT: /omd/sites/emerion/var/pnp4nagios/spool//perfdata.1350915913-PID-20877 deleted
2012-10-22 16:25:44 [20877] [0] *** Timeout while processing Host: "server1" Service: "_HOST_"
2012-10-22 16:25:44 [20877] [0] *** process_perfdata.pl terminated on signal ALRM
...
#####
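
To see whether the timeouts cluster on particular hosts or services, the log could be tallied roughly as follows. This is a minimal sketch, not part of PNP4Nagios: the function name count_timeouts is made up, and it assumes that each log entry occupies a single line in the actual file and that the "Timeout while processing" message always has the form shown in the excerpt.

```python
import re
from collections import Counter

# Matches the summary line process_perfdata.pl writes when it gives up,
# as seen in the excerpt:
#   *** Timeout while processing Host: "server1" Service: "_HOST_"
TIMEOUT_RE = re.compile(
    r'\*\*\* Timeout while processing Host: "(?P<host>[^"]*)" '
    r'Service: "(?P<service>[^"]*)"'
)


def count_timeouts(lines):
    """Count timeout entries per (host, service) pair."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("host"), match.group("service"))] += 1
    return counts
```

Passing the open var/pnp4nagios/log/perfdata.log file object to count_timeouts and calling most_common() on the result lists the host and service pairs that time out most often.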

Can anyone tell me where to look for the root cause of the problem?

One thing I do know is that the server sometimes has a very high load, and we 
are planning to move some services away from this machine. But even when I stop 
some resource-hungry services, only timeouts show up in perfdata.log.

Best regards,

Alex