Hi, I'm collecting data every 120 seconds and storing it in an RRD. My heartbeats are set to 240 seconds. After a few hours, my AVERAGE RRA starts returning unknowns for a time period for which it had previously returned valid data. My other RRAs (MAX, LAST, etc.) return valid data for the same time period.
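A comparison like this can be automated rather than eyeballed. The following is a minimal sketch, assuming the usual `rrdtool fetch` text output (a header line of data source names, then one `timestamp: value value ...` row per step). The function names `parse_fetch` and `unknown_only_in_first` are hypothetical helpers, not part of RRDtool, and the sample rows are a few lines taken from the fetch output in this post.

```python
import math


def parse_fetch(text):
    """Parse `rrdtool fetch` text output into {timestamp: [values]}.

    Assumes rows look like "1333311960: nan nan 3.0e+00 6.0e+00".
    Lines without a numeric timestamp before the colon (the header
    with data source names, blank lines) are skipped.
    """
    rows = {}
    for line in text.splitlines():
        ts, sep, rest = line.partition(":")
        ts = ts.strip()
        if not sep or not ts.isdigit():
            continue
        rows[int(ts)] = [float(v) for v in rest.split()]
    return rows


def unknown_only_in_first(first, second):
    """Return timestamps where `first` has an unknown (NaN) value
    but `second` has only known values for the same timestamp."""
    result = []
    for ts in sorted(first):
        if ts not in second:
            continue
        first_has_nan = any(math.isnan(v) for v in first[ts])
        second_has_nan = any(math.isnan(v) for v in second[ts])
        if first_has_nan and not second_has_nan:
            result.append(ts)
    return result


# Sample rows copied from the AVERAGE and LAST output in this post.
avg_text = """\
1333311960: nan nan nan nan
1333312080: nan nan 3.0034230333e+00 6.0000000000e+00
1333312200: 1.3400000000e+01 2.7000000000e+01 3.9933606000e+00 6.0000000000e+00
"""
last_text = """\
1333311960: 1.3400839001e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
1333312080: 1.3400000000e+01 2.7000000000e+01 3.0034230333e+00 6.0000000000e+00
1333312200: 1.3400000000e+01 2.7000000000e+01 3.9933606000e+00 6.0000000000e+00
"""

gaps = unknown_only_in_first(parse_fetch(avg_text), parse_fetch(last_text))
print(gaps)  # prints [1333311960, 1333312080]
```

Feeding it the full output of the two fetch commands, saved to files or captured from the command line, would list every step where AVERAGE is unknown while LAST is not, which makes it easier to see when a previously valid range turns unknown.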
I first noticed this as gaps in my graphs, and started monitoring them regularly as a result, suspecting my data collection scripts were returning no data during those time periods. Well, I've been monitoring them and the gap I see right now for a 2-hour period 16 hours ago did not exist until a few hours ago! It's as if the RRD file becomes corrupt after a while, and old data that was previously valid starts coming out as unknown.

Since noticing this I started logging my rrdupdate commands to a text file to see what data is entering the RRD at what times. The data entries are all valid, and never outside of a 120 second window by more than a second.

Here's some rrdfetch output:

$ rrdtool fetch health.rrd AVERAGE -s -52000 -e -50000
1333311120: nan nan nan nan
1333311240: nan nan nan nan
1333311360: nan nan nan nan
1333311480: nan nan nan nan
1333311600: nan nan nan nan
1333311720: nan nan nan nan
1333311840: nan nan nan nan
1333311960: nan nan nan nan
1333312080: nan nan 3.0034230333e+00 6.0000000000e+00
1333312200: 1.3400000000e+01 2.7000000000e+01 3.9933606000e+00 6.0000000000e+00
1333312320: 1.3400000000e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
1333312440: 1.3400000000e+01 2.7000000000e+01 3.0045041083e+00 6.0000000000e+00
1333312560: 1.3400000000e+01 2.7000000000e+01 3.9936468250e+00 6.0000000000e+00
1333312680: 1.3400000000e+01 2.7000000000e+01 4.9911438667e+00 6.0000000000e+00
1333312800: 1.3400000000e+01 2.7000000000e+01 5.0000000000e+00 6.0000000000e+00
1333312920: 1.3400000000e+01 2.7000000000e+01 3.0124310667e+00 6.0000000000e+00
1333313040: 1.3400000000e+01 2.7000000000e+01 3.0000000000e+00 6.0000000000e+00
1333313160: 1.3400000000e+01 2.7000000000e+01 3.9892649250e+00 6.0000000000e+00

$ rrdtool fetch health.rrd LAST -s -52000 -e -50000
1333311120: 1.3400000000e+01 2.7000000000e+01 3.0101470417e+00 6.0000000000e+00
1333311240: 1.3400000000e+01 2.7000000000e+01 3.9927562500e+00 6.0000000000e+00
1333311360: 1.3400000000e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
1333311480: 1.3400000000e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
1333311600: 1.3400000000e+01 2.7000000000e+01 3.0112068833e+00 6.0000000000e+00
1333311720: 1.3400000000e+01 2.7000000000e+01 4.9888617833e+00 6.0000000000e+00
1333311840: 1.3499455991e+01 2.7000000000e+01 4.0054400917e+00 6.0000000000e+00
1333311960: 1.3400839001e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
1333312080: 1.3400000000e+01 2.7000000000e+01 3.0034230333e+00 6.0000000000e+00
1333312200: 1.3400000000e+01 2.7000000000e+01 3.9933606000e+00 6.0000000000e+00
1333312320: 1.3400000000e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
1333312440: 1.3400000000e+01 2.7000000000e+01 3.0045041083e+00 6.0000000000e+00
1333312560: 1.3400000000e+01 2.7000000000e+01 3.9936468250e+00 6.0000000000e+00
1333312680: 1.3400000000e+01 2.7000000000e+01 4.9911438667e+00 6.0000000000e+00
1333312800: 1.3400000000e+01 2.7000000000e+01 5.0000000000e+00 6.0000000000e+00
1333312920: 1.3400000000e+01 2.7000000000e+01 3.0124310667e+00 6.0000000000e+00
1333313040: 1.3400000000e+01 2.7000000000e+01 3.0000000000e+00 6.0000000000e+00
1333313160: 1.3400000000e+01 2.7000000000e+01 3.9892649250e+00 6.0000000000e+00

And here's some of my rrdupdate log:

1333311120 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:3:6
1333311241 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:4:6
1333311360 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:4:6
1333311481 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:4:6
1333311600 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:3:6
1333311720 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:5:6
1333311841 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.5:27.0:4:6
1333311960 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:4:6

I'm no stranger to RRDtool, but this has stumped me. Any ideas?

rrdtool 1.2.30
FreeBSD 8.2-RELEASE amd64

Thanks,
Aragon
