I've been using RRDtool for years on a Linux box (Red Hat) to collect various house sensor data. That system just went belly up, so I moved the collection applications to a Mac mini. I used a bundled version of rrdtool 1.4.5 that collects the other libraries it needs so that the Mac build can be statically linked. Apart from minor build problems, I have rrdtool working.
I have two C apps that receive transmissions from my (temperature) sensors in the house and log to a GAUGE RRD. The databases are on a Linux server that the Mac accesses via NFS. This is the same networked file system setup I had with the Linux box that died.
I run two apps (each with its own radio receiver) because of the range of the sensor transmitters. With only one receiver I don't always get all the sensor reports, so I can't just use one receiver/app long term. Frequently, the two apps will receive the same sensor "report" and (nearly) simultaneously attempt to rrdupdate (via C call) the value for that time. One of them will typically get either a "could not lock" or an "illegal attempt to update using time" RRD error, which is fine in my case. I just ignore those errors, do an rrd_clear_error and go on.
On my old Linux box, data would get stored just fine. On the Mac mini, I see sporadic cases where the data that is stored is not the data that is retrieved. For example, temperatures might be in the 65 degree range and a value in the 15 degree range is present. When I print out the actual rrdupdate strings that the apps use while running, I only see the normal expected temperatures being rrdupdated. But when I do an rrdfetch or display a graph, I find these "out of expected range" values. There are several a day, but usually hours apart for a given sensor. I log different sensors to different RRDs and see these errors on different RRDs. I get maybe a dozen such errors across all the sensors during a day.
If I only let one of the two logging apps run, I don't see these "glitches", so I suspect some sort of locking problem is happening. I am using the same app code, the same sensors/receivers, and the same RRD databases. I didn't re-create the RRDs, I just used the ones that I had been using. But the platform is different (Mac OS X Snow Leopard vs. Red Hat), and so is the rrdtool version (1.4.5 vs. some older version).
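One way to narrow this down would be to check whether advisory locks are enforced at all between two processes on the NFS mount. The following is only a minimal sketch of such a probe, not the rrdtool locking code: `try_write_lock` and `probe_lock_conflict` are hypothetical names, and the file path passed in is a placeholder. It assumes POSIX `fcntl()` record locks, and on NFS the result also depends on the lock services on the client and the server.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: try to take a non-blocking exclusive advisory
 * lock on the whole file. Returns 0 on success, -1 if the lock is held
 * by another process or on error (errno is set by fcntl). */
static int try_write_lock(int fd)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;    /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical probe: the parent locks the file, then a child process
 * tries to lock the same file.
 * Returns  1 if the child was refused (locking is being enforced),
 *          0 if the child also got the lock (locking is not enforced),
 *         -1 if the probe itself failed (open, lock, or fork error). */
int probe_lock_conflict(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (try_write_lock(fd) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: fcntl locks are per process and are not inherited
         * across fork, so this attempt should be refused. */
        int cfd = open(path, O_RDWR);
        if (cfd < 0)
            _exit(2);
        _exit(try_write_lock(cfd) == 0 ? 0 : 1);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    close(fd);              /* closing the descriptor releases the lock */
    if (waited < 0 || !WIFEXITED(status) || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 1 ? 1 : 0;
}
```

Calling `probe_lock_conflict()` with a scratch file on the NFS mount and getting 0 back would suggest that two processes can hold the "exclusive" lock at once on that mount; getting 1 would suggest the lock itself works and the problem is elsewhere.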
I don't know which older rrdtool version it was, because of the Red Hat system's disk failure. That system had been running for years with no rrdtool problems and I don't recall which rrdtool version I used, but I believe it was a binary install via yum.
I looked at the source and don't see any obvious errors in the fcntl call that does the rrd_lock. I tried (on a whim) changing the fcntl to a flock() instead, but it made no difference. I verified that I am using the rrd_lock() code path that does not use _locking(). I also verified that I am not using the _rrd_update() code path(s) that depend on HAVE_MMAP.
Any ideas of what to look for or try to eliminate this problem? This makes it really hard to get the useful, reliable data that I have come to depend upon!
Thanks!
