Hi Paolo,

> Posed I'm no expert of RouterOS; if it has a NetFlow export process,
> can you check if it pegs at 100% CPU? Or if anything suspicious emerges
> from the router logs?

The netflow process runs at 0.1-0.2% CPU (on a 36-core router).
Unfortunately RouterOS' netflow options and stats are very basic and the
logs do not include anything related to it :( But there isn't too much
traffic (and the netflow traffic is at 50-150pps).

> On the nfacctd side, if logs are clean then it should mean internal
> buffering is OK. Still, better to double-check buffering between the
> kernel and nfacctd. At this propo, can you please follow notes in
> section D of chapter XXI of a recent pmacct QUICKSTART guide ( see
> https://github.com/paololucente/pmacct/blob/master/pmacct/QUICKSTART ),
> essentially to check if there is any UDP drops?

Today I installed a new server just for nfacctd, running the latest
Debian with MySQL 5.7. I've got about 2 hours of stats so far and they
seem to coincide with Observium's stats. I did notice a 5-minute skew
between what nfacctd and Observium show for the same time frame, though.
I guess it's because nfacctd inserts each flow into the appropriate
time-bin based on its timestamps, while Observium's cron script logs the
octets from SNMP at the time it runs (which may be 1-2 minutes after the
5-minute mark, since it collects data from many ports on each run).

The good thing is that, regardless of the time skew between the two
systems, the amount of data measured stays on par as time goes by. I
need to keep an eye on it to make sure that remains the case, since on
previous days I noticed that the discrepancies increase during the day,
when traffic is at its peak.

By the way, with MySQL 5.7, nfacctd caused errors from time to time
complaining that the vlan or tos fields do not have a default value
(this was the SQL error). I simply altered those fields to have a
default value and the errors stopped. Maybe this is a bug caused by the
combination of MySQL 5.7 and pmacct's current default db schema (I
believe it is not related to my current issue, since the old server runs
MySQL 5.5; just mentioning it).

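For reference, the UDP drop check suggested in the quoted text can be
sketched roughly as below. This is only a minimal illustration, not
anything taken from the pmacct QUICKSTART: it assumes a Linux host that
exposes the system-wide UDP counters in /proc/net/snmp, and the helper
names parse_udp_counters and drops_between are made up for this example.

```python
import os


def parse_udp_counters(snmp_text):
    """Return the Udp counters from /proc/net/snmp-style text as a dict."""
    # The file holds a header line and a value line, both prefixed "Udp:".
    # "UdpLite:" lines do not match this prefix and are ignored.
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp header/value pair found")
    header, values = udp_lines[0], udp_lines[1]
    # Skip the leading "Udp:" token on both lines.
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def drops_between(before, after):
    """Difference of the drop-related counters between two samples."""
    keys = ("InErrors", "RcvbufErrors")
    return {k: after[k] - before[k] for k in keys
            if k in before and k in after}


if __name__ == "__main__":
    path = "/proc/net/snmp"  # assumption: Linux procfs is available
    if os.path.exists(path):
        with open(path) as handle:
            counters = parse_udp_counters(handle.read())
        print("InErrors:", counters.get("InErrors"),
              "RcvbufErrors:", counters.get("RcvbufErrors"))
```

Sampling the counters just before and just after an SQL refresh and
passing both samples to drops_between would show whether the drops line
up with the plugin's write cycle. Note these counters are system-wide,
so they include drops on any other UDP socket on the host.
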
Now, checking the UDP drop counters on the old server, indeed I see some
25000+ drops. That counter seems to increase during the refresh time of
the sql plugin. Not always though. Is there a connection between the
drops and the MySQL insert/update process? If so, would running the
MySQL server on a different machine eliminate any future possibility of
that happening again?

I don't see any drops on the new server, so that's a good thing, and
that may account for the fact that it seems to count the totals properly
(I certainly hope so!)

I added nfacctd_pipe_size and modified rmem_default & rmem_max as
suggested in the FAQ (silly me, I didn't read it all the way to the
end!) but I still see the drops counter increase on the old server. But
if the new server works OK as it is, I don't really care if the old one
has drops (for any reason). Also, I couldn't find any documentation for
this config parameter on the Official Config Keys wiki page:
http://wiki.pmacct.net/OfficialConfigKeys

> Finally, i see sql_refresh_time and sql_history are set to different
> values - meaning SQL UPDATE queries are involved; this is OK as long
> as the actual database does not suffer from them; can you check that
> SQL writer processes are not piling up? This can be done with a simple
> "ps auxw | grep nfacctd".

I've set them that low to troubleshoot the problem (to check the new
data in the database at 1-minute intervals). Watching the insert/update
queries fly by in the terminal in debug mode, each batch takes about
10 seconds to finish, without any errors, so there doesn't seem to be
any issue there.

I'll keep an eye on the new setup to see how it goes and I'll keep you
posted if the issue persists.

Thank you for your help :)

Cheers,
Vaggelis.

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
