Hi Paolo,

> Posed I'm no expert of RouterOS; if it has a NetFlow export process,
> can you check if it pegs at 100% CPU? Or if anything suspicious emerges
> from the router logs?

The netflow process runs at 0.1-0.2% CPU (on a 36-core router).
Unfortunately RouterOS' netflow options and stats are very basic and the
logs do not include anything related to it :(
But there isn't much traffic (and the netflow traffic is at 50-150 pps).

> On the nfacctd side, if logs are clean then it should mean internal
> buffering is OK. Still, better to double-check buffering between the
> kernel and nfacctd. At this propo, can you please follow notes in
> section D of chapter XXI of a recent pmacct QUICKSTART guide ( see
> https://github.com/paololucente/pmacct/blob/master/pmacct/QUICKSTART ),
> essentially to check if there is any UDP drops?

Today I installed a new server just for nfacctd, running the latest
Debian with MySQL 5.7.
I've got about 2 hours of stats so far and they seem to coincide with
Observium's stats. I did notice a 5-minute skew between what nfacctd
and Observium show for the same time frame, though. I guess it's because
nfacctd inserts each flow into the appropriate time bin based on its
timestamps, while Observium's cron script logs the octets from SNMP at
the time it runs (which may be 1-2 minutes after the 5-minute mark,
since it collects data from many ports on each run).
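
A minimal sketch of that guess, assuming 5-minute bins on the nfacctd
side and a poller that labels each sample with the time it actually ran.
The function name and the timestamps are hypothetical, chosen only to
illustrate the offset:

```python
from datetime import datetime, timedelta, timezone

BIN_SECONDS = 300  # assumed 5-minute time bins


def flow_bin(ts: datetime, width: int = BIN_SECONDS) -> datetime:
    """Floor a flow timestamp to the start of its time bin."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % width, tz=timezone.utc)


# Hypothetical flow seen at 10:04:50: it is accounted in the 10:00 bin.
flow_ts = datetime(2016, 1, 1, 10, 4, 50, tzinfo=timezone.utc)

# Hypothetical poller run that starts at 10:05 but only reaches this
# port at 10:06:30, and labels the counter delta with that time.
poll_ts = datetime(2016, 1, 1, 10, 6, 30, tzinfo=timezone.utc)

skew = poll_ts - flow_bin(flow_ts)
print(flow_bin(flow_ts).isoformat(), skew)
```

The same traffic ends up labelled 10:00 on one side and 10:06:30 on the
other, which would look like a skew of 5 minutes or more on the graphs.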

The good thing is that, regardless of the time skew between the two
systems, the amount of data measured stays on par as time goes by.
I need to keep an eye on it to make sure that remains the case, since
over the previous days I noticed that the discrepancies increase during
the day, when traffic is at its peak.

By the way, with MySQL 5.7, nfacctd caused errors from time to time
complaining that the vlan or tos fields do not have a default value
(this was the SQL error).
I simply altered those fields to have a default value and the errors
stopped. Maybe this is an incompatibility between MySQL 5.7 and pmacct's
current default DB schema (I believe it is not related to my current
issue, since the old server runs MySQL 5.5; just mentioning it).

Now, checking the UDP drop counters on the old server, I do indeed see
some 25000+ drops. That counter seems to increase during the refresh
time of the SQL plugin, though not always. Is there a connection between
the drops and the MySQL insert/update process? If so, would running the
MySQL server on a different machine eliminate any possibility of that
happening again?
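
To correlate the drops with the SQL refresh, the per-socket counter can
be sampled repeatedly. Below is a rough sketch that reads the "drops"
column (the last field) of Linux's /proc/net/udp and groups it by local
port. The helper name and the sample lines are made up for illustration;
port 2100 is only an assumed collector port:

```python
def udp_drops(proc_net_udp: str) -> dict:
    """Map local UDP port -> dropped datagrams, from /proc/net/udp text."""
    drops = {}
    for line in proc_net_udp.splitlines():
        fields = line.split()
        # Socket rows start with a slot number such as "116:"; this
        # skips the header row and any blank lines.
        if len(fields) < 13 or not fields[0].endswith(":"):
            continue
        # local_address is HEXIP:HEXPORT
        port = int(fields[1].rsplit(":", 1)[1], 16)
        drops[port] = drops.get(port, 0) + int(fields[-1])
    return drops


# Made-up sample in the /proc/net/udp layout (0x0834 == port 2100).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
  116: 00000000:0834 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 12345 2 ffff88003b7d8000 25012
  201: 00000000:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 23456 2 ffff88003b7d9000 0
"""

print(udp_drops(SAMPLE))
# On the server itself: udp_drops(open("/proc/net/udp").read())
```

Running it once a minute and noting when the collector port's value
jumps should show whether the increases line up with the refresh runs.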

I don't see any drops on the new server, so that's a good thing, and
that may account for the fact that it seems to count the totals properly
(I certainly hope so!)

I added the nfacctd_pipe_size and modified the rmem_default & rmem_max
as suggested in the FAQ (silly me, I didn't read it all the way to the
end!) but I still see the drops counter increase.
But if the new server works OK as it is, I don't really care if the old
one has drops (for any reason).
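
For anyone who wants to confirm that a larger buffer is actually
granted, here is a small sketch. It assumes (as far as I understand it)
that nfacctd_pipe_size ends up as a receive-buffer request on the
listening socket, and that Linux caps such requests at
net.core.rmem_max. The requested size is a hypothetical value:

```python
import socket


def effective_rcvbuf(requested: int) -> int:
    """Request a UDP receive buffer and return what the kernel reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


requested = 4 * 1024 * 1024  # hypothetical 4 MB request
granted = effective_rcvbuf(requested)
print("requested:", requested, "granted:", granted)
```

On Linux the reported value is normally double the effective request
(the kernel adds bookkeeping overhead), so a result far below twice the
requested size suggests rmem_max is still the limiting factor.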

Also, I couldn't find any documentation for the nfacctd_pipe_size config
parameter on the Official Config Keys wiki page:
http://wiki.pmacct.net/OfficialConfigKeys

> Finally, i see sql_refresh_time and sql_history are set to different
> values - meaning SQL UPDATE queries are involved; this is OK as long
> as the actual database does not suffer from them; can you check that
> SQL writer processes are not piling up? This can be done with a simple
> "ps auxw | grep nfacctd".

I've set them that low to troubleshoot the problem (so I can check the
new data in the database at 1-minute intervals).
Watching the insert/update queries fly by in the terminal in debug
mode, each run takes about 10 seconds to finish, without any errors, so
there doesn't seem to be any issue there.


I'll keep an eye on the new setup to see how it goes and I'll keep you
posted if the issue persists.

Thank you for your help :)

Cheers,
Vaggelis.

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
