Hello,

I am using nfacct with Mikrotik RouterOS to account for the traffic our
clients generate each month.
I aggregate per IP to get the total bytes for each IP across all our
prefixes.

nfacct seems to be working fine with Mikrotik (it receives the flows
without any errors when running in debug mode).
The problem I encounter is that there are significant discrepancies
between what nfacct counts and what other tools count.

I compare the nfacct results with Solarwinds (netflow) and Observium (SNMP).
I understand that SNMP will show different numbers, since it counts the
switch port octets including the Ethernet overhead (I've added a
26-byte adjb to my nfacct config to account for that, as per the
pmacct FAQ).
But even between 2 netflow collectors the data differ.

Actually, even between 2 different tables of nfacct data (written by the
same nfacct instance) the data are not consistent.

For example, for today (27-11-2015) up to the time of writing, all 4
measurements show different values.

-------
Observium/SNMP:
Total IN: 69.12GB
Total OUT: 318.22GB

Solarwinds/Netflow:
Total IN: 60.4GB
Total OUT: 315GB

nfacct (history 1d, refresh 60):
Total IN: 69.20GB
Total OUT: 302.74GB

nfacct (history 5m, refresh 60):
Total IN: 68.44GB
Total OUT: 300.04GB
-------
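
To put the gaps in perspective, here is a minimal sketch that expresses each total as a percentage difference from the Observium/SNMP figures. The numbers are the ones listed above; the function and variable names are only for illustration.

```python
# Totals in GB as reported above; Observium/SNMP is taken as the baseline.
BASELINE = "Observium/SNMP"
totals_gb = {
    "Observium/SNMP": {"in": 69.12, "out": 318.22},
    "Solarwinds/Netflow": {"in": 60.4, "out": 315.0},
    "nfacct (history 1d)": {"in": 69.20, "out": 302.74},
    "nfacct (history 5m)": {"in": 68.44, "out": 300.04},
}


def percent_diff(value, reference):
    """Signed difference of value from reference, in percent."""
    return (value - reference) / reference * 100.0


def diffs_vs_baseline(totals, baseline):
    """Per-direction percent difference of every source against the baseline."""
    ref = totals[baseline]
    return {
        name: {d: percent_diff(v[d], ref[d]) for d in ("in", "out")}
        for name, v in totals.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, d in diffs_vs_baseline(totals_gb, BASELINE).items():
        print(f"{name}: IN {d['in']:+.2f}%  OUT {d['out']:+.2f}%")
```

By this measure both nfacct OUT totals are roughly 5% below SNMP, while the nfacct IN totals are within about 1% of it.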

The above (nfacct) numbers were calculated using standard SQL queries
such as:
SELECT (
    SELECT concat(truncate((sum(bytes)/1024/1024/1024),2), 'GB') as bytes
    FROM netflow
    WHERE ip_dst = '0.0.0.0' AND stamp_inserted = '2015-11-27 00:00:00'
) as total_out, (
    SELECT concat(truncate((sum(bytes)/1024/1024/1024),2), 'GB') as bytes
    FROM netflow
    WHERE ip_src = '0.0.0.0' AND stamp_inserted = '2015-11-27 00:00:00'
) as total_in
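
For the 5-minute table the daily total has to be summed across all bins of the day, since matching a single stamp_inserted value only picks up one bin. Below is a minimal sketch of that range query. It uses Python's sqlite3 with an in-memory table as a stand-in for MySQL; the reduced schema, the table name and the sample rows are hypothetical and only there to make the example self-contained.

```python
import sqlite3

# In-memory stand-in for the MySQL table; only the columns used here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE traffic ("
    "ip_src TEXT, ip_dst TEXT, bytes INTEGER, stamp_inserted TEXT)"
)

# Hypothetical 5-minute bins for inbound traffic (ip_src is 0.0.0.0
# because only dst_host is aggregated).
rows = [
    ("0.0.0.0", "192.0.2.10", 1000, "2015-11-27 00:00:00"),
    ("0.0.0.0", "192.0.2.10", 2000, "2015-11-27 00:05:00"),
    ("0.0.0.0", "192.0.2.11", 4000, "2015-11-27 23:55:00"),
    ("0.0.0.0", "192.0.2.10", 8000, "2015-11-28 00:00:00"),  # next day
]
conn.executemany("INSERT INTO traffic VALUES (?, ?, ?, ?)", rows)


def total_in(conn, day_start, day_end):
    """Sum inbound bytes over all bins with day_start <= stamp < day_end."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(bytes), 0) FROM traffic "
        "WHERE ip_src = '0.0.0.0' "
        "AND stamp_inserted >= ? AND stamp_inserted < ?",
        (day_start, day_end),
    )
    return cur.fetchone()[0]


# Equality on stamp_inserted matches only the first bin of the day.
single_bin = conn.execute(
    "SELECT COALESCE(SUM(bytes), 0) FROM traffic "
    "WHERE ip_src = '0.0.0.0' AND stamp_inserted = ?",
    ("2015-11-27 00:00:00",),
).fetchone()[0]

# The half-open range covers every bin that starts on 2015-11-27.
whole_day = total_in(conn, "2015-11-27 00:00:00", "2015-11-28 00:00:00")
```

The half-open range (>= start, < next day) keeps the first bin of the following day out of the total.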

So which of the above are the "correct" values?
Since our datacenter charges us based on their SNMP counters on our
uplink ports, and since we have cross-checked their measurements against
ours (Observium) and they match exactly, I take the SNMP/Observium
results as my comparison baseline.

I've been banging my head against this for the last 2 weeks, trying to
figure out what's causing those skewed numbers.
In my lab, where the traffic is controlled during tests, I can do file
transfers and account for every last byte without any discrepancies.
But when running the same config on the production site, I never get
consistent data (though there is also far more traffic there, and more
IPs generating it).


Here is my nfacct config:

------
daemonize: true
pidfile: /var/run/nfacctd.pid
sql_db: pmacct
sql_host: localhost
sql_user: *****
sql_passwd: *****
nfacctd_port: 2055

plugin_pipe_size: 16384000
plugin_buffer_size: 16384

# 5min time-bins
aggregate[total_in]: dst_host
aggregate[total_out]: src_host
aggregate_filter[total_in]: dst net 2a00:xxxx:xxxx::/48 or dst net 31.xx.xx.0/21 or dst net 185.xx.xx.0/22 or dst net 62.xx.xx.0/24 or dst net 194.xx.xx.0/24
aggregate_filter[total_out]: src net 2a00:xxx:xxx::/48 or src net 31.xx.xx.0/21 or src net 185.xx.xx.0/22 or src net 62.xx.xx.0/24 or src net 194.xx.xx.0/24
sql_table[total_in]: traffic
sql_table[total_out]: traffic
sql_refresh_time[total_in]: 60
sql_refresh_time[total_out]: 60
sql_history[total_in]: 5m
sql_history[total_out]: 5m
sql_history_roundoff[total_in]: mh
sql_history_roundoff[total_out]: mh
sql_table_version[total_in]: 4
sql_table_version[total_out]: 4
sql_preprocess[total_in]: adjb=+26
sql_preprocess[total_out]: adjb=+26


# daily time-bins
aggregate[daily_in]: dst_host
aggregate[daily_out]: src_host
aggregate_filter[daily_in]: dst net 2a00:xxxx:xxxx::/48 or dst net 31.xx.xx.0/21 or dst net 185.xx.xx.0/22 or dst net 62.xx.xx.0/24 or dst net 194.xx.xx.0/24
aggregate_filter[daily_out]: src net 2a00:xxx:xxx::/48 or src net 31.xx.xx.0/21 or src net 185.xx.xx.0/22 or src net 62.xx.xx.0/24 or src net 194.xx.xx.0/24
sql_table[daily_in]: traffic_daily
sql_table[daily_out]: traffic_daily
sql_refresh_time[daily_in]: 60
sql_refresh_time[daily_out]: 60
sql_history[daily_in]: 1d
sql_history[daily_out]: 1d
sql_history_roundoff[daily_in]: mh
sql_history_roundoff[daily_out]: mh
sql_table_version[daily_in]: 4
sql_table_version[daily_out]: 4
sql_preprocess[daily_in]: adjb=+26
sql_preprocess[daily_out]: adjb=+26


plugins: mysql[total_in], mysql[total_out], mysql[daily_in], mysql[daily_out]
------
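
As far as I understand the adjb preprocess in the config above, it adds the given number of bytes to each flow's byte counter once per packet in that flow. The following is a minimal sketch of that arithmetic under that assumption; the function name and sample values are hypothetical and this is not pmacct code.

```python
# Per-packet layer 2 overhead, matching adjb=+26 in the config above.
L2_OVERHEAD_BYTES = 26


def adjust_bytes(flow_bytes, flow_packets, per_packet=L2_OVERHEAD_BYTES):
    """Return the flow's byte count with a fixed overhead added per packet.

    Assumption: the adjustment scales with the packet count of the flow,
    so it is only as accurate as the packet counter exported by the router.
    """
    return flow_bytes + per_packet * flow_packets


# A flow of 100 packets carrying 150000 bytes gains 100 * 26 = 2600 bytes.
adjusted = adjust_bytes(150000, 100)
```

If this reading is right, any error in the exported packet counts feeds directly into the adjusted byte totals.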

And here's my Mikrotik Traffic Flow (netflow) configuration:

------
/ip traffic-flow
set active-flow-timeout=1m cache-entries=1k enabled=yes interfaces=sfp1
/ip traffic-flow target
add dst-address=X.X.X.X v9-template-refresh=60 v9-template-timeout=1m
------


Can anyone think of a reason why I get such inconsistent results? Is
there something I am missing?
Let me know if you need any further information.

Thanks.


_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists