Re: [pmacct-discussion] pmacct weird counters

Chris Wilson Sat, 14 Mar 2009 10:54:27 -0700

Hi Paolo,

On Sat, 14 Mar 2009, Paolo Lucente wrote:


> About the SQL INSERT conflict, are you by any chance making use of the
> "sql_dont_try_update" directive in your configuration?

Yes I am, because it's much more efficient.

> And are you using 32bit counters?

I think so, yes. I compiled with default options on a 32-bit host.

> The conjunction of these two conditions might explain.
>
> The SQL cache code, while summing up counters, makes a check on whether
> the counter field is about to overflow. When 64bit counters are disabled
> (default) this is what happens:
>
> #define UINT32T_THRESHOLD 4290000000UL
> #define CACHE_THRESHOLD UINT32T_THRESHOLD
>
> /* additional check: bytes counter overflow */
> else if (Cursor->bytes_counter > CACHE_THRESHOLD) {
>  if (!staleElem && Cursor->chained) staleElem = Cursor;
>  goto follow_chain;
> }
>
> Basically, a new record for the entry which is going to overflow is
> opened and the old one is "parked". When purging the cache to the SQL
> database, both records (the active and the parked one) are sent over;
> the first with an INSERT the second with an UPDATE. This mechanism is
> valid for any number of overflows - indeed.
>
> The above would also explain why a number of the entries above the 1GB
> level are around 4GB. But this also would suggest the counters are
> genuine. Another thing which would suggest these are "real" is that by
> dividing the bytes counter by the packets counter, you get a consistent
> average data size:
>
> 4290000028 / 10026264 = ~428 bytes
> 3943258731 / 8984686  = ~439 bytes
>
> Any bytes counter roll-over would have greatly skewed one of the above
> two proportions - highlighting an issue. But this would suggest that in
> a single minute roughly 8GB of data were transferred. This translates to
> a fully loaded 1Gbps link. This brings me to these questions: is your LAN
> network (including the "192.168.0.175" host) connected to 1Gbps? Do you
> think it could be possible some LAN traffic gets spanned over?
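
The ratio check quoted above can be reproduced with a few lines of C. This is only a sketch for checking the arithmetic: the helper names are made up for illustration and are not part of pmacct, and the threshold is the value quoted from the cache code. It also shows how much headroom the threshold leaves below the 32-bit wrap point.

```c
#include <assert.h>
#include <stdint.h>

#define UINT32T_THRESHOLD 4290000000UL  /* value quoted from the cache code */

/* Hypothetical helper: average bytes per packet, rounded to nearest. */
uint64_t avg_packet_bytes(uint64_t bytes, uint64_t packets)
{
    assert(packets > 0);                /* an empty record has no average */
    return (bytes + packets / 2) / packets;
}

/* Hypothetical helper: bytes left between the threshold and 2^32. */
uint64_t threshold_headroom(void)
{
    return (UINT64_C(1) << 32) - (uint64_t)UINT32T_THRESHOLD;
}
```

For the two records, avg_packet_bytes(4290000028, 10026264) gives 428 and avg_packet_bytes(3943258731, 8984686) gives 439, matching the figures above. The headroom is 4967296 bytes, and the first counter sits just 28 bytes over the threshold, which is consistent with a record that was parked as soon as it crossed it.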

The local machine is connected to a gigabit switch on the LAN, but this 
host is attached to another switch which is not gigabit, so that suggests 
to me that the counter is invalid. I just checked on the switch, and the 
port that this machine is attached to is currently running at 100 Mbps.
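
To put numbers on that, here is a minimal sketch in C. It assumes the two quoted records belong to the same flow and cover a single 60-second interval, as Paolo suggests; the function names are made up for illustration and are not part of pmacct. It compares the most a link can carry in an interval with the rate the counters imply.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the most bytes a link running at link_bps
 * (bits per second) can carry in the given number of seconds. */
uint64_t link_capacity_bytes(uint64_t link_bps, uint64_t seconds)
{
    return link_bps * seconds / 8;
}

/* Hypothetical helper: average rate in bits per second implied by a
 * byte counter accumulated over the given number of seconds. */
uint64_t implied_bps(uint64_t bytes, uint64_t seconds)
{
    assert(seconds > 0);                /* interval must be non-empty */
    return bytes * 8 / seconds;
}
```

Under those assumptions, a 100 Mbps port can carry at most 750000000 bytes in 60 seconds, while 4290000028 + 3943258731 = 8233258759 bytes in 60 seconds implies roughly 1.1 Gbps, more than ten times what the port can deliver.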

It is possible that either the switch or my firewall/router/pmacct box is 
going mental and repeating traffic.

Perhaps the best thing to do is to recompile pmacct with 64-bit counters 
to see if the issue goes away? Alternatively I planned to log all traffic 
with tcpdump -w to create a pcap file that I could replay into pmacctd to 
reproduce the problem if it happens again. Would that work? Does pmacctd 
honour the timestamps in the pcap file while reading it?

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.


_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
