Hi Edward,

First, thanks for this exhaustive email, very interesting. My first question,
to scope it better, is whether you are using any sampling rate and, if so, how
much. I ask because I'd intuitively say that if a flow is created from a single
sampled packet (which becomes typical for most traffic, though not all, e.g.
long-lived video, at high sampling rates, e.g. 1:10k) then one cannot and
shouldn't pro-rate.

Another part of it is that a generic-enough pro-rating algorithm should
really work across multiple time-bins in the past. Now that the print plugin
supports appending to existing files (1.5.0rc1) and determining which file
to write to depending on the timestamp (if the newly introduced file_history
is specified), I could code something around it and we can pilot it to see
whether the results match expectations. Essentially, you find me
positive about your point. I will get back in touch with you privately soon
about this; does that sound like a good way forward? If anybody else is
interested in this and would like to give it a try, just let me know.
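
To make the idea concrete, here is a minimal sketch of pro-rating a flow's
counters across several time-bins, in proportion to how long the flow
overlapped each bin. It is only an illustration under stated assumptions:
the function name `prorate`, its parameters and the 300-second default bin
are hypothetical and not part of pmacct, timestamps are assumed to be epoch
seconds, and a zero-duration flow (e.g. one built from a single sampled
packet) is simply credited to the bin it falls in rather than pro-rated.

```python
def prorate(start, end, packets, octets, bin_len=300):
    """Split a flow's counters across fixed-width time bins.

    start, end: flow start/end in epoch seconds (assumed, not pmacct fields).
    Returns {bin_start: (packets_share, bytes_share)}.
    """
    duration = end - start
    if duration <= 0:
        # Nothing to spread (e.g. single sampled packet): credit one bin.
        bin_start = (end // bin_len) * bin_len
        return {bin_start: (float(packets), float(octets))}

    shares = {}
    bin_start = (start // bin_len) * bin_len  # bin containing the flow start
    while bin_start < end:
        # Seconds of the flow that fall inside this bin.
        overlap = min(end, bin_start + bin_len) - max(start, bin_start)
        frac = overlap / duration
        shares[bin_start] = (packets * frac, octets * frac)
        bin_start += bin_len
    return shares


# Ed's example: a 240-second flow ending at 00:06:00, with 300-second bins.
result = prorate(start=120, end=360, packets=1000, octets=400000)
print(result)
```

With Ed's example, 3/4 of the counters land in the bin starting at 0 and
1/4 in the bin starting at 300. A flow spanning more than two bins is
handled by the same loop.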

Cheers,
Paolo

On Fri, Aug 30, 2013 at 11:26:08AM -0500, Edward Henigin wrote:
> Hello Paolo,
> 
> I'm currently using nfacctd to capture netflow accounting, for the purpose
> of identifying unexpected high traffic flows. (I happen to be using the
> print plugin, and parsing the text files to generate web reports, etc.) The
> netflow exporter in this case is a Cisco RSP720.
> 
> One thing I notice is that the accounting data ends up being "bursty." It
> makes sense to me that this is a natural result of the netflow accounting
> architecture on the RSP720, with the fact that flows can be expired at any
> time. In the beginning of a 300-second window, flows may be expired and
> exported which primarily covered the preceding 300-second period, and at
> the end of the current window, all active flows may suddenly be expired and
> exported, causing a "spike" in reported traffic. This actually seems to
> happen quite a bit, here's a random sample of total packets/sec and
> bits/sec for a network segment where the traffic levels are actually
> relatively stable:
> 
>   Time Ending          Total Kpps  Total Mbps
>   8/30/2013 11:01:33          941        6736
>   8/30/2013 11:00:29          415        2941
>   8/30/2013 10:59:25         1115        7865
>   8/30/2013 10:58:21         1229        9193
>   8/30/2013 10:57:17          127         420
>   8/30/2013 10:56:13         1313        9412
>   8/30/2013 10:55:09          946        6934
> 
> (NB the above is using 64-second mls aging and print refresh time, but the
> concept stands regardless of interval length)
> 
> So the reason I'm writing is because in a previous life, I used a different
> netflow collector which simply dumped the netflow records to a flat file,
> and I wrote the scripts to aggregate the data. I saw the same burstiness in
> traffic rates due to the nature of netflow. At that time, I employed a
> strategy which seemed to do a very good job of smoothing out the
> burstiness. What I did was to pro-rate the byte & packet counts across time
> intervals.
> 
> So for example, if we receive a netflow accounting record, duration 240
> seconds, at 00:06:00 (so the flow ran from 00:02:00 to 00:06:00), then I
> would count 1/4 of the packets & bytes to the current interval
> (00:05:00 - 00:09:59) and 3/4 of the packets & bytes to the prior interval
> (00:00:00 - 00:04:59).
> 
> A downside is that you only get "half" of the data for the current
> interval, so full reporting for any given interval is delayed by 1x
> interval length.
> 
> I'm interested in applying the pro-rating algorithm to nfacctd. I have no
> idea how I would do that in the code.
> 
> Paolo, I'm curious about your thoughts in this regard.
> 
> Thanks,
> 
> Ed

> _______________________________________________
> pmacct-discussion mailing list
> http://www.pmacct.net/#mailinglists

