On Mon, Nov 7, 2011 at 1:41 PM, Ben Pfaff <b...@nicira.com> wrote:
> On Mon, Nov 07, 2011 at 11:18:04AM -0800, Jesse Gross wrote:
>> On Mon, Nov 7, 2011 at 9:24 AM, Ben Pfaff <b...@nicira.com> wrote:
>> > On Sun, Nov 06, 2011 at 09:56:10PM -0800, Jesse Gross wrote:
>> >> On Fri, Nov 4, 2011 at 4:43 PM, Ben Pfaff <b...@nicira.com> wrote:
>> >> > NetFlow active timeouts were only mixed in with flow expiration for
>> >> > convenience: both processes need to iterate all the facets.  But
>> >> > an upcoming commit will change flow expiration to work in terms of
>> >> > a new "subfacet" entity, so they will no longer fit together well.
>> >> >
>> >> > This change could be seen as an optimization, since NetFlow active
>> >> > timeouts don't ordinarily have to run as often as flow expiration,
>> >> > especially when the flow expiration rate is stepped up due to a
>> >> > large volume of flows.
>> >>
>> >> This has a pretty significant effect on the accuracy of the timeouts
>> >> that I'm not sure is intended.  Currently, active timeouts are done on
>> >> a per-flow basis starting from time of first use.  However, this
>> >> essentially starts a per-bridge timer on first configuration that must
>> >> first expire in order to check the per-flow timer.  So with the
>> >> default timeout of 10 minutes, the first active timeout will occur
>> >> somewhere between 10 and 20 minutes after first use.  This only
>> >> happens for the first one though since they will tend to synchronize.
>> >> However, I think that there is a potential for the two timers to
>> >> desynchronize, resulting in apparently random doubling of intervals.
>> >> For example, netflow_run() is also called from gen_netflow_rec() when
>> >> it fills up a packet but does not check the return code, skipping the
>> >> active timeout if a timer tick occurred in that window.  Finally, the
>> >> current active timeout code distributes reporting over a large span of
>> >> time but this concentrates all of them at once, which could cause a
>> >> load spike in the collector if a number of switches are brought up at
>> >> the same time.
>> >
>> > Hmm.
>> >
>> > Maybe I should just do NetFlow reporting once a second (as it was
>> > before).  What do you think?
>>
>> I think either that or actually tracking when the next timeout will
>> occur are the only real solutions. However, I think the only
>> efficient way to do correct timeouts is to again combine this with the
>> flow expiration code, which gets us back to where we were before.
>
> I don't understand. NetFlow active timeouts are essentially
> independent of flow expiration, except to the extent that if a flow
> expires then it doesn't need active timeouts.
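
To illustrate the "between 10 and 20 minutes" range quoted above, here
is a minimal Python sketch. It is not OVS code. It assumes the
per-bridge timer fires once per active-timeout interval starting from
when NetFlow is configured, and that a flow is reported on the first
tick at which it has been in use for at least one full interval. The
name first_report_delay and its parameters are hypothetical.

```python
ACTIVE_TIMEOUT = 600  # seconds; the 10-minute default mentioned in the thread


def first_report_delay(first_use, bridge_timer_start=0, interval=ACTIVE_TIMEOUT):
    """Return how long after `first_use` a flow gets its first active-timeout
    report, when flows are only examined on ticks of a per-bridge timer.

    Assumption: the bridge timer ticks every `interval` seconds starting at
    `bridge_timer_start`, and `first_use` is not earlier than that start.
    """
    tick = bridge_timer_start + interval
    # Skip ticks at which the flow has not yet been active for a full interval.
    while tick - first_use < interval:
        tick += interval
    return tick - first_use


if __name__ == "__main__":
    for first_use in (0, 1, 300, 599):
        delay = first_report_delay(first_use)
        print(f"first use at t={first_use:3d}s -> reported after {delay}s")
```

Under these assumptions a flow first used exactly at configuration time
is reported after one interval, while a flow first used one second
later waits almost two intervals.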

Sorry, when I said "correct timeouts" I meant calculating the timeout
for the next flow.  I think this needs to be integrated with flow
expiration because if a flow expires from inactivity then you have to
check whether it was the cause of the next active timeout interval
and, if so, calculate a new one.

>> When you say do reporting once a second do you mean essentially the
>> same as in this patch but use 1 second instead of the active timeout
>> interval or go back to the original version?
>
> The same as in this patch but go back to 1 second, which is the
> minimum rate at which we call the main "expire()" function in
> ofproto-dpif.c that actually runs the loop above.

So that doubles the number of times that we are iterating over the
facets.  Do you think that will be a problem for large numbers of
flows?
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
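
The point above about recalculating the next active timeout when a
flow expires can be sketched roughly as follows. This is a Python
illustration under stated assumptions, not the ofproto-dpif
implementation: the class ActiveTimeoutTracker and its methods are
hypothetical, and each flow is assumed to become due one interval
after it is added.

```python
class ActiveTimeoutTracker:
    """Hypothetical tracker for when the next active timeout is due."""

    def __init__(self, interval):
        self.interval = interval
        self.deadlines = {}        # flow id -> time its active timeout is due
        self.next_deadline = None  # earliest deadline among tracked flows

    def add_flow(self, flow, now):
        due = now + self.interval
        self.deadlines[flow] = due
        if self.next_deadline is None or due < self.next_deadline:
            self.next_deadline = due

    def expire_flow(self, flow):
        """Called when a flow is removed for inactivity."""
        due = self.deadlines.pop(flow)
        if due == self.next_deadline:
            # The expired flow may have been the one that set the next
            # wake-up time, so rescan the remaining flows for a new minimum.
            self.next_deadline = min(self.deadlines.values(), default=None)


if __name__ == "__main__":
    tracker = ActiveTimeoutTracker(interval=600)
    tracker.add_flow("a", now=0)
    tracker.add_flow("b", now=100)
    tracker.expire_flow("a")
    print(tracker.next_deadline)
```

The rescan in expire_flow() walks every remaining flow, which is the
same iteration the expiration loop already performs. That is why doing
this efficiently seems to pull the two back together.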