Hi Oliver,

> I've done a bit more looking around and found the following
> interesting bits of information. From
> http://www.sflow.org/about/index.php :
>
> "Usage accounting for billing and charge-back"
>
> which seems to suggest 100% accurate representation of bandwidth
> consumption, if used for billing. But then in
> http://www.sflow.org/sflow_version_5.txt :
>
> "Packet Flow Sampling: Packet Flow Sampling refers to the random
> selection of a fraction of the Packet Flows observed at a Data
> Source."
>
> which suggests exactly what you are saying - it only samples a
> fraction of the actual traffic. But then later in the same page:

The moment you decide to sample you lose precision, no matter what.
You also lose detail when you apply aggregation. The question is how
much precision you are willing to lose. Dumb sampling treats all
packets the same, whether they are 1K or 10K. Intelligent sampling
takes size into account (the probability of a packet being sampled is
influenced by its size).

> So it seems indeed if you set the sampling rate to be 1, it would
> sample every single packet. To be honest I can't understand why
> sampling a fraction of the packets would be useful at all, apart from
> gleaning a rough understanding of the relationship between the flows.
> However this fractional sampling leads to data loss and as I
> mentioned in my first post, the backchannel with a very small
> fraction of the total traffic was not reported at all.

The need for sampling arose from different constraints:

1) NetFlow (NF) and sFlow (sF) were born to be used in switches and
   routers, usually with low CPU power. Doing 100% analysis was
   completely impossible for those CPUs.

2) Link usage. If you don't sample, the exported flow data consumes
   more link bandwidth.

3) Storage resources. Have you even considered the storage
   requirements you are going to need? Storing all the data is
   probably going to kill your server, unless the link is not
   important or you have all the NSA's computers at hand.

Currently both aggregation and sampling are applied both in the probe
and in the server itself. The point is how much precision / detail you
lose in exchange for fast analysis and a great interface. Some very
interesting articles address this and provide a really strong
mathematical foundation. Again, for an ADSL line this is just stupid,
but when you start to talk about serious links, things get really
nasty.

To be honest, Paolo has done a great job with pmacct in this field. We
expect to help him in the very near future to improve it even further
with ideas of our own.

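To make the sampling trade-off above a bit more concrete, here is a
minimal Python sketch of dumb (size-blind) 1-in-N packet sampling. It
is only an illustration under my own assumptions, not how sFlow,
NetFlow or pmacct implement it: the function names, the (flow, size)
packet tuples and the example traffic are all hypothetical.

```python
import random
from collections import defaultdict


def sample_packets(packets, rate, rng):
    """Keep each (flow, size) packet with probability 1/rate, ignoring its size."""
    keep_probability = 1.0 / rate
    return [pkt for pkt in packets if rng.random() < keep_probability]


def estimate_bytes(sampled, rate):
    """Estimate per-flow byte totals by scaling each sampled packet by the rate."""
    totals = defaultdict(int)
    for flow, size in sampled:
        totals[flow] += size * rate
    return dict(totals)


def miss_probability(packet_count, rate):
    """Probability that a flow of packet_count packets is never sampled."""
    return (1.0 - 1.0 / rate) ** packet_count


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical traffic: one bulk flow and one tiny backchannel.
    packets = [("bulk", 1500)] * 100_000 + [("backchannel", 64)] * 5
    for rate in (1, 100, 1000):
        estimate = estimate_bytes(sample_packets(packets, rate, rng), rate)
        print(rate, estimate, round(miss_probability(5, rate), 3))
```

With a rate of 1 every packet is kept and the totals are exact. With a
rate of 100, a flow of only 5 packets has roughly a 95% chance of never
being sampled, which is consistent with a tiny backchannel not being
reported at all.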
> Examining the header of each packet will allow the total data
> throughput to be determined without using the payload at all, and at
> reasonably low cost... surely sFlow can do this?

Sorry, I misunderstood you. I thought you were analysing the full
payload for something, my fault. Still, everything I have said applies
to header-only information, so it is still valid :)

> Hopefully someone on list has set something similar up and can point
> me in the right direction.

The main point here is, how fast is your link, and what are the specs
of the probe and/or collector?

Regards

--------------------------------------------
Jaime Nebrera - [EMAIL PROTECTED]
Consultor TI - ENEO Tecnologia SL
Pol. PISA - C/ Manufactura 6, P1, 3B
Mairena del Aljarafe - 41927 - Sevilla
Telf.- (+34) 955 60 11 60 / 619 04 55 18

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
