----- Original Message -----
> From: "Bill Woodcock" <wo...@pch.net>

> On Jul 28, 2014, at 9:28 AM, William Herrin <b...@herrin.us> wrote:
> > The data set suffers three flaws:
>
> Depending on your point of view, a lot more than three, undoubtedly.
>
> > 1. It is not representative of the actual traffic flows on the
> > Internet.
>
> There are an infinite number of things it’s not representative of, but
> it also doesn’t claim to be representative of them. Traffic flows on
> the Internet is a different survey of a different thing, but if
> someone can figure out how to do it well, I would be very supportive
> of their effort. It's a _much_ more difficult survey to do, since it
> requires getting people to pony up their unanonymized netflow data,
> which they’re a lot less likely to do, en masse, than their peering
> data. We’ve been trying to figure out a way to do it on a large and
> representative enough scale to matter for twenty years, without too
> much headway. The larger the Internet gets, the more difficult it is
> to survey well, so the problem gets harder with time, rather than
> easier.

I think you're over-specifizing Bill's assertion, Woody.

He didn't mean "TCP Flows", I don't think; he was simply -- as I
understood him -- talking about the 40,000ft view of connections
between pieces of the Internet.

I don't expect your dataset to have flow-level data, and I don't think
he did either; it isn't really germane to the conversation we're having.

Cheers,
-- jra
-- 
Jay R. Ashworth                  Baylink                  j...@baylink.com
Designer                     The Things I Think                   RFC 2100
Ashworth & Associates       http://www.bcp38.info      2000 Land Rover DII
St Petersburg FL USA      BCP38: Ask For It By Name!       +1 727 647 1274