On 01/29/2011 01:00 PM, Mike wrote:
Hello,
My company is a small CLEC / broadband provider serving rural communities
in northern California, and we are the recipient of a small grant from
the state through our public utilities commission. We went out to the
'middle of nowhere' and deployed ADSL2+ (chalk one up for the good
guys!), and now that we're done, our state PUC wants to gather
performance data to evaluate the results of our project and ensure we
delivered what we said we would. Bigger picture, our state is actively
attempting to map broadband availability and service levels, and this
data will factor into that overall picture, to be used for future
grant/loan programs and other support mechanisms, so this really is
going to touch every provider who serves end users in the state.
The rub is that they want to legislate that the web-based 'speedtest.com'
is the ONLY and MOST AUTHORITATIVE metric, one that trumps all other
considerations, and that the provider is 100% at fault and responsible
for making fraudulent claims if speedtest.com doesn't agree. No
discussion is permitted about sync rates, packet loss, internet
congestion, provider route diversity, end-user computer performance
problems, far-end congestion issues, far-end server issues or CPU
loading, latency/RTT, or the like. They are going to decide that the
quality of any provider's service rests solely and exclusively on the
numbers returned from 'speedtest.com', period.
All of you in this audience, I think, probably immediately understand
the various problems with such an assertion. It's one of those situations
where - to the uninitiated - it SEEMS LIKE this is the right way to do
this, and it SEEMS LIKE there's some validity to what's going on - but in
practice, we engineering types know it's a far different animal and
should not be used for real live benchmarking of any kind where there is
a demand for statistical validity.
My feeling is that if the state needs to do benchmarking, then it ought
to use statistically significant methodologies, along the same lines as
any other benchmark or test done by government agencies and national
standards bodies, so that the results are reproducible and dependable.
The question is, since this is a hot-button issue: how do we go about
getting 'the message' across, how do we go about engineering something
that could be considered statistically relevant, and most importantly,
how do we get this accepted by non-technical legislators and regulators?
Mike,
For general tests of most things an ISP does, ICSI's netalyzr tests
can't be beat.
http://netalyzr.icsi.berkeley.edu/
There are also tests at m-lab that may be useful:
http://www.measurementlab.net/
As with all software, these tests may have bugs; netalyzr was
under-detecting bufferbloat on high-bandwidth links until recently. That
should be fixed now, I hope.
And SamKnows is doing the FCC broadband tests.
The speedtest.net tests (and pingtest.net) are good as far as they go
(and you can host them someplace yourself; as others have noted, having
an endpoint at someplace you control is wise); but they don't tell the
whole story: they miss a vital issue that has been hidden.
Here's the rub:
Most tests have focused on bandwidth (now misnamed "speed" by
marketing, which it isn't).
Some tests have measured latency.
But there have been precious few that test latency under load, which is
how we've gotten into a world of hurt on broadband over the last decade:
we now have a situation where a large fraction of broadband has
latencies under load measured in *seconds*. (See:
http://gettys.wordpress.com/ and bufferbloat.net.) This makes for
fuming retail customers, as well as lots of service calls (I know, I
generated quite a few myself over the years). It is a killer for lots
of applications: VoIP, teleconferencing, gaming, remote desktop hosting,
etc.
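
The core of a latency-under-load test is simple: measure RTT while a
bulk transfer keeps the bottleneck queue full, and compare against the
idle RTT. Below is a minimal sketch in Python; the sink host and
discard-style port are assumptions (use an endpoint you control), and
it shells out to the system ping binary rather than opening raw ICMP
sockets.

#!/usr/bin/env python3
# Rough latency-under-load probe. Assumptions: SINK_HOST runs a TCP
# service that just swallows bytes (e.g. discard on port 9, or any
# sink you control), and the local 'ping' binary accepts -c.
import re
import socket
import subprocess
import threading
import time

SINK_HOST = "sink.example.net"   # hypothetical endpoint you control
SINK_PORT = 9                    # discard-style service (assumption)

def saturate(stop):
    # Push bulk data upstream so the bottleneck queue stays full.
    s = socket.create_connection((SINK_HOST, SINK_PORT))
    chunk = b"\0" * 65536
    while not stop.is_set():
        s.sendall(chunk)
    s.close()

def rtt_ms(host):
    # One ICMP probe; parse "time=XX ms" from ping's output.
    out = subprocess.run(["ping", "-c", "1", host],
                         capture_output=True, text=True).stdout
    m = re.search(r"time=([\d.]+)", out)
    return float(m.group(1)) if m else None

idle = [rtt_ms(SINK_HOST) for _ in range(5)]
stop = threading.Event()
threading.Thread(target=saturate, args=(stop,), daemon=True).start()
time.sleep(2)                    # give the queue time to fill
loaded = [rtt_ms(SINK_HOST) for _ in range(5)]
stop.set()
print("idle RTTs (ms):  ", idle)
print("loaded RTTs (ms):", loaded)

If the loaded RTTs come back hundreds of milliseconds (or seconds)
above idle, something in the path is buffering far too much.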
Netalyzr tries to test for excessive buffering, as does at least one of
the M-Lab tests.
Dave Clark and I have been talking to SamKnows and Ookla to try to get
latency-under-load tests added to the mix. I think we've been gaining
some traction, but it's slightly too soon to tell.
We also need tests to identify ISPs failing to run queue management
internal to their networks; both research and anecdotal data show that
this is much more common than it should be. Some ISPs do a wonderful
job, and others don't. Van Jacobson believes this is because the
classic RED algorithm he designed with Sally Floyd is buggy, and tuning
it has scared many operators off; I believe his explanation.
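
For context on why the tuning is delicate: classic RED keeps a weighted
moving average of queue length and drops arriving packets with a
probability that ramps up between two thresholds, so min_th, max_th,
max_p, and the averaging weight all interact with the traffic mix. A
minimal sketch of the drop decision (parameter values are illustrative,
not recommendations):

# Minimal sketch of the classic RED drop decision (Floyd & Jacobson,
# 1993). Parameter values here are placeholders, not recommendations.
import random

class Red:
    def __init__(self, min_th=5.0, max_th=15.0, max_p=0.02, w_q=0.002):
        self.min_th, self.max_th = min_th, max_th
        self.max_p, self.w_q = max_p, w_q
        self.avg = 0.0    # EWMA of instantaneous queue length
        self.count = 0    # packets enqueued since the last drop

    def should_drop(self, qlen):
        # Called per arriving packet with the current queue length.
        self.avg = (1 - self.w_q) * self.avg + self.w_q * qlen
        if self.avg < self.min_th:
            self.count = 0
            return False
        if self.avg >= self.max_th:
            self.count = 0
            return True            # forced drop above max threshold
        pb = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        self.count += 1
        pa = pb / max(1e-9, 1.0 - self.count * pb)  # spread drops out
        if random.random() < min(1.0, pa):
            self.count = 0
            return True
        return False

Get any of those four knobs wrong for the traffic you actually carry
and RED either drops too aggressively or never engages at all, which is
exactly what has scared operators off.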
So far, so bad.
Then there is the home router/host disaster:
As soon as you move the bottleneck link from the broadband hop to the
802.11 link usually beyond it these days (either because broadband
bandwidth has risen, or because, as in my house, several chimneys drop
the wireless bandwidth), you run into the fact that home routers, and
even our operating systems, sometimes have even worse buffering than
the broadband gear, sometimes measured in hundreds or even thousands of
*packets*.
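
The arithmetic is what makes those numbers alarming: a full buffer adds
delay equal to its size in bits divided by the drain rate of the link.
A quick back-of-the-envelope (packet counts and link rates below are
illustrative):

# Back-of-the-envelope: delay contributed by a full buffer is its size
# in bits divided by the drain rate. Numbers below are illustrative.
def queue_delay_ms(packets, pkt_bytes, link_mbps):
    return packets * pkt_bytes * 8 / (link_mbps * 1e6) * 1000

print(queue_delay_ms(256, 1500, 10))   # 256-pkt queue at 10 Mb/s: ~307 ms
print(queue_delay_ms(1000, 1500, 1))   # 1000-pkt queue at 1 Mb/s: 12 seconds

A thousand-packet device queue draining at 1 Mb/s is twelve seconds of
latency, which is how you get RTTs measured in seconds.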
We're going to need to fix the home routers and users' operating
systems. For the 802.11 case, this is hard; Van says RED won't hack it,
and we need better algorithms, whether Van's unpublished nRED algorithm
or Doug Leith's recent work.
So you need to ensure the regulators understand that doing testing
carefully enough to know what you are looking at is hard. Tests not run
directly against the broadband gear may conflate this problem with the
broadband connection itself.
This is not to say tests should not be done: we're not going to get this
swamp drained without the full light of day on the issue. It's just
that the current speedtest.net tests miss this entire issue (though
they may detect it in the future), and that today's tests aren't
something you "just run" to get a simple answer, since the problem can
be anywhere in a path.
Maybe there will be tests that "do the right thing" for regulators in a
year or two, but not now: today's tests don't identify which link is at
fault, and the problem can easily be entirely inside the customer's
house - if the test checks for bufferbloat at all.
I think it very important we get tests together that not only detect
bufferbloat (which is very easy to detect, once you know how), but also
point to where in the network the problem is occurring. That would
reduce the rate of complaints to something manageable, so that nobody
has to field calls for problems they aren't responsible for (and are
unable to fix).
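
Localization can start crudely: probe successive hops while the link is
loaded, and the first hop whose RTT balloons brackets the bloated
queue. A sketch, with hop addresses that are placeholders (substitute
your own LAN gateway, the ISP's first hop from traceroute, and a far
endpoint you control):

# Crude localization: probe successive hops while the link is loaded.
# The hop addresses are assumptions; substitute your own LAN gateway,
# the ISP's first hop (from traceroute), and a far endpoint you trust.
import re
import subprocess

HOPS = [
    ("home router  ", "192.168.1.1"),
    ("ISP first hop", "203.0.113.1"),
    ("far endpoint ", "sink.example.net"),
]

def median_rtt_ms(host, n=5):
    rtts = []
    for _ in range(n):
        out = subprocess.run(["ping", "-c", "1", host],
                             capture_output=True, text=True).stdout
        m = re.search(r"time=([\d.]+)", out)
        if m:
            rtts.append(float(m.group(1)))
    rtts.sort()
    return rtts[len(rtts) // 2] if rtts else None

for name, host in HOPS:
    print(name, median_rtt_ms(host), "ms")
# Reading it: if the home router's RTT stays flat under load but the
# ISP first hop balloons, the overfull queue is in the broadband gear;
# if even the LAN gateway balloons, the problem is inside the house.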
You can look at a talk about bufferbloat I gave recently at:
http://mirrors.bufferbloat.net/Talks/BellLabs01192011/
Let me know if I can be of help. People who want to help with the
bufferbloat problem should also note that we recently opened the
bufferbloat.net web site to aid collaboration on this problem.
Best regards,
Jim Gettys
Bell Labs
On 01/06/2011 01:50 PM, Van Jacobson wrote:
> Jim,
>
> Here's the Doug Leith paper I mentioned. As I said on the phone I
> think there's an easier, more robust way to accomplish the same
> thing but they have running code and I don't. You can get their
> mad-wifi implementation at
> http://www.hamilton.ie/tianji_li/buffersizing.html
>
> - van