On 01/29/2011 01:00 PM, Mike wrote:
Hello,
My company is a small CLEC / broadband provider serving rural communities
in northern California, and we are the recipient of a small grant from
the state through our public utilities commission. We went out to the
'middle of nowhere' and deployed ADSL2+ (chalk one up for the good
guys!), and now that we're done, our state PUC wants to gather
performance data to evaluate the results of our project and ensure we
delivered what we said we would. Bigger picture, our state is actively
attempting to map broadband availability and service levels, and this
data will factor into that overall picture, to be used for future
grant/loan programs and other support mechanisms, so this really is
going to touch every provider who serves end users in the state.
The rub is that they want to legislate that the web-based 'speedtest.com'
is the ONLY and MOST AUTHORITATIVE metric, one that trumps all other
considerations, and that the provider is 100% at fault and responsible
for making fraudulent claims if speedtest.com doesn't agree. No
discussion is permitted about sync rates, packet loss, internet
congestion, provider route diversity, end-user computer performance
problems, far-end congestion issues, far-end server issues or CPU
loading, latency/RTT, or the like. They are going to decide that the
quality of any provider's service rests solely and exclusively on the
numbers returned from 'speedtest.com', period.
All of you in this audience, I think, probably immediately understand
the various problems with such an assertion. It's one of those situations
where - to the uninitiated - it SEEMS LIKE this is the right way to do
this, and it SEEMS LIKE there's some validity to what's going on - but in
practice, we engineering types know it's a far different animal and
should not be used for real live benchmarking of any kind where there is
a demand for statistical validity.
My feeling is that if the state needs to do benchmarking, then it ought
to use statistically significant methodologies, along the same lines as
any other benchmark or test done by government agencies and national
standards bodies, so that the results are reproducible and dependable.
The question is, since this is a hot-button issue: how do we go about
getting 'the message' across, how do we go about engineering something
that could be considered statistically relevant, and most importantly,
how do we get this accepted by non-technical legislators and regulators?
Mike,
For general tests of most things an ISP does, ICSI's netalyzr tests
can't be beat.
http://netalyzr.icsi.berkeley.edu/
There are also tests at m-lab that may be useful:
http://www.measurementlab.net/
As with all software, these tests may have bugs; netalyzr was
under-detecting bufferbloat on high-bandwidth links until recently. That
should be fixed now, I hope.
And SamKnows is doing the FCC broadband tests.
The speedtest.net tests (and pingtest.net) are good as far as they go
(and you can host them someplace yourself; as others have noted, having
an endpoint at someplace you control is wise); but they don't tell the
whole story: they miss a vital issue that has been hidden.
Here's the rub:
Most tests have focused on bandwidth (now misnamed "speed" by
marketing, which it isn't).
Some tests have measured latency.
But there have been precious few that test latency under load, which is
how we've gotten into a world of hurt on broadband over the last decade:
we now have a situation where a large fraction of broadband has
latencies under load measured in *seconds*. (See:
http://gettys.wordpress.com/ and bufferbloat.net.) This makes for
fuming retail customers, as well as lots of service calls (I know, I
generated quite a few myself over the years). It is a killer for lots
of applications: VoIP, teleconferencing, gaming, remote desktop hosting,
etc.
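
The core of a latency-under-load test is simple: measure RTT while a
bulk transfer keeps the bottleneck queue full, and compare against the
idle RTT. Below is a minimal sketch in Python; the sink host and
discard-style port are assumptions (use an endpoint you control), and
it shells out to the system ping binary rather than opening raw ICMP
sockets.

#!/usr/bin/env python3
# Rough latency-under-load probe. Assumptions: SINK_HOST runs a TCP
# service that just swallows bytes (e.g. discard on port 9, or any
# sink you control), and the local 'ping' binary accepts -c.
import re
import socket
import subprocess
import threading
import time

SINK_HOST = "sink.example.net"   # hypothetical endpoint you control
SINK_PORT = 9                    # discard-style service (assumption)

def saturate(stop):
    # Push bulk data upstream so the bottleneck queue stays full.
    s = socket.create_connection((SINK_HOST, SINK_PORT))
    chunk = b"\0" * 65536
    while not stop.is_set():
        s.sendall(chunk)
    s.close()

def rtt_ms(host):
    # One ICMP probe; parse "time=XX ms" from ping's output.
    out = subprocess.run(["ping", "-c", "1", host],
                         capture_output=True, text=True).stdout
    m = re.search(r"time=([\d.]+)", out)
    return float(m.group(1)) if m else None

idle = [rtt_ms(SINK_HOST) for _ in range(5)]
stop = threading.Event()
threading.Thread(target=saturate, args=(stop,), daemon=True).start()
time.sleep(2)                    # give the queue time to fill
loaded = [rtt_ms(SINK_HOST) for _ in range(5)]
stop.set()
print("idle RTTs (ms):  ", idle)
print("loaded RTTs (ms):", loaded)

If the loaded RTTs come back hundreds of milliseconds (or seconds)
above idle, something in the path is buffering far too much.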
Netalyzr tries to test for excessive buffering, as does at least one of
the M-Lab tests.
Dave Clark and I have been talking to SamKnows and Ookla to try to get
latency-under-load tests added to the mix. I think we've been gaining
some traction, but it's slightly too soon to tell.
We also need tests to identify ISPs failing to run queue management
internal to their networks; both research and anecdotal data show that
this is much more common than it should be. Some ISPs do a wonderful
job, and others don't. Van Jacobson believes this is because the
classic RED algorithm he designed with Sally Floyd is buggy, and tuning
it has scared many operators off; I believe his explanation.
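
For context on why the tuning is delicate: classic RED keeps a weighted
moving average of queue length and drops arriving packets with a
probability that ramps up between two thresholds, so min_th, max_th,
max_p, and the averaging weight all interact with the traffic mix. A
minimal sketch of the drop decision (parameter values are illustrative,
not recommendations):

# Minimal sketch of the classic RED drop decision (Floyd & Jacobson,
# 1993). Parameter values here are placeholders, not recommendations.
import random

class Red:
    def __init__(self, min_th=5.0, max_th=15.0, max_p=0.02, w_q=0.002):
        self.min_th, self.max_th = min_th, max_th
        self.max_p, self.w_q = max_p, w_q
        self.avg = 0.0    # EWMA of instantaneous queue length
        self.count = 0    # packets enqueued since the last drop

    def should_drop(self, qlen):
        # Called per arriving packet with the current queue length.
        self.avg = (1 - self.w_q) * self.avg + self.w_q * qlen
        if self.avg < self.min_th:
            self.count = 0
            return False
        if self.avg >= self.max_th:
            self.count = 0
            return True            # forced drop above max threshold
        pb = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        self.count += 1
        pa = pb / max(1e-9, 1.0 - self.count * pb)  # spread drops out
        if random.random() < min(1.0, pa):
            self.count = 0
            return True
        return False

Get any of those four knobs wrong for the traffic you actually carry
and RED either drops too aggressively or never engages at all, which is
exactly what has scared operators off.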
So far, so bad.
Then there is the home router/host disaster:
As soon as you move the bottleneck link from the broadband hop to the
802.11 link usually beyond it these days (either because broadband
bandwidth has risen, or because, as in my house, several chimneys drop
the wireless bandwidth), you run into the fact that home routers, and
even our operating systems, sometimes have even worse buffering than
the broadband gear, sometimes measured in hundreds or even thousands of
*packets*.
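
The arithmetic is what makes those numbers alarming: a full buffer adds
delay equal to its size in bits divided by the drain rate of the link.
A quick back-of-the-envelope (packet counts and link rates below are
illustrative):

# Back-of-the-envelope: delay contributed by a full buffer is its size
# in bits divided by the drain rate. Numbers below are illustrative.
def queue_delay_ms(packets, pkt_bytes, link_mbps):
    return packets * pkt_bytes * 8 / (link_mbps * 1e6) * 1000

print(queue_delay_ms(256, 1500, 10))   # 256-pkt queue at 10 Mb/s: ~307 ms
print(queue_delay_ms(1000, 1500, 1))   # 1000-pkt queue at 1 Mb/s: 12 seconds

A thousand-packet device queue draining at 1 Mb/s is twelve seconds of
latency, which is how you get RTTs measured in seconds.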
We're going to need to fix the home routers and users' operating
systems. For the 802.11 case, this is hard; Van says RED won't hack it,
and we need better algorithms, whether Van's unpublished nRED algorithm
or Doug Leith's recent work.
So you need to ensure the regulators understand that doing testing
carefully enough to know what you are looking at is hard. Tests not run
directly against the broadband gear may conflate this problem with the
broadband connection itself.
This is not to say tests should not be done: we're not going to get this
swamp drained without the full light of day on the issue. It's just
that the current speedtest.net tests miss this entire issue (though
they may detect it in the future), and that today's tests aren't
something you "just run" to get a simple answer, since the problem can
be anywhere in a path.
Maybe there will be tests that "do the right thing" for regulators in a
year or two, but not now: today's tests don't identify which link is at
fault, and the problem can easily be entirely inside the customer's
house - if the test checks for bufferbloat at all.
I think it very important we get tests together that not only detect
bufferbloat (which is very easy to detect, once you know how), but also
point to where in the network the problem is occurring. That would
reduce the rate of complaints to something manageable, so that nobody
has to field calls for problems they aren't responsible for (and are
unable to fix).
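
Localization can start crudely: probe successive hops while the link is
loaded, and the first hop whose RTT balloons brackets the bloated
queue. A sketch, with hop addresses that are placeholders (substitute
your own LAN gateway, the ISP's first hop from traceroute, and a far
endpoint you control):

# Crude localization: probe successive hops while the link is loaded.
# The hop addresses are assumptions; substitute your own LAN gateway,
# the ISP's first hop (from traceroute), and a far endpoint you trust.
import re
import subprocess

HOPS = [
    ("home router  ", "192.168.1.1"),
    ("ISP first hop", "203.0.113.1"),
    ("far endpoint ", "sink.example.net"),
]

def median_rtt_ms(host, n=5):
    rtts = []
    for _ in range(n):
        out = subprocess.run(["ping", "-c", "1", host],
                             capture_output=True, text=True).stdout
        m = re.search(r"time=([\d.]+)", out)
        if m:
            rtts.append(float(m.group(1)))
    rtts.sort()
    return rtts[len(rtts) // 2] if rtts else None

for name, host in HOPS:
    print(name, median_rtt_ms(host), "ms")
# Reading it: if the home router's RTT stays flat under load but the
# ISP first hop balloons, the overfull queue is in the broadband gear;
# if even the LAN gateway balloons, the problem is inside the house.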
You can look at a talk about bufferbloat I gave recently at:
http://mirrors.bufferbloat.net/Talks/BellLabs01192011/
Let me know if I can be of help. People who want to help with the
bufferbloat problem should also note that we recently opened the
bufferbloat.net web site to aid collaboration on this problem.
Best regards,
Jim Gettys
Bell Labs
On 01/06/2011 01:50 PM, Van Jacobson wrote:
> Jim,
>
> Here's the Doug Leith paper I mentioned. As I said on the phone I
> think there's an easier, more robust way to accomplish the same
> thing but they have running code and I don't. You can get their
> mad-wifi implementation at
> http://www.hamilton.ie/tianji_li/buffersizing.html
>
> - van