On Jul 29, 2009, at 12:34 AM, Bill Woodcock wrote:
> So I've embarked on the no-doubt-futile task of trying to interpret
> SLAs as empirically-verifiable technical specifications, rather than
> as marketing blather. And there's something that I'm finding
> particularly puzzling:
> In most SLAs, there seem to be two separate guarantees proffered:
> one concerning "network availability" and one concerning "packet
> loss." Now, if I were to put my engineer hat on, and try to
> _imagine_ what the difference might be, I might imagine "network
> availability" to have something to do with layer-2 link status being
> presented as "up," while packet loss would be the percentage of
> packets dropped. But when I actually read SLAs, "network
> availability" is generally defined as the portion of the month that
> the path from the customer's local loop to the transit or peering
> routers was "available" to transmit packets. Packet loss, on the
> other hand, is generally defined as the portion of packets which are
> lost while crossing that exact same piece of network.
> Now, what am I missing here? Is this one of those Heisenberg
> things, where "network availability" is the time the network _could
> have_ delivered a packet _when you weren't actually doing so_, while
> "packet loss" is the time the network _couldn't_ deliver a packet
> when you _were_ actually doing so?
> Is "network availability" inherently unmeasurable on a network
> that's less than 100% utilized?
> Am I over-thinking this?
Yes. Not because you are coming to strange conclusions, but because
(as you say in your first sentence) you are trying to put
empirical / objective meaning to marketing blather.
I had a simple way to fix this. I defined a network as "down" with
more than X% packet loss (usually with X in the 2-5 range, depending
on other deal parameters). IMHO, a network with 5% packet loss -is-
down. I don't know about you, but none of my customers will use my
service if they have 5% loss. TCP is finicky! This receives the
strongest credit because you cannot use the service.
Below X, you are not "down", just degraded, and therefore the link has
some utility, but not 100% utility. This receives a credit, but not
as strong a credit as being unable to use a link.
Oh, and, of course, if there is no light on the fiber, then we are
(obviously) "down" as well.
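
Since the question was how to translate this into code, here is a
minimal Python sketch of the rule above. Everything named in it is
hypothetical and only for illustration: the Sample record, the
classify() and availability() helpers, and the default X of 5.0
(picked from the 2-5 range). It also assumes that loss of exactly X
counts as degraded, that zero loss counts as up, and that all
measurement intervals are the same length. Actual credit amounts are
deal-specific, so they are left out.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    UP = "up"
    DEGRADED = "degraded"
    DOWN = "down"


@dataclass
class Sample:
    """One measurement interval (hypothetical shape, not from any real SLA)."""
    sent: int              # probe packets sent during the interval
    received: int          # probe packets that made it across
    link_up: bool = True   # False means no light on the fiber


def loss_pct(s: Sample) -> float:
    """Packet loss for one interval, as a percentage of packets sent."""
    if s.sent == 0:
        raise ValueError("no packets sent; loss is undefined for this interval")
    return 100.0 * (s.sent - s.received) / s.sent


def classify(s: Sample, down_loss_pct: float = 5.0) -> State:
    """Apply the rule: no light or loss above X is DOWN, lower loss is DEGRADED."""
    if not s.link_up:
        return State.DOWN          # no light: down regardless of loss figures
    loss = loss_pct(s)
    if loss > down_loss_pct:
        return State.DOWN          # "more than X%" is treated as unusable
    if loss > 0:
        return State.DEGRADED      # some utility, but not 100% utility
    return State.UP


def availability(samples, down_loss_pct: float = 5.0) -> float:
    """Fraction of intervals not classified DOWN (equal-length intervals assumed)."""
    if not samples:
        raise ValueError("need at least one interval")
    not_down = sum(classify(s, down_loss_pct) is not State.DOWN for s in samples)
    return not_down / len(samples)
```

With this shape, "network availability" and "packet loss" stop being
two independent numbers: availability is derived from the per-interval
loss, and the state of each interval decides which credit tier applies.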
Make sense?
Or am I over-thinking it? :)
--
TTFN,
patrick
P.S. Now you get to think about things like "packet loss to / from
where?" and whether the last mile should count.
> Seriously, though, I know there are people who don't consider SLAs
> to be fantasy-fiction, and some of them must not be innumerate, and
> some subset of those must be on NANOG, and the intersection set
> might be equal to or greater than one, right? Can anybody explain
> this to me in a way I can translate into code, while still taking
> myself seriously?
> -Bill