Dean Anderson wrote:
I think this may be of interest. It was off-list, so I won't identify
the author I am responding to.
[Did you think to perhaps ask the author first? He/she may have been
willing to be identified...]
I. Harm only possible for EDNS0; Update RFC 2671 Instead
The maximum non-EDNS amplification factor is 8
8x can be significant.
Yes. But they can get more than that from a couple hundred+ root
servers.
You miss the point. Your original comment was 100% wrong. You even admit
as much, but don't seem to realize it.
"Not possible" you wrote.
"Can be" he/she replied.
"Yes, but [...]" you wrote.
The fact that something else is a *bigger* risk doesn't have any
bearing on whether the first thing is a risk.
Consider:
Banks have *way* more money than convenience stores. (That fact is even
common knowledge.)
However, it doesn't seem to have much impact on their respective robbery
rates.
Furthermore, the real danger here is the ability to mount a
_distributed_ attack, in which large numbers of servers send bogus
responses at a rate far beyond that which the original authoritative server
could manage. That requires caching servers; a handful of authoritative
servers mostly on the same network won't cut it.
Caching servers are not a requirement for a distributed attack. To
conduct the attack with any servers (caching or authority), one still
needs a botnet to send the spoofed source packets. This botnet is
amplified by the same factor, whatever type of server is used. Thus,
the type of server is irrelevant to the damage caused.
In any DDoS attack, the multiplier effect is not unconstrained. If the
devices being used *as* the multipliers are shared among the
participants, then the aggregate result can, and likely will, be limited
by those devices. A botnet of 100,000 hosts using a set of 20 servers,
each on a 1 Gbps link, can generate at most 20 * 1 Gbps of traffic, no
matter how massive the aggregate rate of the botnet is.
At some point there is even a crossover, where the botnet could do more
damage than the so-called multiplier. If each of the 100,000 bots could
send 33 kbps, i.e. dial-up speeds, that would be 33 kbps * 100,000, or
3.3 Gbps. Scale the botnet up to 1,000,000 hosts and it sends 33 Gbps
directly, which is more than that set of 20 servers could generate at
line rate.
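To make the crossover arithmetic concrete, here is a minimal Python
sketch; the amplification factor, link speeds, and host counts are just
the illustrative figures used above, not measurements:

    # A minimal sketch of the crossover arithmetic above; all numbers
    # are illustrative assumptions, not measurements.

    def reflected_gbps(bots, kbps_per_bot, amplification, reflectors,
                       reflector_gbps):
        """Reflected traffic is capped by the reflectors' aggregate line rate."""
        offered = bots * kbps_per_bot * amplification / 1e6  # Gbps elicited
        capacity = reflectors * reflector_gbps               # Gbps emittable
        return min(offered, capacity)

    def direct_gbps(bots, kbps_per_bot):
        """Traffic the botnet can send directly, with no reflection at all."""
        return bots * kbps_per_bot / 1e6

    # 100,000 bots at dial-up speed against 20 reflectors on 1 Gbps links:
    print(direct_gbps(100_000, 33))               # 3.3 Gbps direct
    print(reflected_gbps(100_000, 33, 8, 20, 1))  # 20.0 Gbps, capped
    # At 1,000,000 bots, the direct traffic alone exceeds the reflector cap:
    print(direct_gbps(1_000_000, 33))             # 33.0 Gbps

With these inputs, the reflected traffic saturates the 20 Gbps reflector
cap long before the botnet does, and past the crossover point the botnet
does more damage on its own.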
For example, if an anycast nameserver address is backed by three
servers and one of them is down, but the routing is such that only 10%
of requests go to the down server, then on average, clients will see
10% of their requests dropped, not one third.
This is overly simplistic. It is not true for stateful DNS traffic,
because packets can always be routed to multiple anycast instances.
Earlier, I had thought that this could only happen with PPLB, and that
in theory (per RFC 1812) it could happen at any time. I've since tested
this experimentally, and the theory was right. Even routers that route
using flow caches expire those cache entries every 60 seconds. Send 2
packets more than 60 seconds apart, and they can go to different
servers. I've detected anycast open recursors this way.
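For reference, a rough Python sketch of the experiment described here.
It assumes the third-party dnspython package and a placeholder server
address; it asks for NSID (RFC 5001) so that a change of anycast
instance shows up in the responses:

    # Send two identical NSID-bearing queries more than 60 seconds apart;
    # if flow-cache entries have expired in between, the packets can reach
    # different anycast instances. Assumes dnspython; the server address
    # below is a placeholder.
    import time
    import dns.edns
    import dns.message
    import dns.query

    SERVER = "192.0.2.1"  # placeholder: the anycast address under test

    def query_nsid(server):
        q = dns.message.make_query(".", "SOA")
        # Request the NSID EDNS option (option code 3) with an empty payload.
        q.use_edns(0, options=[dns.edns.GenericOption(3, b"")])
        r = dns.query.udp(q, server, timeout=5)
        for opt in r.options:
            if opt.otype == 3:
                # Attribute name varies across dnspython versions.
                return getattr(opt, "data", None) or getattr(opt, "nsid", None)
        return None

    first = query_nsid(SERVER)
    time.sleep(61)  # outlast a typical 60-second flow-cache expiry
    second = query_nsid(SERVER)
    print("same instance" if first == second
          else "different instances: %r vs %r" % (first, second))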
The vast majority of equipment used on high-capacity links is neither
PPLB nor flow-cache switched. It is switched via a pre-computed FIB in
hardware (ASIC), using TCAM. Those entries do not expire. They *do*
periodically change when route selection to a destination changes,
something that happens often enough in the aggregate, but not often on
individual prefixes.
The PPLB and cache-aging occur near the edge, on smaller networks. There
are a very limited number of locations where this is even theoretically
visible, and in most cases the solution is under operator control (e.g.
deploy software that does CEF/dCEF instead of cached switching, and
possibly upgrade obsolete hardware that is unable to do CEF and is now
end-of-life).
But to tell how bad the anycast problem is on an authority server (such
as the root or TLD servers), one needs to uniquely identify (with NSID)
the instance each query goes to, and measure how often one gets a
different server.
No, one does not need to identify specific servers. It is sufficient to
*disambiguate* servers, to separate out "loci", in the vernacular. Once
disambiguated, their specific identities are not of primary importance.
And it is not difficult to disambiguate anycast instances, given a
sufficient number of places from which to observe. Traceroutes, TTL
values on DNS responses, and round-trip times can all be used to help
isolate instances from each other.
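As one illustration, here is a Linux-only Python sketch (IP_RECVTTL is
platform-specific, and the server address is a placeholder) that
collects two of those signals, the IP TTL and the round-trip time of a
response, for a hand-built query of the root SOA:

    # Probe a nameserver and report the received IP TTL and the RTT.
    # Distinct anycast instances tend to sit at different hop counts and
    # distances, so clusters in (ttl, rtt) seen from enough vantage
    # points help tell instances apart. Linux-only: uses IP_RECVTTL.
    import socket
    import struct
    import time

    def probe(server, port=53):
        # Minimal DNS query for ". SOA IN", built by hand:
        # header (id, flags, qd/an/ns/ar counts) + question.
        header = struct.pack(">HHHHHH", 0x1234, 0x0000, 1, 0, 0, 0)
        question = b"\x00" + struct.pack(">HH", 6, 1)  # root, SOA, IN
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.setsockopt(socket.IPPROTO_IP, socket.IP_RECVTTL, 1)
        s.settimeout(5)
        start = time.monotonic()
        s.sendto(header + question, (server, port))
        data, ancdata, flags, addr = s.recvmsg(4096, 1024)
        rtt_ms = (time.monotonic() - start) * 1000
        ttl = None
        for level, ctype, cdata in ancdata:
            if level == socket.IPPROTO_IP and ctype == socket.IP_TTL:
                ttl = struct.unpack("i", cdata[:4])[0]
        return ttl, rtt_ms

    print(probe("192.0.2.1"))  # placeholder server address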
It would be nearly impossible to distinguish servers *within* an
anycast instance, e.g. behind any kind of load balancer. And even
without anycast, load balancers would contribute to the same occasional
problem of stale flow-state removal affecting stateful DNS traffic. Not
just in theory, but in practice. Try doing DNS queries from very close
to a given authority server. Do so over TCP, with a connection that
lasts more than, for example, 300 seconds. See if what you observe looks
the same as your alleged "anycast" bogeyman.
[I invite you to put your reputation on the line. Build a tool that
does the above, and have it send mail, as you, to dnsop. Run it blindly,
i.e. without advance knowledge of the results. View the results the same
as the rest of us, on the list. I double dog dare you.]
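A sketch of that long-lived TCP test, for anyone who wants to try it
(the address is a placeholder, and a real tool would want more careful
I/O and logging):

    # Hold one TCP connection to a nameserver open for more than 300
    # seconds, querying before and after the wait. If an intervening
    # load balancer or flow cache drops its state, the second exchange
    # fails even though the server itself never moved.
    import socket
    import struct
    import time

    def recv_exact(sock, n):
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed the connection")
            buf += chunk
        return buf

    def tcp_query(sock):
        header = struct.pack(">HHHHHH", 0x4242, 0x0000, 1, 0, 0, 0)
        question = b"\x00" + struct.pack(">HH", 6, 1)  # ". SOA IN"
        msg = header + question
        sock.sendall(struct.pack(">H", len(msg)) + msg)  # RFC 1035 framing
        (length,) = struct.unpack(">H", recv_exact(sock, 2))
        return recv_exact(sock, length)

    s = socket.create_connection(("192.0.2.1", 53), timeout=10)  # placeholder
    print("first reply:", len(tcp_query(s)), "bytes")
    time.sleep(310)  # outlast typical flow/session timeouts
    try:
        print("second reply:", len(tcp_query(s)), "bytes")
    except OSError as e:
        print("connection state was lost:", e)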
And what exactly *is* "the anycast problem"?
You refer to "the anycast problem" here as if it is common knowledge.
I am not familiar with the actual existence, let alone the details, of
an anycast problem. Yet you refer to it as *the* anycast problem.
Could you elaborate on what it is that you are referring to here?
I think the dnsop folks share an interest in fully exploring problems,
if they exist, and this is the right venue.
If you are quite certain that there is a problem, and you do seem to
be, can you do a better job of identifying it than just using the term
"the anycast problem"?
What is the scope?
What is the nature?
How does one detect it?
What are the problems observed?
Where are the boundaries of the problem space?
What is excluded from the problem space?
Don't just answer these off-hand, one at a time. Take the time. Write
down your ideas in a cohesive,
structured manner. Present them to the list. Provide data backing up
your anecdotes. Statistics are good.
Especially when the methodology for how they were gathered is included,
as well as locality, timestamps,
etc., so the data points can be verified and possibly cross-referenced
against routing activity at the time.
Any trend in behavior of name resolution, which has a relationship to
routing events, is something that
the folks on dnsop have more than a passing interest in.
Alternatively, one can indirectly use TCP, or use reflectors, to make
measurements. These measurements can't be as accurate as NSID
measurements would be.
Now, the ability to distinguish between servers is useful if you are trying
to determine the cause of the failures, but is completely unnecessary for
determining that the service has 90% reliability.
Incorrect, as shown above. Besides, we expect high availability from
root servers. Anycast appears to give no better than about 97% over TCP
under ideal conditions (from a paper presented at NANOG by an anycast
HTTP advocate). It might not even be that good. A figure of 90% would be
abysmally bad; 3% packet loss is usually unserviceable.
The 90% in the original author's example was meant to illustrate the
measurable impact on reliability of a specific failure rate in one
component, to show that there isn't always a linear relationship. The
author never ascribed a specific rate to the availability of the root
servers, and certainly not 90%.
Red herring. Let this "90%" sub-thread die, please. Or maybe you didn't
realize what the OA meant?
Brian
_______________________________________________
DNSOP mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/dnsop