> From: Haya Shulman <haya.shul...@gmail.com>

> > I'm puzzled by the explanation of Socket Overloading in
> > https://sites.google.com/site/hayashulman/files/NIC-derandomisation.pdf

> > I understand it to say that Linux on a 3 GHz CPU receiving 25,000
> > packets/second (500 bytes @ 100 Mbit/sec) spends so much time in
> > interrupt code that low level packet buffers overflow.
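A quick back-of-envelope check of those numbers, as arithmetic only: 100
Mbit/sec divided among 500-byte packets gives the quoted 25,000
packets/second, and counting the mandatory 802.3 preamble and inter-frame
gap pushes the achievable wire rate a little below that.  The sketch below
assumes "500 bytes" means the 802.3 frame itself; if it meant the IP
packet, the 14-byte Ethernet header and 4-byte CRC must be added as well.

    # Back-of-envelope arithmetic only.  Assumption: "500 bytes" is the
    # 802.3 frame; preamble+SFD and the minimum inter-frame gap still
    # occupy the wire on top of that.
    LINK_BPS    = 100_000_000   # 100BASE-T line rate in bits/second
    FRAME_BYTES = 500           # frame size assumed in the question above
    PREAMBLE    = 8             # preamble + start-of-frame delimiter, bytes
    IFG         = 12            # minimum inter-frame gap, bytes (96 bit times)

    naive_pps = LINK_BPS / (FRAME_BYTES * 8)
    wire_pps  = LINK_BPS / ((FRAME_BYTES + PREAMBLE + IFG) * 8)

    print(f"ignoring framing overhead: {naive_pps:,.0f} pps")   # 25,000
    print(f"counting preamble and IFG: {wire_pps:,.0f} pps")    # ~24,038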
> Just to clarify, the attacker ran (two to three sync-ed hosts, and the
> burst was split among those hosts).

No number of hosts sending packets can deliver more than 25,000 500-byte
packets/second to a single 100 Mbit/sec 802.3 host.  In fact, 802.3
preamble, headers, CRC, and IFG limit the 500-byte packet rate to below
25K pps.  However, multiple attacking hosts could cause excessive
link-layer contention (nothing to do with host or host network interface
"interrupts" or "buffers"), and so packet losses in either or both
directions for legitimate DNS traffic, and so the reported effects.

> > Could the packet losses have been due to the system trying to send
> > lots of ICMP Port-Unreachables?

> But, why would ICMP errors cause loss?

Sending ICMP packets requires resources, including wire and hub occupancy,
CPU cycles, "interrupts", kernel lock contention, kernel buffers, network
hardware buffers, and so on and so forth.  Any or all of that can increase
losses among the target DNS requests and responses.

> Inbound packets have higher priority over outbound packets.

I either do not understand that assertion or I disagree with it.  I would
also not understand, or would disagree with, the opposite claim.  At some
points in the paths between the wire and the application (or more
accurately, between the two applications on the two hosts), one could say
that input has higher or lower priority than output, but most of the time
the two paths contend, mostly first-come-first-served, for resources
including memory bandwidth, DMA engines, attention from the 802.3 state
machine, host/network firmware or hardware queues and locks, kernel locks,
application locks, and application thread scheduling.

> > How was it confirmed that kernel interrupt handling was the cause
> > of the packet losses instead of the application (DNS server) getting
> > swamped and forcing the kernel to drop packets instead of putting

> This is a good question. So, this evaluation is based on the following
> observation: when flooding closed ports, or other ports (not the ones on
> which the resolver expects to receive the response) - no loss was incurred,
> but all connections experience an additional latency; alternatively, when
> flooding the correct port - the response was lost, and the resolver would
> retransmit the request after a timeout.

Ok, so a ~100 Mbit/sec attack on non-DNSSEC DNS traffic succeeded on a
particular LAN.  Without more information, how can more be said?  Without
more data we should not talk about interrupts, I/O priority, or even
whether the attack would work on any other LAN.

> I used the default buffers in OS and resolver. So, you think that it could
> be that the loss was on the application layer?...

I avoid talking about "layers" above the link layer, because the phrases
are generally at best unclear and confusing.  At worst, the phrases are
smoke screens.  In this case, there is no need to talk about an
"application layer," because we are presumably talking about two
application programs that are BIND, NSD, and/or Unbound.  If BIND was
used, then I could (but would try not to) speculate about BIND's threading
and request/response handling and consequent request or response dropping.
Without data such as packet counts from standard tools such as netstat,
my bet is what I said before: the application fell behind, its socket
buffer overflowed, and the results were as seen.
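If the test hosts were Linux (as the original question assumes), the
kernel's own UDP counters would settle that bet more convincingly than
speculation.  The sketch below is only an illustration, not anything run
in the reported experiments; it reads /proc/net/snmp (the same counters
netstat -su prints) before and after a test and reports the deltas.
Exact field names such as RcvbufErrors vary with kernel version.

    # Minimal sketch, Linux-specific: compare the kernel's UDP counters
    # before and after a flood to see whether datagrams were dropped for
    # lack of socket buffer space rather than lost "in interrupts".

    def udp_counters(path="/proc/net/snmp"):
        """Return the Udp: counters from /proc/net/snmp as a dict."""
        with open(path) as f:
            lines = [line.split() for line in f if line.startswith("Udp:")]
        header, values = lines[0][1:], lines[1][1:]
        return dict(zip(header, (int(v) for v in values)))

    before = udp_counters()
    # ... run the flood / query test here ...
    after = udp_counters()

    for key in ("InDatagrams", "NoPorts", "InErrors", "RcvbufErrors"):
        print(key, after.get(key, 0) - before.get(key, 0))

    # A climbing RcvbufErrors (or InErrors) count during the flood points
    # at the resolver's full socket buffer; NoPorts counts datagrams sent
    # to closed ports, i.e. the ones that elicit ICMP Port-Unreachables.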
However, I would not bet too much, because there are many other places
where the DNS requests or responses could have been lost, including:

 - intentional rate limiting in the DNS server, perhaps even RRL
 - intentional rate limiting in the kernel, such as iptables
 - intentional rate limiting in a bridge ("hub") in the path
 - unintentional link layer rate limiting due to contention for bridge
   buffers or wires.

At full speed from the attacking systems, unrelated cross traffic through
hubs in the path or on the wires to the DNS server would cause packet
losses, including losses of valid answers, and so timeouts, and so the
observed effect.

> One of the main factors of the attack is `burst concentration`.

That suggests (but certainly does not prove) link layer contention instead
of my pet application socket buffer overflow.  (I mean overloading of, or
contention for, wires or hubs (or routers?).)

A meta-question should be considered.  How much time and attention should
be given to yet another attack that apparently requires 100 Mbit/sec
floods (I don't recall that this paper said how long the attack flood must
continue) and works only when DNSSEC is not used?  Many of us could
probably do more interesting things than fuzz DNS caches given access to
the LAN where these tests were done--or to most LANs.  (By "another" I'm
referring to the mistaken reports that RRL+SLIP=1 is bad because of
non-DNSSEC cache corruption under 4-hour 100 Mbit/sec floods.)

Instead of looking for yet more obscure ways (e.g. 100 Mbit/sec floods on
LANs) in which non-DNSSEC DNS is insecure, why not enable DNSSEC and
declare victory?

Vernon Schryver    v...@rhyolite.com