> On 11 Jul 2019, at 4:00 am, Paul Vixie <p...@redbarn.org> wrote:
> 
> i like marka's proposed solution below, a lot. and muks' is also clever, 
> though requiring wire protocol changes. however, fujiwara-san's proposal 
> describes a broader array of fragmentation problems than just integrity, and 
> we should be looking at that broader array when making our plans.
> 
> i think there's a broader question about fragmentation itself. at the outset, 
> i knew, when creating EDNS0, that fragmentation was considered harmful:
> 
> https://www.hpl.hp.com/techreports/Compaq-DEC/WRL-87-3.pdf
> 
> noting, jeff mogul and chris kanterjiev (kent), authors of the above tech 
> report, were two of my mentors and bosses at d|i|g|i|t|a|l from 1988 to 1993, 
> so, i read all of their work, and i discussed my questions and objections
> with them. fragmentation was, in ipv4, harmful. there can be no argument at
> this late date. i sinned egregiously in EDNS0 by opening the door to
> fragmentation, and the results have been predictably painful and expensive
> for everybody.
> 
> however, IPv6 intended to, promised to, and claimed to, fix V4
> fragmentation's many defects. i must have been thinking optimistic thoughts
> about the IPv6 time line, and IPv6's promises and intentions, when i opened
> the fragmentation door in EDNS0. what actually then happened was that ICMPv6
> as required for PMTUD6 was not secure and could not be implemented, which
> means any fragmentation done in IPv6 (which unlike IPv4, is an endpoint-only
> activity) will be uninformed about path MTU. thus we make pessimistic
> assumptions like 1500 and 1220. and if that's the kind of fragmentation we
> can actually get, then it's a negative value, and fragmentation in IPv6 is
> as bad, for different reasons, as in IPv4.

IPv6 moved fragmentation from the routers to the originating node to allow for
faster routers.

As for PMTUD and DNS/UDP it was clear in 1998 that DNS servers would need a
mechanism to avoid PMTUD issues so I proposed draft-ietf-ipngwg-bsd-frag (1998)
which got incorporated into RFC 3542 (2003) as IPV6_USE_MIN_MTU.  Named uses
that mechanism when the OS supports it.
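
A minimal sketch of that socket option, in Python rather than named's own C.
It assumes the platform exposes IPV6_USE_MIN_MTU (BSD-derived stacks generally
do, Linux typically does not); request_min_mtu and open_dns_udp6_socket are
hypothetical helper names, not anything taken from named:

```python
import socket

# RFC 3542 option name. Python only exposes it when the platform headers
# define it, so look it up defensively instead of assuming it exists.
IPV6_USE_MIN_MTU = getattr(socket, "IPV6_USE_MIN_MTU", None)

IPV6_MIN_MTU = 1280  # minimum link MTU every IPv6 path must support


def request_min_mtu(sock):
    """Ask the stack to fragment this socket's packets at the IPv6 minimum MTU.

    Returns True if the option was applied, False if the platform lacks it
    or rejects it. A value of 1 means "always use the minimum MTU" in
    RFC 3542; 0 means "always do PMTUD"; -1 is the default behaviour.
    """
    if IPV6_USE_MIN_MTU is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_IPV6, IPV6_USE_MIN_MTU, 1)
    except OSError:
        return False
    return True


def open_dns_udp6_socket():
    """Hypothetical helper: create a UDP/IPv6 socket for a DNS responder."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    applied = request_min_mtu(sock)
    return sock, applied
```

With the option set, the sender fragments at 1280 octets and no longer depends
on PTB messages arriving for that socket. Where the option is missing the
helper returns False and the caller has to fall back to some other limit.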

Limiting transmitted EDNS/UDP responses just moves PMTUD issues to TCP unless
care is taken to limit MSS sizes.  This can be done by a number of mechanisms
but is not universally achievable.  DNS/TCP is not a panacea for DNS/UDP
fragmentation issues.
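
One of those mechanisms is a per-socket MSS clamp. The sketch below assumes
the TCP_MAXSEG socket option is available and honoured before listen() or
connect(), which varies by OS; clamped_mss and clamp_listener_mss are
hypothetical names, and the header sizes ignore extension headers and TCP
options:

```python
import socket

IPV6_MIN_MTU = 1280  # every IPv6 path must carry packets of this size
IPV6_HEADER = 40     # fixed IPv6 header, no extension headers assumed
TCP_HEADER = 20      # TCP header without options


def clamped_mss(local_mtu, assumed_path_mtu=IPV6_MIN_MTU):
    """MSS that fits the smaller of the local MTU and an assumed path MTU."""
    return min(local_mtu, assumed_path_mtu) - IPV6_HEADER - TCP_HEADER


def clamp_listener_mss(sock, local_mtu=1500):
    """Try to cap the MSS on a not-yet-connected TCP socket.

    Returns the MSS requested, or None if the platform lacks or rejects
    the option.
    """
    mss = clamped_mss(local_mtu)
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, mss)
    except (OSError, AttributeError):
        return None
    return mss
```

With a 1500 octet local MTU and the IPv6 minimum as the assumed path MTU this
asks for an MSS of 1220, so segments fit in 1280 octet packets without relying
on PMTUD. It only covers sockets the server itself controls, which is part of
why the approach is not universally achievable.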

PMTUD is an issue for a number of reasons.  Routers still have PTB (Packet Too
Big) generation on the slow path, which means that it ends up getting rate
limited.  PTB generation should have been moved into ASICs a decade ago.
Firewalls block PTB packets.  Packet load balancers fail to look into the ICMP
payload when forwarding PTB messages.

> therefore, for the reasons set out by fujiwara-san in his recent draft posted 
> here, and especially for the reasons spelled out by his extensive references, 
> DNS should not use fragmentation. while some of kazunori's examples have to
> do with message integrity and attacks such as the shulman method, the case
> against fragmentation in DNS's use of UDP is immensely strong. solving for
> the integrity problems doesn't change our conclusion, and adds more
> complexity.
> 
> i have two final notes, which may help inform those who witnessed the sham 
> consensus railroaded (soviet-style) through the recent DNS-OARC meeting in 
> bangkok, and heard me speak against outlawing fragmentation as a 2020 Flag
> Day goal, and are now hearing me contradict myself.
> 
> ---
> 
> first, we need fragmentation to work, which means we need path MTU discovery 
> to work, which means we need ICMP to be secure, at least in IPv6. while use
> of fragmentation for DNS UDP has a high cost, the intentional investment of
> that cost would be a beneficial forcing function on fixing fragmentation
> itself. notably, TCP avoids fragmentation through its MSS signaling, which
> defaults to MIN(myMTU, herMTU) minus some fudge factor for protocol headers.
> which means lack of fragmentation does not hurt the web, and so nobody cares
> about it. but we should, all, care about it. i'll explain further in an
> upcoming article which i'll link here, but briefly, 1500 is the wrong LAN MTU
> for FastE, and is insanely small for 1GE, unthinkably wrong for 10GE,
> laughable for 40GE, and engineering malpractice for 100GE. for a test, do a
> bunch of NFS and SMB tests, over both UDP and TCP for each protocol, using
> jumbo grams (9K MTU) and then again using standard (1500 MTU) sized data
> grams. watch for transfer speed, CPU utilization, and network utilization (as
> bits, not as packets).
> 
> i will at some point teach FreeBSD TCP how to fragment its first TCP segment 
> after synchronization, but only for IPv6. my goal is to force IPv4 fallback
> if IPv6 with all of its promised PMTUD and endpoint-only fragmentation does
> not work. let every network operator whose key performance indicators
> include IPv6 deployment levels, begin to fear that without its PMTUD
> promises, IPv6 is not good enough to replace IPv4, and they will have to plan
> on investments in dual-stack, _forever_.
> 
> ---
> 
> second, all mass is energy, and state in the network should be thought of as 
> having mass. PMTUD has some scale problems regarding endpoint state 
> requirements, and so, has to work well enough for fast LRU purges of state 
> required for endpoint MTU information, which will lead to rapid rediscovery. 
> but, TCP protocol control blocks are also state, and state has mass. a world 
> in which every recursive iterating server has long-running TCP/853 (DoT) 
> connections open to hundreds or thousands of authority servers is not going
> to be inexpensive, either for the initiators or the responders. the web works
> this way, but can require tens of gigabytes of kernel memory for the TCP
> state alone. that's not a good ratio of mass to value. importantly,
> fragmentation has another state mass cost, which is transmitting the
> fragments with enough inter-packet gap to avoid microbursts which overflow
> the switch port buffers, and receiving fragments which must be reassembled
> before they can be delivered. all of this is wrong.
> 
> william simpson, perry metzger, and paul vixie (me) worked together about ten 
> years ago to create TCP enhancements which would have permitted an unlimited 
> number of quiescent but open TCP connections, at a per-connection state cost 
> precisely equal to the cost of resisting a SYN flood attack. so, highly 
> compressed state, because state mass is a high cost at the network-wide
> level. we also supported payloads large enough for DNS or WWW queries in the
> synchronization phase, fixed the security problems around RST, expanded the
> option header space, and saved the window size during periods of connection
> quiescence, allowing back-to-back-to-back transmissions once cookies had been
> exchanged. the result was RFC 6013, which was entirely ignored by the people
> who brought us TCPFO, which has the same incompressible state as TCP, adds no
> security, and reduces only the problem of round trip costs. the other
> document besides RFC 6013 that may be of interest is here:
> 
> https://www.usenix.org/system/files/login/articles/126-metzger.pdf
> 
> metzger, simpson, and vixie (me) are all notoriously difficult to work with, 
> and this stems from correctable personality defects and unforced human
> protocol errors for which we should each be periodically upbraided. however,
> ignoring our work because we're somewhat irritating runs the risk of taking
> the internet itself down a blind alley from which a later return won't earn
> us thanks from the grandchildren.
> 
> ---
> 
> in summary, the network needs working fragmentation so that it can have a 
> future that isn't constrained by the physics of thickwire ten megabit 
> ethernets, and if the DNS community were willing to join the fight, it would 
> be a shorter fight. however, DNS, and UDP itself, is better off without 
> fragmentation, because of state mass and complexity costs, regardless of 
> whether we can solve fragmentation's integrity and substitution weaknesses.
> -- 
> Paul
> 
> 
> _______________________________________________
> DNSOP mailing list
> DNSOP@ietf.org
> https://www.ietf.org/mailman/listinfo/dnsop

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742              INTERNET: ma...@isc.org
