On Wed, Mar 9, 2016 at 11:07 AM, Brian E Carpenter <[email protected]> wrote:
> On 10/03/2016 06:12, Elwyn Davies wrote:
>> I am the assigned Gen-ART reviewer for this draft.
>
> And I am another Gen-ART reviewer who saw this go by:
>
> ...
>> Minor issues: Treatment of packets that don't fit into the hashing
>> classification scheme: The default FQ-CoDel hashing mechanism uses the
>> protocol/addresses/ports 5-tuple, but there will be packets that don't
>> fit this scheme (especially ICMP). There is no mention of what the
>> classification would do with these packets. I guess that one extra
>> queue would probably suffice to hold any such outliers, but it would
>> be wise to say something about how the packets from this/these
>> queue(s) would be treated by the scheduler. It might also be useful to
>> say something about treatment of outliers in other classification
>> schemes, if only to say that the scheme used needs to think about any
>> such outliers.
>
> For IPv6 there is another issue here: the well-known difficulty in
> finding the protocol and port numbers at line speed when extension
> headers are present. Which is of course why IPv6 senders are supposed
> to set the flow label, which should in turn provide a handy 3-tuple
> (addresses/flow label) that would be ideal for CoDel. The operational
> facts today are that most hosts don't set the flow label and many paths
> through the Internet drop packets with extension headers, but the fact
> that the 5-tuple is problematic in this way might be worth mentioning.
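To make the outlier and 3-tuple points concrete, here is a minimal sketch (plain Python with CRC32, not the kernel's actual Jenkins-hash flow dissector; `queue_for_packet` and `NUM_QUEUES` are hypothetical names) of how a classifier could prefer the flow-label 3-tuple when present, fall back to the 5-tuple, and still hash portless packets like ICMP on the fields they do have, rather than needing a dedicated outlier queue:

```python
import zlib

NUM_QUEUES = 1024  # fq_codel's default number of flow queues

def queue_for_packet(src, dst, proto=None, sport=None, dport=None,
                     flow_label=None):
    """Map a packet to one of NUM_QUEUES flow queues.

    Prefers the IPv6 3-tuple (addresses + flow label) when a non-zero
    flow label is present; otherwise hashes the 5-tuple.  Packets with
    no ports (e.g. ICMP) hash on addresses and protocol alone, so they
    share a per-address-pair queue instead of falling outside the
    scheme.  Sketch only: Linux uses a seeded Jenkins hash, not CRC32.
    """
    if flow_label:
        key = f"{src}|{dst}|{flow_label}"
    else:
        key = f"{src}|{dst}|{proto}|{sport}|{dport}"
    return zlib.crc32(key.encode()) % NUM_QUEUES
```

The point of the fallback is that "doesn't fit the 5-tuple" degrades to a coarser key, not to undefined behavior.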
I tend to think operational issues with the flow label are going to continue to prevent it from being used widely, leaving those 20 bits essentially unused for eternity. (I would have liked a convenient, standardized place to stick a VPN CTR seqno, actually... /me ducks)

Two more common scenarios are dealing sanely with VLANs (and their 802.1Q priorities), and MPLS. We don't talk about these because support has only recently landed in the new hashing API in Linux...

As for IPv6's potential set of headers to decode, the Linux implementation does try very hard to get to the last header. (The first BSD implementation did not, as best I recall.)

But: in part the 5-tuple requirement is driven by IPv4 NAT, where the source IP is static on the gateway and the ports change. In the IPv6 case a source box often has several IPv6 addresses to choose from (and uses them), making even a 3-tuple on IPv6 more effective, no matter the protocol.

I tried for much of the past two years to get public in-FPGA hardware development of fq_codel-derived algorithms going, even helping fund https://www.kickstarter.com/projects/onetswitch/onetswitch-open-source-hardware-for-networking (site: http://www.meshsr.com/ ). It's a really nice board, and I have given away a bunch of them... only to realize that, with the present state of the art in FPGA design tools, the existing public IP, and very scarce hardware engineers, it would be more difficult than I'd imagined, even though DRR and AQM implementations have been produced as part of http://netfpga.org/ 's work. I retain hope that, perhaps driven by the publication of this draft, a public hardware implementation will appear from some energized students.

Timestamping in hardware: easy. Hashing: also easy, though it delays cut-through switching a bit. The inherent parallelism of having these fq_codel queues might make next-hop caches "hotter" and searchable in parallel...
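For the "get to the last header" point above, here is a hedged sketch of what walking an IPv6 extension-header chain involves when the flow label is unset (layout per RFC 2460; `find_transport` is a hypothetical helper, not the Linux flow dissector, and a real parser must also handle fragment headers, AH's 4-octet length units, and truncated packets):

```python
# Minimal IPv6 extension-header walk: each handled extension header
# starts with a next-header byte, then a length byte counting 8-octet
# units beyond the first 8 octets (RFC 2460 common layout).

EXT_HEADERS = {0, 43, 60}   # hop-by-hop, routing, destination options
TCP, UDP = 6, 17

def find_transport(next_header, payload):
    """Return (protocol, src_port, dst_port); ports are None if absent."""
    offset = 0
    while next_header in EXT_HEADERS:
        nh = payload[offset]
        hdr_len = (payload[offset + 1] + 1) * 8
        next_header, offset = nh, offset + hdr_len
    if next_header in (TCP, UDP):
        sport = int.from_bytes(payload[offset:offset + 2], "big")
        dport = int.from_bytes(payload[offset + 2:offset + 4], "big")
        return next_header, sport, dport
    return next_header, None, None  # e.g. ICMPv6 (58): no ports to hash
```

Every hop in that loop is another dependent memory read, which is exactly why doing this at line rate in hardware is painful and why a pre-computed flow label is attractive.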
And while I know 10Gbit is achievable (achieved), at 100Gbit I'll be damned if I know how we're going to do things at ~6.7ns per packet.

Given that we can run things with a shaper, with cake, in software today at a gigabit, it has generally been my hope that hardware implementations would appear attempting to make things better at sub-1Gbit speeds, mitigated by the fact that the slower you go, the more problems you solve with fq_codel, and the lower the CPU overhead to shape the traffic, to the point where even the cheapest hardware you can buy handles up to 60Mbit handily. The biggest overhead we have is not any of this stuff, but software rate shaping. I'd love to see hardware that can be programmed to transmit at a non-line rate on outbound. Certain 40Gbit cards (Mellanox) already have this ability...

In terms of overheads on software implementations, fq_codel and cake have been pushed to 40Gbit on big packets on high-end Intel boxes. There are a lot of problems on the rx side of the Linux path, however... http://www.netdevconf.org/1.1/proceedings/slides/ has tons of talk about it, specifically http://www.netdevconf.org/1.1/proceedings/slides/dangaard-network-performance.pdf which was also filmed somewhere.

Anyway, going back to revising the draft: if something like my third paragraph above needs to be incorporated, ok. We could mention the flow label as a potential thing to hash on, if you want.

> Brian

_______________________________________________
Gen-art mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/gen-art
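(Postscript: the per-packet budget quoted above is simple arithmetic, sketched here under the usual assumption of minimum-size 64-byte Ethernet frames plus 20 bytes of preamble and inter-frame gap, i.e. 672 bits on the wire per frame.)

```python
# Nanoseconds available to process each minimum-size Ethernet frame
# at a given line rate.  84 bytes on the wire = 672 bits; dividing
# bits by Gbit/s yields nanoseconds directly.

WIRE_BITS = 84 * 8  # 672 bits per minimum-size frame

def ns_per_packet(gbits):
    """Per-packet time budget in nanoseconds at `gbits` Gbit/s."""
    return WIRE_BITS / gbits

for rate in (1, 10, 40, 100):
    print(f"{rate:>3} Gbit/s: {ns_per_packet(rate):7.2f} ns per packet")
```

At 100 Gbit/s that works out to about 6.7ns per minimum-size packet, which is the budget the hardware discussion above is up against.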
