On Tue, Sep 5, 2017 at 4:14 AM, Hannes Frederic Sowa <han...@stressinduktion.org> wrote:
> Tom Herbert <t...@herbertland.com> writes:
>
>> There is absolutely no requirement in IP that packets are delivered in
>> order-- there never has been and there never will be! If the ULP, like
>> Ethernet encapsulation, requires in order deliver then it needs to
>> implement that itself like TCP, GRE, and other protocols ensure that
>> with sequence numbers and reassembly. All of these hoops we do make
>> sure that packets always follow the same path and are always in order
>> are done for benefit of middlebox devices like stateful firewalls that
>> have force us to adopt their concept of network architecture-- in the
>> long run this is self-defeating and kills our ability to innovate.
>>
>> I'm not saying that we shouldn't consider legacy devices, but we
>> should scrutinize new development or solutions that perpetuate
>> incorrect design or bad assumptions.
>
> So configure RSS per port and ensure no fragments are send to those
> ports. This is possible and rather easy to do. It solves the problem
> with legacy software and it spreads out packets for your applications.
>
> It is not perfect but it is working and solves both problems.
>

Hannes,
I don't see how that solves anything. The purpose of RSS is to distribute the load of protocol packets across queues. This needs to work for UDP applications. For instance, if I were building a QUIC server I'd want the same sort of flow distribution that a TCP server would get. You can't do that by configuring a few ports in the device.

If I were to suggest any HW change it would be to not do DPI on fragments (MF or offset is set). This ensures that all packets of the fragment train get hashed to the same queue and is in fact what RPS has been doing for years without any complaints.

But even before I'd make that recommendation, I'd really like to understand what the problem actually is. The only thing I can garner from this discussion and the Intel patch is that when fragments are received OOO that is perceived as a problem. But the protocol specification clearly says this is not a problem. So the questions are: who is negatively affected by this? Is this a problem because some artificial test that checks for everything to be in order is now failing? Is this affecting real users? Is this an issue in the stack or really with some implementation outside of the stack? If it is an implementation outside of the stack, then are we just bandaid'ing over someone else's incorrect implementation by patching the kernel (like would have been the case if we had changed the kernel to interoperate with Facebook's switch that couldn't handle OOO in twstate)?

Thanks,
Tom

> Bye,
> Hannes
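
For illustration, the fragment handling suggested above (skip L4 parsing whenever MF or a nonzero fragment offset is set) could be sketched roughly as follows. This is a minimal sketch, not the kernel's RPS code or any NIC's RSS implementation: struct pkt_meta, mix(), flow_hash() and pick_queue() are hypothetical names, and the toy mixing function merely stands in for a real hash such as Toeplitz.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the header fields a parser extracts. */
struct pkt_meta {
    uint32_t saddr;    /* IPv4 source address */
    uint32_t daddr;    /* IPv4 destination address */
    uint16_t frag_off; /* IPv4 flags + fragment offset, host byte order */
    uint16_t sport;    /* L4 source port; only reliable in unfragmented packets */
    uint16_t dport;    /* L4 destination port; same caveat */
};

#define SKETCH_IP_MF     0x2000u /* "more fragments" flag */
#define SKETCH_IP_OFFSET 0x1FFFu /* 13-bit fragment offset */

/* A packet belongs to a fragment train if MF is set or the offset is nonzero. */
static bool is_fragment(const struct pkt_meta *p)
{
    return (p->frag_off & (SKETCH_IP_MF | SKETCH_IP_OFFSET)) != 0;
}

/* Toy mixing step (assumption: stands in for the device's real hash). */
static uint32_t mix(uint32_t h, uint32_t v)
{
    h ^= v;
    h *= 0x9E3779B1u;
    h ^= h >> 15;
    return h;
}

/*
 * Hash on addresses only for fragments, so that every packet of a
 * fragment train yields the same value; include the ports otherwise.
 */
static uint32_t flow_hash(const struct pkt_meta *p)
{
    uint32_t h = mix(0, p->saddr);

    h = mix(h, p->daddr);
    if (!is_fragment(p))
        h = mix(h, ((uint32_t)p->sport << 16) | p->dport);
    return h;
}

/* Map the hash onto one of num_queues receive queues. */
static unsigned int pick_queue(const struct pkt_meta *p, unsigned int num_queues)
{
    assert(num_queues > 0);
    return flow_hash(p) % num_queues;
}
```

With this, the first fragment (MF set, ports visible) and every later fragment (ports absent) of a datagram land on the same queue, while unfragmented UDP flows are still spread across queues by their ports.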