On Mon, Jan 14, 2019 at 2:12 PM Martijn van Oosterhout <klep...@svana.org> wrote:
>
> Hi netdev,
>
> We're running into an issue where incoming traffic for Suricata is not
> being distributed across the workers despite AF_PACKET with fanout
> being used, and it appears to be a kernel issue. Below is a description
> of the problem and possible solution.
>
> Seen on version kernel 4.19, but the code on 4.20 seem largely
> unchanged.
>
> When a packet needs to be distributed by fanout it calls
> net/packet/af_packet.c:fanout_demux_hash which in turns calls
> net/core/flow_dissector.c:__skb_get_hash_symmetric which in turn calls
> net/core/flow_dissector.c:__skb_flow_dissect. However, if you look at
> the code that parses MPLS traffic it looks like so:
>
> --- snip ---
> net/core/flow_dissector.c:1023
>     case htons(ETH_P_MPLS_UC):
>     case htons(ETH_P_MPLS_MC):
>         fdret = __skb_flow_dissect_mpls(skb, flow_dissector,
>                                         target_container, data,
>                                         nhoff, hlen);
>         break;
> --- snip ---
>
> What's going on here is that the dissector goes to extract the MPLS
> flow information and then stops (it returns either GOOD or BAD here).
> However because flow_keys_dissector_symmetric does not include
> FLOW_DISSECTOR_KEY_MPLS no information is extracted at all, with the
> result that the hash is always the same for every packet.
>
> I see a two ways this could be fixed.
>
> Option 1: include FLOW_DISSECTOR_KEY_MPLS in
> flow_keys_dissector_symmetric but that seems a big assumption, we don't
> do that for VLANs for example.
This sounds fine to me. Though it will require extra work to make
__skb_get_hash_symmetric actually use the entropy. And in practice it's
not clear that this will result in much entropy.

> Option 2: Teach the dissector to, in the case where there is an MPLS
> header that is not for entropy, to skip the MPLS header(s) and continue
> the dissection on the IP headers that come after the MPLS header.
>
> I think option 2 seems to me the right approach, however the dissector
> (AFAICT) is used extensively from many places in the kernel so I'd like
> some confirmation before spending too much time on it. It seems like it
> could lead to an unexpected performance impact on systems using MPLS.
> Or perhaps there is something else going on I missed.
>
> And there is actually another problem: MPLS provides no information
> about the next header because it assumes the endpoints in the network
> recognise the MPLS headers. Which means you'd have to make a guess
> about what the next layer should be.

This is the real issue. I don't think this can be done in general
purpose code. The new BPF flow dissector, however, does allow you to
deploy a custom dissector in environments where the inner protocol is
known.

https://lwn.net/Articles/764200/
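
To make the "guess" concrete, here is a rough user-space sketch of what
a site-specific dissector would have to do: walk the label stack to the
bottom-of-stack bit, then peek at the first nibble of the payload. This
is plain C for illustration only, not kernel code and not a BPF program.
The names (mpls_guess_inner, MAX_LABELS, enum inner_guess) are made up
for this example, and the nibble test is only a heuristic that holds if
you already know the payload is IP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MPLS_HLEN     4     /* one label stack entry is 4 bytes */
#define MPLS_BOS_MASK 0x01  /* bottom-of-stack bit: lowest bit of byte 2 */
#define MAX_LABELS    8     /* arbitrary cap chosen for this sketch */

/* Hypothetical result type; the values are only for readability. */
enum inner_guess { INNER_UNKNOWN = 0, INNER_IPV4 = 4, INNER_IPV6 = 6 };

/*
 * data points at the first MPLS label stack entry, len is the number of
 * bytes available. When the bottom of the stack is found and at least
 * one payload byte follows, *payload_off is set to the payload offset
 * and the return value is a guess based on the IP version nibble.
 */
static enum inner_guess mpls_guess_inner(const uint8_t *data, size_t len,
                                         size_t *payload_off)
{
    size_t off = 0;

    assert(data != NULL && payload_off != NULL);

    for (int i = 0; i < MAX_LABELS; i++) {
        if (len - off < MPLS_HLEN)
            return INNER_UNKNOWN;       /* truncated label stack */

        int bos = data[off + 2] & MPLS_BOS_MASK;
        off += MPLS_HLEN;
        if (!bos)
            continue;                   /* more labels follow */

        if (off >= len)
            return INNER_UNKNOWN;       /* nothing after the stack */

        *payload_off = off;
        switch (data[off] >> 4) {       /* MPLS carries no next-protocol field */
        case 4:
            return INNER_IPV4;
        case 6:
            return INNER_IPV6;
        default:
            return INNER_UNKNOWN;       /* e.g. a pseudowire payload */
        }
    }
    return INNER_UNKNOWN;               /* stack deeper than MAX_LABELS */
}
```

A caller would pass the bytes that follow the Ethernet header for
ETH_P_MPLS_UC or ETH_P_MPLS_MC frames and, on an IPv4 or IPv6 result,
continue dissecting from *payload_off. Whether that guess is safe
depends entirely on what the network actually carries over MPLS, which
is why it seems better suited to a per-site dissector than to the
generic one.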