On Tue, 2017-08-15 at 02:45 +0200, Paweł Staszewski wrote:
> On 2017-08-14 at 18:57, Paolo Abeni wrote:
> > On Mon, 2017-08-14 at 18:19 +0200, Jesper Dangaard Brouer wrote:
> >> The output (extracted below) didn't show who called 'do_raw_spin_lock',
> >> BUT it showed another interesting thing. The kernel code
> >> __dev_queue_xmit() might create a route dst-cache problem for itself(?),
> >> as it will first call skb_dst_force() and then skb_dst_drop() when the
> >> packet is transmitted on a VLAN.
> >>
> >> static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
> >> {
> >> [...]
> >> 	/* If device/qdisc don't need skb->dst, release it right now while
> >> 	 * its hot in this cpu cache.
> >> 	 */
> >> 	if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
> >> 		skb_dst_drop(skb);
> >> 	else
> >> 		skb_dst_force(skb);
> > I think that the high impact of the above code in this specific test is
> > mostly due to the following:
> >
> > - ingress packets with different RSS rx hashes land on different CPUs
> yes, but isn't this normal?
> everybody who wants to balance load over cores will try to use as many
> as possible :)
> With some limitations ... best are 6 to 7 RSS queues - so we need to use
> 6 to 7 CPU cores
>
> > - but they use the same dst entry, since the destination IPs belong to
> > the same subnet
> typical for DDoS - many sources, one destination
Nobody has hit this issue yet; we usually shape kernel changes around typical workloads. In this case, we might need a per-CPU nh_rth_input, or try to hack the IFF_XMIT_DST_RELEASE flag on the VLAN netdev.