On Tue, May 15, 2018 at 02:08:09PM -0700, Eric Dumazet wrote:
>
>
> On 05/15/2018 12:31 PM, Flavio Leitner wrote:
> > Hi,
> >
> > There is a significant throughput issue (~50% drop) for a single TCP
> > stream when the skb is scrubbed and XPS is enabled.
> >
> > If I turn CONFIG_XPS off, then the issue never happens and the test
> > reaches line rate. The same happens if I echo 0 to tx-*/xps_cpus.
> >
> > It looks like that when the skb is scrubbed, there is no more reference
> > to the struct sock,
>
> And this is really the problem here, since it breaks back pressure (and TCP
> Small queues)
>
> I am not sure why skb_orphan() is used in this scrubbing really.
>

veth originally called skb_orphan() in veth_xmit(), most probably
because there was no TX completion. Then the code got generalized to
dev_forward_skb() and later on moved to skb_scrub_packet().

The issue is that we call skb_scrub_packet() on both the TX and RX
paths, and that is done while crossing netns. It doesn't look correct
to keep the ->sk, because I suspect that iptables/selinux/bpf, or some
code path that I am probably missing, could expose or use the wrong
->sk, for example.

However, netdev_pick_tx() can't store the queue mapping without ->sk.
The hack in the first email relies on the headers (skb_tx_hash) to
always select the same TX queue, which solves the original problem but
not the TCP small queues issue you mentioned.

--
Flavio