On Tue, Feb 20, 2018 at 02:32:04PM +0100, Jesper Dangaard Brouer wrote: > The virtio_net code have three different RX code-paths in receive_buf(). > Two of these code paths can handle XDP, but one of them is broken for > at least XDP_REDIRECT. > > Function(1): receive_big() does not support XDP. > Function(2): receive_small() support XDP fully and uses build_skb(). > Function(3): receive_mergeable() broken XDP_REDIRECT uses napi_alloc_skb(). > > The simple explanation is that receive_mergeable() is broken because > it uses napi_alloc_skb(), which violates XDP given XDP assumes packet > header+data in single page and enough tail room for skb_shared_info. > > The longer explaination is that receive_mergeable() tries to > work-around and satisfy these XDP requiresments e.g. by having a > function xdp_linearize_page() that allocates and memcpy RX buffers > around (in case packet is scattered across multiple rx buffers). This > does currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because > we have not implemented bpf_xdp_adjust_tail yet). > > The XDP_REDIRECT action combined with cpumap is broken, and cause hard > to debug crashes. The main issue is that the RX packet does not have > the needed tail-room (SKB_DATA_ALIGN(skb_shared_info)), causing > skb_shared_info to overlap the next packets head-room (in which cpumap > stores info). > > Reproducing depend on the packet payload length and if RX-buffer size > happened to have tail-room for skb_shared_info or not. But to make > this even harder to troubleshoot, the RX-buffer size is runtime > dynamically change based on an Exponentially Weighted Moving Average > (EWMA) over the packet length, when refilling RX rings. > > This patch only disable XDP_REDIRECT support in receive_mergeable() > case, because it can cause a real crash. > > IMHO we should consider NOT supporting XDP in receive_mergeable() at > all, because the principles behind XDP are to gain speed by (1) code > simplicity, (2) sacrificing memory and (3) where possible moving > runtime checks to setup time. These principles are clearly being > violated in receive_mergeable(), that e.g. runtime track average > buffer size to save memory consumption. > > In the longer run, we should consider introducing a separate receive > function when attaching an XDP program, and also change the memory > model to be compatible with XDP when attaching an XDP prog.
I agree with a separate function approach. So each buffer is tagged as xdp/non xdp, we check that and handle appropriately - where non xdp could be handled by the generic path. > Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT") > Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com> > --- > drivers/net/virtio_net.c | 7 ------- > 1 file changed, 7 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 626c27352ae2..0ca91942a884 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -677,7 +677,6 @@ static struct sk_buff *receive_mergeable(struct > net_device *dev, > struct bpf_prog *xdp_prog; > unsigned int truesize; > unsigned int headroom = mergeable_ctx_to_headroom(ctx); > - int err; > > head_skb = NULL; > > @@ -754,12 +753,6 @@ static struct sk_buff *receive_mergeable(struct > net_device *dev, > goto err_xdp; > rcu_read_unlock(); > goto xdp_xmit; > - case XDP_REDIRECT: > - err = xdp_do_redirect(dev, &xdp, xdp_prog); > - if (!err) > - *xdp_xmit = true; > - rcu_read_unlock(); > - goto xdp_xmit; > default: > bpf_warn_invalid_xdp_action(act); > case XDP_ABORTED: