On 2018/07/24 10:02, Jakub Kicinski wrote:
> On Mon, 23 Jul 2018 00:13:05 +0900, Toshiaki Makita wrote:
>> From: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp>
>>
>> This allows NIC's XDP to redirect packets to veth. The destination veth
>> device enqueues redirected packets to the napi ring of its peer, then
>> they are processed by XDP on its peer veth device.
>> This can be thought of as calling another XDP program by XDP program using
>> REDIRECT, when the peer enables driver XDP.
>>
>> Note that when the peer veth device does not set driver xdp, redirected
>> packets will be dropped because the peer is not ready for NAPI.
>
> Often we can't redirect to devices which don't have an XDP program
> installed.  In your case we can't redirect unless the peer of the
> target has a program installed? :(

Right. I tried to avoid this case by converting xdp_frames to skb but
realized that should not be done.
https://patchwork.ozlabs.org/patch/903536/

> Perhaps it is time to reconsider what Saeed once asked for, a flag or
> attribute to enable being the destination of an XDP_REDIRECT.

Yes, something will be necessary.
Jesper said Tariq had some ideas to implement it.

>
>> v2:
>> - Drop the part converting xdp_frame into skb when XDP is not enabled.
>> - Implement bulk interface of ndo_xdp_xmit.
>> - Implement XDP_XMIT_FLUSH bit and drop ndo_xdp_flush.
>>
>> Signed-off-by: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp>
>> ---
>>  drivers/net/veth.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 45 insertions(+)
>>
>> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
>> index 4be75c58bc6a..57187e955fea 100644
>> --- a/drivers/net/veth.c
>> +++ b/drivers/net/veth.c
>> @@ -17,6 +17,7 @@
>>  #include <net/rtnetlink.h>
>>  #include <net/dst.h>
>>  #include <net/xfrm.h>
>> +#include <net/xdp.h>
>>  #include <linux/veth.h>
>>  #include <linux/module.h>
>>  #include <linux/bpf.h>
>> @@ -125,6 +126,11 @@ static void *veth_ptr_to_xdp(void *ptr)
>>  	return (void *)((unsigned long)ptr & ~VETH_XDP_FLAG);
>>  }
>>
>> +static void *veth_xdp_to_ptr(void *ptr)
>> +{
>> +	return (void *)((unsigned long)ptr | VETH_XDP_FLAG);
>> +}
>> +
>>  static void veth_ptr_free(void *ptr)
>>  {
>>  	if (veth_is_xdp_frame(ptr))
>> @@ -267,6 +273,44 @@ static struct sk_buff *veth_build_skb(void *head, int headroom, int len,
>>  	return skb;
>>  }
>>
>> +static int veth_xdp_xmit(struct net_device *dev, int n,
>> +			 struct xdp_frame **frames, u32 flags)
>> +{
>> +	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
>> +	struct net_device *rcv;
>> +	int i, drops = 0;
>> +
>> +	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
>> +		return -EINVAL;
>> +
>> +	rcv = rcu_dereference(priv->peer);
>> +	if (unlikely(!rcv))
>> +		return -ENXIO;
>> +
>> +	rcv_priv = netdev_priv(rcv);
>> +	/* xdp_ring is initialized on receive side? */
>> +	if (!rcu_access_pointer(rcv_priv->xdp_prog))
>> +		return -ENXIO;
>> +
>> +	spin_lock(&rcv_priv->xdp_ring.producer_lock);
>> +	for (i = 0; i < n; i++) {
>> +		struct xdp_frame *frame = frames[i];
>> +		void *ptr = veth_xdp_to_ptr(frame);
>> +
>> +		if (unlikely(xdp_ok_fwd_dev(rcv, frame->len) ||
>> +			     __ptr_ring_produce(&rcv_priv->xdp_ring, ptr))) {
>
> Would you mind sparing a few more words on how this is safe vs the
> .ndo_close() on the peer?  Personally I'm a bit uncomfortable with the
> IFF_UP check in xdp_ok_fwd_dev(), I'm not sure what's supposed to
> guarantee the device doesn't go down right after that check, or is
> already down, but netdev->flags are not atomic...
>
>> +			xdp_return_frame_rx_napi(frame);
>> +			drops++;
>> +		}
>> +	}
>> +	spin_unlock(&rcv_priv->xdp_ring.producer_lock);
>> +
>> +	if (flags & XDP_XMIT_FLUSH)
>> +		__veth_xdp_flush(rcv_priv);
>> +
>> +	return n - drops;
>> +}
>> +
>>  static struct sk_buff *veth_xdp_rcv_one(struct veth_priv *priv,
>>  					struct xdp_frame *frame)
>>  {
>> @@ -760,6 +804,7 @@ static const struct net_device_ops veth_netdev_ops = {
>>  	.ndo_features_check	= passthru_features_check,
>>  	.ndo_set_rx_headroom	= veth_set_rx_headroom,
>>  	.ndo_bpf		= veth_xdp,
>> +	.ndo_xdp_xmit		= veth_xdp_xmit,
>>  };
>>
>>  #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \
>
>
>

-- 
Toshiaki Makita
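
For readers following the veth_xdp_to_ptr() and veth_xdp_xmit() hunks
quoted above, here is a minimal userspace sketch of the two ideas they
rely on: tagging a pointer's low bit so ring entries can be told apart,
and returning "n - drops" from a bulk enqueue. It is not the kernel code.
All demo_* names, the ring size, and the choice of bit 0 for the flag are
assumptions made for illustration; the real VETH_XDP_FLAG value and the
ptr_ring implementation are not shown in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the tag lives in bit 0, which is free because the
 * frame structure below is at least 2-byte aligned. */
#define DEMO_XDP_FLAG  ((uintptr_t)1)
#define DEMO_RING_SIZE 4   /* hypothetical capacity */

struct demo_frame {
	int len;
};

/* Stand-in for a ptr_ring: a bounded array of tagged pointers. */
struct demo_ring {
	void *slots[DEMO_RING_SIZE];
	size_t count;
};

/* Set the tag bit, mirroring the shape of veth_xdp_to_ptr(). */
static void *demo_xdp_to_ptr(void *ptr)
{
	return (void *)((uintptr_t)ptr | DEMO_XDP_FLAG);
}

/* Test the tag bit. */
static int demo_is_xdp_frame(const void *ptr)
{
	return ((uintptr_t)ptr & DEMO_XDP_FLAG) != 0;
}

/* Clear the tag bit, mirroring the shape of veth_ptr_to_xdp(). */
static void *demo_ptr_to_xdp(void *ptr)
{
	return (void *)((uintptr_t)ptr & ~DEMO_XDP_FLAG);
}

/* Returns 0 on success and nonzero when the ring is full. */
static int demo_ring_produce(struct demo_ring *ring, void *ptr)
{
	if (ring->count == DEMO_RING_SIZE)
		return -1;
	ring->slots[ring->count++] = ptr;
	return 0;
}

/* Bulk enqueue: every frame that cannot be queued counts as a drop,
 * and the caller is told how many frames were accepted. */
static int demo_xmit(struct demo_ring *ring, int n,
		     struct demo_frame **frames)
{
	int i, drops = 0;

	for (i = 0; i < n; i++) {
		void *ptr = demo_xdp_to_ptr(frames[i]);

		if (demo_ring_produce(ring, ptr))
			drops++;
	}
	return n - drops;
}
```

With a ring of four slots, handing demo_xmit() six frames queues the
first four and reports four accepted; each queued entry carries the tag
and maps back to the original frame address via demo_ptr_to_xdp().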