2017-11-03 4:17 GMT+01:00 Willem de Bruijn <willemdebruijn.ker...@gmail.com>: > On Tue, Oct 31, 2017 at 9:41 PM, Björn Töpel <bjorn.to...@gmail.com> wrote: >> From: Björn Töpel <bjorn.to...@intel.com> >> >> This commits adds support for zerocopy mode. Note that zerocopy mode >> requires that the network interface has been bound to the socket using >> the bind syscall, and that the corresponding netdev implements the >> AF_PACKET V4 ndos. >> >> Signed-off-by: Björn Töpel <bjorn.to...@intel.com> >> --- >> + >> +static void packet_v4_disable_zerocopy(struct net_device *dev, >> + struct tp4_netdev_parms *zc) >> +{ >> + struct tp4_netdev_parms params; >> + >> + params = *zc; >> + params.command = TP4_DISABLE; >> + >> + (void)dev->netdev_ops->ndo_tp4_zerocopy(dev, ¶ms); > > Don't ignore error return codes. >
Will fix! >> +static int packet_v4_zerocopy(struct sock *sk, int qp) >> +{ >> + struct packet_sock *po = pkt_sk(sk); >> + struct socket *sock = sk->sk_socket; >> + struct tp4_netdev_parms *zc = NULL; >> + struct net_device *dev; >> + bool if_up; >> + int ret = 0; >> + >> + /* Currently, only RAW sockets are supported.*/ >> + if (sock->type != SOCK_RAW) >> + return -EINVAL; >> + >> + rtnl_lock(); >> + dev = packet_cached_dev_get(po); >> + >> + /* Socket needs to be bound to an interface. */ >> + if (!dev) { >> + rtnl_unlock(); >> + return -EISCONN; >> + } >> + >> + /* The device needs to have both the NDOs implemented. */ >> + if (!(dev->netdev_ops->ndo_tp4_zerocopy && >> + dev->netdev_ops->ndo_tp4_xmit)) { >> + ret = -EOPNOTSUPP; >> + goto out_unlock; >> + } > > Inconsistent error handling with above test. > Will fix. >> + >> + if (!(po->rx_ring.pg_vec && po->tx_ring.pg_vec)) { >> + ret = -EOPNOTSUPP; >> + goto out_unlock; >> + } > > A ring can be unmapped later with packet_set_ring. Should that operation > fail if zerocopy is enabled? After that, it can also change version with > PACKET_VERSION. > You're correct, I've missed this. I need to revisit the scenario when a ring is unmapped, and recreated. Thanks for pointing this out. >> + >> + if_up = dev->flags & IFF_UP; >> + zc = rtnl_dereference(po->zc); >> + >> + /* Disable */ >> + if (qp <= 0) { >> + if (!zc) >> + goto out_unlock; >> + >> + packet_v4_disable_zerocopy(dev, zc); >> + rcu_assign_pointer(po->zc, NULL); >> + >> + if (if_up) { >> + spin_lock(&po->bind_lock); >> + register_prot_hook(sk); >> + spin_unlock(&po->bind_lock); >> + } > > There have been a bunch of race conditions in this bind code. We need > to be very careful with adding more states to the locking, especially when > open coding in multiple locations, as this patch does. I counted at least > four bind locations. See for instance also > http://patchwork.ozlabs.org/patch/813945/ > Yeah, the locking schemes in AF_PACKET is pretty convoluted. I'll document and make the locking more explicit (and avoiding open coding it). > >> + >> + goto out_unlock; >> + } >> + >> + /* Enable */ >> + if (!zc) { >> + zc = kzalloc(sizeof(*zc), GFP_KERNEL); >> + if (!zc) { >> + ret = -ENOMEM; >> + goto out_unlock; >> + } >> + } >> + >> + if (zc->queue_pair >= 0) >> + packet_v4_disable_zerocopy(dev, zc); > > This calls disable even if zc was freshly allocated. > Shoud be > 0? > Good catch. It should be > 0. >> static int packet_release(struct socket *sock) >> { >> + struct tp4_netdev_parms *zc; >> struct sock *sk = sock->sk; >> + struct net_device *dev; >> struct packet_sock *po; >> struct packet_fanout *f; >> struct net *net; >> @@ -3337,6 +3541,20 @@ static int packet_release(struct socket *sock) >> sock_prot_inuse_add(net, sk->sk_prot, -1); >> preempt_enable(); >> >> + rtnl_lock(); >> + zc = rtnl_dereference(po->zc); >> + dev = packet_cached_dev_get(po); >> + if (zc && dev) >> + packet_v4_disable_zerocopy(dev, zc); >> + if (dev) >> + dev_put(dev); >> + rtnl_unlock(); >> + >> + if (zc) { >> + synchronize_rcu(); >> + kfree(zc); >> + } > > Please use a helper function for anything this complex. Will fix. Thanks, Björn