2017-11-03 4:17 GMT+01:00 Willem de Bruijn <willemdebruijn.ker...@gmail.com>:
> On Tue, Oct 31, 2017 at 9:41 PM, Björn Töpel <bjorn.to...@gmail.com> wrote:
>> From: Björn Töpel <bjorn.to...@intel.com>
>>
>> This commits adds support for zerocopy mode. Note that zerocopy mode
>> requires that the network interface has been bound to the socket using
>> the bind syscall, and that the corresponding netdev implements the
>> AF_PACKET V4 ndos.
>>
>> Signed-off-by: Björn Töpel <bjorn.to...@intel.com>
>> ---
>> +
>> +static void packet_v4_disable_zerocopy(struct net_device *dev,
>> +                                      struct tp4_netdev_parms *zc)
>> +{
>> +       struct tp4_netdev_parms params;
>> +
>> +       params = *zc;
>> +       params.command  = TP4_DISABLE;
>> +
>> +       (void)dev->netdev_ops->ndo_tp4_zerocopy(dev, &params);
>
> Don't ignore error return codes.
>

Will fix!

>> +static int packet_v4_zerocopy(struct sock *sk, int qp)
>> +{
>> +       struct packet_sock *po = pkt_sk(sk);
>> +       struct socket *sock = sk->sk_socket;
>> +       struct tp4_netdev_parms *zc = NULL;
>> +       struct net_device *dev;
>> +       bool if_up;
>> +       int ret = 0;
>> +
>> +       /* Currently, only RAW sockets are supported.*/
>> +       if (sock->type != SOCK_RAW)
>> +               return -EINVAL;
>> +
>> +       rtnl_lock();
>> +       dev = packet_cached_dev_get(po);
>> +
>> +       /* Socket needs to be bound to an interface. */
>> +       if (!dev) {
>> +               rtnl_unlock();
>> +               return -EISCONN;
>> +       }
>> +
>> +       /* The device needs to have both the NDOs implemented. */
>> +       if (!(dev->netdev_ops->ndo_tp4_zerocopy &&
>> +             dev->netdev_ops->ndo_tp4_xmit)) {
>> +               ret = -EOPNOTSUPP;
>> +               goto out_unlock;
>> +       }
>
> Inconsistent error handling with above test.
>

Will fix.

>> +
>> +       if (!(po->rx_ring.pg_vec && po->tx_ring.pg_vec)) {
>> +               ret = -EOPNOTSUPP;
>> +               goto out_unlock;
>> +       }
>
> A ring can be unmapped later with packet_set_ring. Should that operation
> fail if zerocopy is enabled? After that, it can also change version with
> PACKET_VERSION.
>

You're correct, I've missed this. I need to revisit the scenario when
a ring is unmapped, and recreated. Thanks for pointing this out.

>> +
>> +       if_up = dev->flags & IFF_UP;
>> +       zc = rtnl_dereference(po->zc);
>> +
>> +       /* Disable */
>> +       if (qp <= 0) {
>> +               if (!zc)
>> +                       goto out_unlock;
>> +
>> +               packet_v4_disable_zerocopy(dev, zc);
>> +               rcu_assign_pointer(po->zc, NULL);
>> +
>> +               if (if_up) {
>> +                       spin_lock(&po->bind_lock);
>> +                       register_prot_hook(sk);
>> +                       spin_unlock(&po->bind_lock);
>> +               }
>
> There have been a bunch of race conditions in this bind code. We need
> to be very careful with adding more states to the locking, especially when
> open coding in multiple locations, as this patch does. I counted at least
> four bind locations. See for instance also
> http://patchwork.ozlabs.org/patch/813945/
>

Yeah, the locking schemes in AF_PACKET is pretty convoluted. I'll
document and make the locking more explicit (and avoiding open coding
it).

>
>> +
>> +               goto out_unlock;
>> +       }
>> +
>> +       /* Enable */
>> +       if (!zc) {
>> +               zc = kzalloc(sizeof(*zc), GFP_KERNEL);
>> +               if (!zc) {
>> +                       ret = -ENOMEM;
>> +                       goto out_unlock;
>> +               }
>> +       }
>> +
>> +       if (zc->queue_pair >= 0)
>> +               packet_v4_disable_zerocopy(dev, zc);
>
> This calls disable even if zc was freshly allocated.
> Shoud be > 0?
>

Good catch. It should be > 0.

>>  static int packet_release(struct socket *sock)
>>  {
>> +       struct tp4_netdev_parms *zc;
>>         struct sock *sk = sock->sk;
>> +       struct net_device *dev;
>>         struct packet_sock *po;
>>         struct packet_fanout *f;
>>         struct net *net;
>> @@ -3337,6 +3541,20 @@ static int packet_release(struct socket *sock)
>>         sock_prot_inuse_add(net, sk->sk_prot, -1);
>>         preempt_enable();
>>
>> +       rtnl_lock();
>> +       zc = rtnl_dereference(po->zc);
>> +       dev = packet_cached_dev_get(po);
>> +       if (zc && dev)
>> +               packet_v4_disable_zerocopy(dev, zc);
>> +       if (dev)
>> +               dev_put(dev);
>> +       rtnl_unlock();
>> +
>> +       if (zc) {
>> +               synchronize_rcu();
>> +               kfree(zc);
>> +       }
>
> Please use a helper function for anything this complex.

Will fix.


Thanks,
Björn

Reply via email to