On Fri, Jun 28, 2019 at 10:53 AM Neil Horman <nhor...@tuxdriver.com> wrote:
>
> The AF_PACKET protocol, when running as a memory mapped socket uses a
> pending frame counter to track the number of skbs in flight during
> transmission.  It is incremented during the sendmsg call (via
> tpacket_snd), and decremented (possibly asynchronously) in the skb
> desructor during skb_free.
>
> The counter is currently implemented as a percpu variable for each open
> socket, but for reads (via packet_read_pending), we iterate over every
> cpu variable, accumulating the total pending count.
>
> Given that the socket transmit path is an exclusive path (locked via the
> pg_vec_lock mutex), we do not have the ability to increment this counter
> on multiple cpus in parallel.  This implementation also seems to have
> the potential to be broken, in that, should an skb be freed on a cpu
> other than the one that it was initially transmitted on, we may
> decrement a counter that was not initially incremented, leading to
> underflow.
>
> As such, adjust the packet socket struct to convert the per-cpu counter
> to an atomic_t variable (to enforce consistency between the send path
> and the skb free path).  This saves us some space in the packet_sock
> structure, prevents the possibility of underflow, and should reduce the
> run time of packet_read_pending, as we only need to read a single
> variable, instead of having to loop over every available cpu variable
> instance.
>
> Tested by myself by running a small program which sends frames via
> AF_PACKET on multiple cpus in parallel, with good results.
>
> Signed-off-by: Neil Horman <nhor...@tuxdriver.com>
> CC: "David S. Miller" <da...@davemloft.net>
> CC: Willem de Bruijn <will...@google.com>
> ---

This essentially is a revert of commit b013840810c2 ("packet: use
percpu mmap tx frame pending refcount"). That has some benchmark
numbers and also discusses the overflow issue.

I think more interesting would be to eschew the counter when
MSG_DONTWAIT, as it is only used to know when to exit the loop if
need_wait.

But, IMHO packet sockets are deprecated in favor of AF_XDP and
should be limited to bug fixes at this point.

>  net/packet/af_packet.c | 40 +++++-----------------------------------
>  net/packet/internal.h  |  2 +-
>  2 files changed, 6 insertions(+), 36 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 8d54f3047768..25ffb486fac9 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1154,43 +1154,17 @@ static void packet_increment_head(struct 
> packet_ring_buffer *buff)
>
>  static void packet_inc_pending(struct packet_ring_buffer *rb)
>  {
> -       this_cpu_inc(*rb->pending_refcnt);
> +       atomic_inc(&rb->pending_refcnt);
>  }

If making this change, can also remove these helper functions. The
layer of indirection just hinders readability.

Reply via email to