On 17.05.2022 18:29, Pavel Tikhomirov wrote:
We can see a hang for 40 sec in (vz7.185.3) in vhost_net_set_backend:

crash> ps -m | grep UN
[0 00:00:40.157] [UN]  PID: 38937  TASK: ffff9b7f54ca8000  CPU: 2   COMMAND: 
"qemu-kvm"

[<ffffffffc09c04a5>] vhost_net_ubuf_put_and_wait+0x65/0xb0 [vhost_net]
[<ffffffffc09c2451>] vhost_net_ioctl+0x661/0x890 [vhost_net]
[<ffffffff8fc89d80>] do_vfs_ioctl+0x3b0/0x5b0
[<ffffffff8fc8a021>] SyS_ioctl+0xa1/0xc0

while waiting for ubufs->refcount to get to zero.

In crash we see an inprogress ubuf_info generated by handle_tx(), which
is holding a refcount on ubufs, and should release it from
vhost_zerocopy_callback() but yet didn't to it. We also see the
skb_shared_info referencing this ubuf_info with zerocopy flag set. And
finally we see an sk_buff referencing this skb_shared_info. This sk_buff
had passed through validate_xmit_skb_list() as it has ->next == NULL and
->prev == skb. So it also passed dev_hard_start_xmit() already in
sch_direct_xmit(), and by looking on sk_buff's dev's netdev_queue's
trans_start we can see that it happened exactly 40 sec earlier. The
netdev_queue has virtnet_netdev ops so everything happens in virtio_net.

Considering all the above I believe that we have the same thing as in
tun devices, seems that after virtio_net xmit the sk_buff can be held
for long and thus we need to orphan frags to release ubufs earlier.

See similar patches for tun:
868eefeb17d4 ("tun: orphan frags on xmit")
149d36f7187c ("tun: report orphan frags errors to zero copy callback")

https://pmc.acronis.com/browse/VSTOR-52993

Signed-off-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
---
  drivers/net/virtio_net.c | 5 +++++
  1 file changed, 5 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 652f5b1b218c..5d22e19ad161 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -790,16 +790,21 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct 
net_device *dev)
        /* Free up any pending old buffers before queueing new ones. */
        free_old_xmit_skbs(sq);
+ if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
+               goto drop;
+

This smells like the SKB data will be copied for EACH skb and thus
"zero copy" will not be zero copy anymore.

Am I correct?

        /* Try to transmit */
        err = xmit_skb(sq, skb);
/* This should not happen! */
        if (unlikely(err)) {
+drop:
                dev->stats.tx_fifo_errors++;
                if (net_ratelimit())
                        dev_warn(&dev->dev,
                                 "Unexpected TXQ (%d) queue failure: %d\n", 
qnum, err);
                dev->stats.tx_dropped++;
+               skb_tx_error(skb);
                kfree_skb(skb);
                return NETDEV_TX_OK;
        }

_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel

Reply via email to