> -----Original Message-----
> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
> Sent: Tuesday, May 17, 2016 4:09 PM
> To: dev@openvswitch.org; Daniele Di Proietto <diproiet...@vmware.com>
> Cc: Dyasly Sergey <s.dya...@samsung.com>; Heetae Ahn
> <heetae82....@samsung.com>; Flavio Leitner <f...@sysclose.org>;
> Traynor, Kevin <kevin.tray...@intel.com>; Pravin B Shelar
> <pshe...@ovn.org>; Ilya Maximets <i.maxim...@samsung.com>
> Subject: [PATCH] netdev-dpdk: Fix PMD threads hang in
> __netdev_dpdk_vhost_send().
> 
> There are situations when a PMD thread can hang forever inside
> __netdev_dpdk_vhost_send() because of a broken virtqueue ring.
> 
> This happens if rte_vring_available_entries() is always positive and
> rte_vhost_enqueue_burst() can't send anything (possible with a broken
> ring).
> 
> In this case the time expiration is never checked and the
> 'do {} while (cnt)' loop becomes infinite.
> 
> This scenario is sometimes reproducible with dpdk-16.04-rc2 inside the
> guest VM.
> It may also be reproduced by manually breaking the ring structure
> inside the guest VM.
> 
> Fix that by checking time expiration even if we have available
> entries.

Hi Ilya,

Thanks for catching this. This intersects with something else I've seen
wrt the retry code, and there are a few options...

1. Remove retries when nothing is sent. For the VM that needs retries it is a
good thing to have, but Bhanu and I recently saw in a test with multiple VMs
that if one VM causes a lot of retries there is a large performance degradation
for the other VMs. So I changed the retry to only occur when at least one
packet has been sent on the previous call. I put a patch up here:
http://openvswitch.org/pipermail/dev/2016-May/071517.html
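
For reference, a rough sketch of that idea (illustrative only, not the exact
code from the patch linked above; virtio_dev, vhost_qid, cur_pkts and cnt are
the variables already used in __netdev_dpdk_vhost_send()):

    /* Keep retrying rte_vhost_enqueue_burst() only while the previous
     * call managed to enqueue at least one packet; if a call sends
     * nothing, give up instead of spinning. */
    do {
        unsigned int tx_pkts;

        tx_pkts = rte_vhost_enqueue_burst(virtio_dev, vhost_qid,
                                          cur_pkts, cnt);
        if (tx_pkts) {
            /* Some packets were sent; retry with the remainder. */
            cnt -= tx_pkts;
            cur_pkts = &cur_pkts[tx_pkts];
        } else {
            /* Nothing was sent on this call: stop retrying so one
             * stuck vhost port cannot stall other ports served by
             * the same PMD thread. */
            break;
        }
    } while (cnt);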

If we keep retries we can either

2. Make more robust coordination between rte_vring_available_entries() and
rte_vhost_enqueue_burst(), as per your patch.

3. As you've shown that we can't rely on rte_vring_available_entries() to know
whether we can enqueue, how about just removing it and using
rte_vhost_enqueue_burst() directly in the retry loop.
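
Roughly something along these lines (just a sketch of the direction, reusing
the timeout handling from your patch; untested):

    /* Drop the rte_vring_available_entries() check and bound the retry
     * purely by time, letting rte_vhost_enqueue_burst() itself report
     * whether anything could be enqueued. */
    uint64_t timeout = VHOST_ENQ_RETRY_USECS * rte_get_timer_hz() / 1E6;
    uint64_t start = rte_get_timer_cycles();

    do {
        unsigned int tx_pkts;

        tx_pkts = rte_vhost_enqueue_burst(virtio_dev, vhost_qid,
                                          cur_pkts, cnt);
        cnt -= tx_pkts;
        cur_pkts = &cur_pkts[tx_pkts];
    } while (cnt && (rte_get_timer_cycles() - start) <= timeout);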

My preference would be for 1., because on balance I'd rather one VM did not
degrade the performance of others than have it keep retrying. Of course there
could be some compromise between them as well, i.e. reduce the number of
retries, but any retries could still affect performance for another path if
they are using the same core.
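
By "reduce the number of retries" I mean something as simple as a retry cap,
e.g. (sketch only, VHOST_ENQ_MAX_RETRIES would be a new, hypothetical
constant):

    /* Keep retries, but bound them by a fixed number of attempts
     * instead of retrying until a timeout expires.
     * VHOST_ENQ_MAX_RETRIES does not exist today; it is just a name
     * for the hypothetical knob. */
    int retries = 0;

    do {
        unsigned int tx_pkts;

        tx_pkts = rte_vhost_enqueue_burst(virtio_dev, vhost_qid,
                                          cur_pkts, cnt);
        cnt -= tx_pkts;
        cur_pkts = &cur_pkts[tx_pkts];
    } while (cnt && ++retries <= VHOST_ENQ_MAX_RETRIES);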

What do you think?

Kevin. 

> 
> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
> ---
> 
> How to reproduce manually:
> 
> * Start packet flow.
> 
> * Start testpmd inside guest VM under gdb:
>       # gdb --args ./testpmd -c 0x1f -n 2 --socket-mem=2048 -w 0000:00:0a.0 \\
>         -- --burst=64 --txd=512 --rxd=512
> 
> * Break virtqueue ring somehow:
>       ^C
>       # (gdb) break virtqueue_enqueue_xmit
>       # (gdb) p txvq->vq_ring.avail->idx
>       $1 = 30784
>       # (gdb) p &txvq->vq_ring.avail->idx
>       $2 = (uint16_t *) 0x7fff7ea9b002
>       # (gdb) set {uint16_t}0x7fff7ea9b002 = 0
>       # (gdb) disable 1
>       # (gdb) c
>       Continuing.
> 
> * Forcibly stop testpmd:
>       ^C
>       # (gdb) quit
>       A debugging session is active.
>       Quit anyway? (y or n) y
> 
> * Start testpmd inside VM again and see results:
>       # gdb --args ./testpmd -c 0x1f -n 2 --socket-mem=2048 -w 0000:00:0a.0 \\
>         -- --burst=64 --txd=512 --rxd=512
>       EAL: Detected 5 lcore(s)
>       EAL: Probing VFIO support...
>       EAL: PCI device 0000:00:0a.0 on NUMA socket -1
>       EAL:   probe driver: 1af4:1000 rte_virtio_pmd
>       *******SSH hangs and QEMU crashes here*******
> 
> * On the OVS side we can see hang of PMD thread:
>       |00003|ovs_rcu(urcu2)|WARN|blocked 4005 ms waiting for pmd54 to quiesce
>       |00004|ovs_rcu(urcu2)|WARN|blocked 8005 ms waiting for pmd54 to quiesce
>       |00005|ovs_rcu(urcu2)|WARN|blocked 16004 ms waiting for pmd54 to quiesce
>       ....
> 
> 
> lib/netdev-dpdk.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 87879d5..aef6ea4 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1367,24 +1367,24 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>              cur_pkts = &cur_pkts[tx_pkts];
>          } else {
>              uint64_t timeout = VHOST_ENQ_RETRY_USECS * rte_get_timer_hz() / 1E6;
> -            unsigned int expired = 0;
> +            bool expired = false;
> 
> -            if (!start) {
> +            if (start) {
> +                expired = (rte_get_timer_cycles() - start) > timeout;
> +            } else {
>                  start = rte_get_timer_cycles();
>              }
> -
>              /*
>               * Unable to enqueue packets to vhost interface.
>               * Check available entries before retrying.
>               */
> -            while (!rte_vring_available_entries(virtio_dev, vhost_qid)) {
> -                if (OVS_UNLIKELY((rte_get_timer_cycles() - start) > timeout)) {
> -                    expired = 1;
> -                    break;
> -                }
> +            while (!rte_vring_available_entries(virtio_dev, vhost_qid)
> +                   && !expired) {
> +                expired = (rte_get_timer_cycles() - start) > timeout;
>              }
> +
>              if (expired) {
> -                /* break out of main loop. */
> +                /* Retrying took too much time. Break out of main loop. */
>                  break;
>              }
>          }
> --
> 2.5.0
