> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Kyle Larose
> Sent: Wednesday, September 16, 2015 5:05 AM
> To: Thomas Monjalon
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] vhost-net stops sending to virito pmd -- already fixed?
>
> On Sun, Sep 13, 2015 at 5:43 PM, Thomas Monjalon
> <thomas.monjalon at 6wind.com> wrote:
> >
> > Hi,
> >
> > 2015-09-11 12:32, Kyle Larose:
> > > Looking through the version tree for virtio_rxtx.c, I saw the
> > > following commit:
> > >
> > > http://dpdk.org/browse/dpdk/commit/lib/librte_pmd_virtio?id=8c09c20fb4cde76e53d87bd50acf2b441ecf6eb8
> > >
> > > Does anybody know offhand if the issue fixed by that commit could be
> > > the root cause of what I am seeing?
> >
> > I won't have the definitive answer but I would like to use your
> > question to highlight a common issue in git messages:
> >
> > PLEASE, authors of fixes, explain the bug you are fixing and how it
> > can be reproduced. Good commit messages are REALLY read and useful.
> >
> > Thanks
> >
>
> I've figured out what happened. It has nothing to do with the fix I pasted
> above. Instead, the issue has to do with running low on mbufs.
>
> Here's the general logic:
>
> 1. If packets are not queued, return.
> 2. Fetch each queued packet, as an mbuf, into the provided array. This may
>    involve some merging/etc.
> 3. Try to fill the virtio receive ring with new mbufs.
> 3.a. If we fail to allocate an mbuf, break out of the refill loop.
> 4. Update the receive ring information and kick the host.
>
> This is obviously a simplification, but the key point is 3.a. If we hit this
> logic when the virtio receive ring is completely used up, we essentially lock
> up. The host will have no buffers with which to queue packets, so the next
> time we poll, we will hit case 1. However, since we hit case 1, we will not
> allocate mbufs to the virtio receive ring, regardless of how many are now
> free. Rinse and repeat; we are stuck until the pmd is restarted or the link
> is restarted.
>
> This is very easy to reproduce when the mbuf pool is fairly small, and
> packets are being passed to worker threads/processes which may increase the
> length of the pipeline.
>
> I took a quick look at the ixgbe driver, and it looks like it checks if it
> needs to allocate mbufs to the ring before trying to pull packets off the
> nic. Should we not be doing something similar for virtio? Rather than
> breaking out early if no packets are queued, we should first make sure there
> are resources with which to queue packets!
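To make the failure mode concrete, here is a small stand-alone toy model of
the poll logic described in steps 1-4 above. It is plain C, not the actual
driver code, and all names (ring_used, ring_avail, pool_free, guest_poll) are
made up for illustration. Once the ring drains while the pool is empty, every
later poll hits case 1 and returns without refilling, even after mbufs become
free again:

/*
 * Toy model (not DPDK code) of the rx-poll logic described above.
 * ring_avail counts empty descriptors the host may still fill,
 * ring_used counts descriptors holding received packets, and
 * pool_free counts free mbufs in the mempool.
 */
#include <stdio.h>

#define RING_SIZE 8

static int ring_used;   /* descriptors holding received packets  */
static int ring_avail;  /* empty descriptors the host may fill   */
static int pool_free;   /* mbufs currently free in the mempool   */

/* Host side: move packets into any available descriptors. */
static void host_deliver(int pkts)
{
    while (pkts-- && ring_avail) {
        ring_avail--;
        ring_used++;
    }
}

/* Guest rx poll, mirroring steps 1-4 (with the break in 3.a). */
static int guest_poll(void)
{
    int rx = 0;

    if (ring_used == 0)          /* step 1: nothing queued, return    */
        return 0;

    while (ring_used) {          /* step 2: pull packets off the ring */
        ring_used--;
        rx++;                    /* mbuf handed to the application    */
    }

    while (ring_avail + ring_used < RING_SIZE) { /* step 3: refill    */
        if (pool_free == 0)      /* step 3.a: allocation failed       */
            break;
        pool_free--;
        ring_avail++;
    }
    return rx;                   /* step 4: kick the host (omitted)   */
}

int main(void)
{
    ring_avail = RING_SIZE;
    pool_free = 0;               /* pool already exhausted by the app */

    host_deliver(RING_SIZE);     /* host fills every descriptor       */
    printf("polled %d packets\n", guest_poll()); /* drains ring, refill fails */

    pool_free = RING_SIZE;       /* application later frees its mbufs */
    host_deliver(4);             /* host has no descriptors left...   */
    printf("polled %d packets\n", guest_poll()); /* ...step 1 returns 0 forever */
    return 0;
}

Compiled and run, the second poll prints 0 and keeps printing 0, even though
the pool has free mbufs again, which is the lockup described above.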
Try to allocate mbufs and refill the vring descriptors when case 1 is hit;
that would probably address your issue.

>
> One solution here is to increase the mbuf pool to a size where such
> exhaustion is impossible, but that doesn't seem like a graceful solution. For
> example, it may be desirable to drop packets rather than have a large memory
> pool, and becoming stuck in such a situation is not good. Further, it isn't
> easy to know the exact size required. You may end up wasting a bunch of
> resources by allocating far more than necessary, or you may unknowingly
> under-allocate, only to find out once your application has been deployed into
> production, and it's dropping everything on the floor.
>
> Does anyone have thoughts on this? I took a look at virtio_rxtx at head and
> I didn't see anything resembling my suggestion.
>
> Comments would be appreciated. Thanks,
>
> Kyle
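In terms of the toy model above, that suggestion (attempt the refill even when
case 1 is hit, much like ixgbe tops up the ring before pulling packets off the
nic) would look roughly like this. Again, this is only an illustrative sketch
against the toy model, not the actual driver change:

/* Same toy model as above: always attempt the refill, even when nothing is
 * queued, so a poll after the pool recovers hands descriptors back to the
 * host. */
static int guest_poll_refill_first(void)
{
    int rx = 0;

    while (ring_used) {                           /* pull whatever is queued */
        ring_used--;
        rx++;
    }

    while (ring_avail + ring_used < RING_SIZE) {  /* always attempt refill   */
        if (pool_free == 0)
            break;
        pool_free--;
        ring_avail++;
    }
    return rx;
}

With this version, the poll that runs after the application frees its mbufs
refills the ring even though nothing was queued, the host can deliver packets
again, and the next poll returns them instead of staying stuck at case 1.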