On Wed, 16 Oct 2019 16:08:54 +0200 Ilya Maximets <i.maxim...@ovn.org> wrote:
> On 16.10.2019 16:02, Flavio Leitner wrote: > > On Wed, 16 Oct 2019 15:46:15 +0200 > > Maxime Coquelin <maxime.coque...@redhat.com> wrote: > > > >> On 10/16/19 3:32 PM, Ilya Maximets wrote: > >>> On 16.10.2019 13:13, Maxime Coquelin wrote: > >>>> > >>>> > >>>> On 10/15/19 8:59 PM, Flavio Leitner wrote: > >>>>> The rte_vhost_dequeue_burst supports two ways of dequeuing data. > >>>>> If the data fits into a buffer, then all data is copied and a > >>>>> single linear buffer is returned. Otherwise it allocates > >>>>> additional mbufs and chains them together to return a multiple > >>>>> segments mbuf. > >>>>> > >>>>> While that covers most use cases, it forces applications that > >>>>> need to work with larger data sizes to support multiple segments > >>>>> mbufs. The non-linear characteristic brings complexity and > >>>>> performance implications to the application. > >>>>> > >>>>> To resolve the issue, add support to attach external buffer > >>>>> to a pktmbuf and let the host provide during registration if > >>>>> attaching an external buffer to pktmbuf is supported and if > >>>>> only linear buffer are supported. > >>>>> > >>>>> Signed-off-by: Flavio Leitner <f...@sysclose.org> > >>>>> --- > >>>>> doc/guides/prog_guide/vhost_lib.rst | 35 +++++++++ > >>>>> lib/librte_vhost/rte_vhost.h | 4 + > >>>>> lib/librte_vhost/socket.c | 22 ++++++ > >>>>> lib/librte_vhost/vhost.c | 22 ++++++ > >>>>> lib/librte_vhost/vhost.h | 4 + > >>>>> lib/librte_vhost/virtio_net.c | 109 > >>>>> ++++++++++++++++++++++++---- 6 files changed, 182 insertions(+), > >>>>> 14 deletions(-) > >>>>> > >>>>> - Changelog: > >>>>> v5: > >>>>> - fixed to destroy mutex if incompatible flags > >>>>> v4: > >>>>> - allow to use pktmbuf if there is exact space > >>>>> - removed log message if the buffer is too big > >>>>> - fixed the length to include align padding > >>>>> - free allocated buf if shinfo fails > >>>>> v3: > >>>>> - prevent the new features to be used with zero copy > >>>>> - fixed sizeof() usage > >>>>> - fixed log msg indentation > >>>>> - removed/replaced asserts > >>>>> - used the correct virt2iova function > >>>>> - fixed the patch's title > >>>>> - OvS PoC code: > >>>>> https://github.com/fleitner/ovs/tree/rte_malloc-v3 > >>>>> v2: > >>>>> - Used rte_malloc() instead of another mempool as > >>>>> suggested by Shahaf. > >>>>> - Added the documentation section. > >>>>> - Using driver registration to negotiate the features. > >>>>> - OvS PoC code: > >>>>> > >>>>> https://github.com/fleitner/ovs/commit/8fc197c40b1d4fda331686a7b919e9e2b670dda7 > >>>>> > >>>> > >>>> Applied to dpdk-next-virtio/master. > >>> > >>> Thanks Maxime, > >>> > >>> But can we return back the mbuf allocation failure message? > >> > >> Good point, I missed it. > >> > >>> I mean: > >>> > >>> diff --git a/lib/librte_vhost/virtio_net.c > >>> b/lib/librte_vhost/virtio_net.c index 66f0c7206..f8af4e0b3 100644 > >>> --- a/lib/librte_vhost/virtio_net.c > >>> +++ b/lib/librte_vhost/virtio_net.c > >>> @@ -1354,8 +1354,11 @@ virtio_dev_pktmbuf_alloc(struct virtio_net > >>> *dev, struct rte_mempool *mp, > >>> { > >>> struct rte_mbuf *pkt = rte_pktmbuf_alloc(mp); > >>> > >>> - if (unlikely(pkt == NULL)) > >>> + if (unlikely(pkt == NULL)) { > >>> + RTE_LOG(ERR, VHOST_DATA, > >>> + "Failed to allocate memory for mbuf.\n"); > >>> return NULL; > >>> + } > >>> > >>> if (rte_pktmbuf_tailroom(pkt) >= data_len) > >>> return pkt; > >>> --- > >>> > >>> It's a hard failure that highlights some significant issues with > >>> memory pool size or a mbuf leak. > >> > >> I agree, we need this error message. > > > > If it runs out of memory and there are many packets coming, then it > > will flood the logs. Maybe we could use something to report in a > > more moderated way, for example: > > > > diff --git a/lib/librte_vhost/virtio_net.c > > b/lib/librte_vhost/virtio_net.c index da69ab1db..b1572b3a4 100644 > > --- a/lib/librte_vhost/virtio_net.c > > +++ b/lib/librte_vhost/virtio_net.c > > @@ -1354,8 +1354,14 @@ virtio_dev_pktmbuf_alloc(struct virtio_net > > *dev, struct rte_mempool *mp, { > > struct rte_mbuf *pkt = rte_pktmbuf_alloc(mp); > > > > - if (unlikely(pkt == NULL)) > > + if (unlikely(pkt == NULL)) { > > + static int rate_lim = 0; > > + > > + if (rate_lim++ % 1000 == 0) > > + RTE_LOG... > > + > > return NULL; > > + } > > > > if (rte_pktmbuf_tailroom(pkt) >= data_len) > > return pkt; > > > > > > Or track this at mempool level and keep stats of failed requests and > > report that in OvS. That would cover other points too. > > OVS is already rete-limiting DPDK logs. Basically, I introduced this > functionality to limit exactly this message. > See: 9fd38f6867df ("netdev-dpdk: Limit rate of DPDK logs.") Thanks, I missed that commit. fbl