On 02.06.2016 19:22, Rich Lane wrote:
> On Thu, Jun 2, 2016 at 3:46 AM, Ilya Maximets <i.maximets at samsung.com> wrote:
>
> > Hi, Rich.
> > Thank you for testing and analysing.
> >
> > On 01.06.2016 01:06, Rich Lane wrote:
> > > On Fri, May 20, 2016 at 5:50 AM, Ilya Maximets <i.maximets at samsung.com> wrote:
> > >
> > > > In current implementation guest application can reinitialize vrings
> > > > by executing start after stop. In the same time host application
> > > > can still poll virtqueue while device stopped in guest and it will
> > > > crash with segmentation fault while vring reinitialization because
> > > > of dereferencing of bad descriptor addresses.
> > >
> > > I see a performance regression with this patch at large packet sizes
> > > (> 768 bytes). rte_vhost_enqueue_burst is consuming 10% more cycles.
> > > Strangely, there's actually a ~1% performance improvement at small
> > > packet sizes.
> > >
> > > The regression happens with GCC 4.8.4 and 5.3.0, but not 6.1.1.
> > >
> > > AFAICT this is just the compiler generating bad code. One difference
> > > is that it's storing the offset on the stack instead of in a register.
> > > A workaround is to move the !desc_addr check outside the unlikely
> > > macros.
> > >
> > > > --- a/lib/librte_vhost/vhost_rxtx.c
> > > > +++ b/lib/librte_vhost/vhost_rxtx.c
> > > > @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > > >         struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
> > > >
> > > >         desc = &vq->desc[desc_idx];
> > > > -       if (unlikely(desc->len < vq->vhost_hlen))
> > > > +       desc_addr = gpa_to_vva(dev, desc->addr);
> > > > +       if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
> > >
> > > Workaround: change to
> > > "if (unlikely(desc->len < vq->vhost_hlen) || !desc_addr)".
> > >
> > > >                 return -1;
> > > >
> > > > -       desc_addr = gpa_to_vva(dev, desc->addr);
> > > >         rte_prefetch0((void *)(uintptr_t)desc_addr);
> > > >
> > > >         virtio_enqueue_offload(m, &virtio_hdr.hdr);
> > > > @@ -184,6 +184,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > > >
> > > >                         desc = &vq->desc[desc->next];
> > > >                         desc_addr = gpa_to_vva(dev, desc->addr);
> > > > +                       if (unlikely(!desc_addr))
> > >
> > > Workaround: change to "if (!desc_addr)".
> > >
> > > > +                               return -1;
> > > > +
> > > >                         desc_offset = 0;
> > > >                         desc_avail = desc->len;
> > > >                 }
> >
> > What about other places? Is there same issues or it's only inside
> > copy_mbuf_to_desc() ?
>
> Only copy_mbuf_to_desc has the issue.

Ok.

Actually, I can't reproduce this performance issue using gcc 4.8.5 from RHEL 7.2.
I'm not sure whether I should post a v2 with the above fixes. Maybe they could be
applied while pushing the patch to the repository?

Best regards,
Ilya Maximets.
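
For reference, the two forms of the check discussed in the quoted thread can be compared in isolation. This is only a minimal sketch: check_combined() and check_split() are hypothetical helper names that do not exist in vhost_rxtx.c, the parameters stand in for desc->len, vq->vhost_hlen and desc_addr, and unlikely() is defined locally on top of the GCC/Clang __builtin_expect builtin rather than taken from DPDK headers. Both forms return the same result for every input; they differ only in which part of the condition carries the branch hint, which is what may change the generated code on some compilers.

```c
#include <stdint.h>

/* Local stand-in for the branch hint macro (assumption: GCC or Clang). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Form used in the patch: both tests sit inside the hint. */
static int check_combined(uint32_t desc_len, uint32_t hdr_len,
                          uint64_t desc_addr)
{
    if (unlikely(desc_len < hdr_len || !desc_addr))
        return -1;
    return 0;
}

/* Workaround form: only the length test is hinted,
 * the address test is left as a plain condition. */
static int check_split(uint32_t desc_len, uint32_t hdr_len,
                       uint64_t desc_addr)
{
    if (unlikely(desc_len < hdr_len) || !desc_addr)
        return -1;
    return 0;
}
```

Since the hint does not change semantics, whether the second form is actually faster would have to be confirmed by measurement with the compiler versions in question.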