On Thu, Mar 03, 2016 at 04:21:19PM +0000, Xie, Huawei wrote: > On 2/18/2016 9:48 PM, Yuanhan Liu wrote: > > The current rte_vhost_dequeue_burst() implementation is a bit messy > > and logic twisted. And you could see repeat code here and there: it > > invokes rte_pktmbuf_alloc() three times at three different places! > > > > However, rte_vhost_dequeue_burst() acutally does a simple job: copy > > the packet data from vring desc to mbuf. What's tricky here is: > > > > - desc buff could be chained (by desc->next field), so that you need > > fetch next one if current is wholly drained. > > > > - One mbuf could not be big enough to hold all desc buff, hence you > > need to chain the mbuf as well, by the mbuf->next field. > > > > Even though, the logic could be simple. Here is the pseudo code. > > > > while (this_desc_is_not_drained_totally || has_next_desc) { > > if (this_desc_has_drained_totally) { > > this_desc = next_desc(); > > } > > > > if (mbuf_has_no_room) { > > mbuf = allocate_a_new_mbuf(); > > } > > > > COPY(mbuf, desc); > > } > > > > And this is how I refactored rte_vhost_dequeue_burst. > > > > Note that the old patch does a special handling for skipping virtio > > header. However, that could be simply done by adjusting desc_avail > > and desc_offset var: > > > > desc_avail = desc->len - vq->vhost_hlen; > > desc_offset = vq->vhost_hlen; > > > > This refactor makes the code much more readable (IMO), yet it reduces > > binary code size (nearly 2K). > > > > Signed-off-by: Yuanhan Liu <yuanhan.liu at linux.intel.com> > > --- > > > > v2: - fix potential NULL dereference bug of var "prev" and "head" > > --- > > lib/librte_vhost/vhost_rxtx.c | 297 > > +++++++++++++++++------------------------- > > 1 file changed, 116 insertions(+), 181 deletions(-) > > > > diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c > > index 5e7e5b1..d5cd0fa 100644 > > --- a/lib/librte_vhost/vhost_rxtx.c > > +++ b/lib/librte_vhost/vhost_rxtx.c > > @@ -702,21 +702,104 @@ vhost_dequeue_offload(struct virtio_net_hdr *hdr, > > struct rte_mbuf *m) > > } > > } > > > > +static inline struct rte_mbuf * > > +copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, > > + uint16_t desc_idx, struct rte_mempool *mbuf_pool) > > +{ > > + struct vring_desc *desc; > > + uint64_t desc_addr; > > + uint32_t desc_avail, desc_offset; > > + uint32_t mbuf_avail, mbuf_offset; > > + uint32_t cpy_len; > > + struct rte_mbuf *head = NULL; > > + struct rte_mbuf *cur = NULL, *prev = NULL; > > + struct virtio_net_hdr *hdr; > > + > > + desc = &vq->desc[desc_idx]; > > + desc_addr = gpa_to_vva(dev, desc->addr); > > + rte_prefetch0((void *)(uintptr_t)desc_addr); > > + > > + /* Retrieve virtio net header */ > > + hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr); > > + desc_avail = desc->len - vq->vhost_hlen; > > There is a serious bug here, desc->len - vq->vhost_len could overflow. > VM could easily create this case. Let us fix it here.
Nope, this issue has been there since the beginning, and this patch is a refactor: we should not bring any functional changes. Therefore, we should not fix it here. And actually, it's been fixed in the 6th patch in this series: [PATCH v2 6/7] vhost: do sanity check for desc->len --yliu