On Thu, Sep 15, 2016 at 06:38:06PM +0200, Maxime Coquelin wrote: > >>>+static inline void __attribute__((always_inline)) > >>>+flush_used_ring(struct virtio_net *dev, struct vhost_virtqueue *vq, > >>>+ uint32_t used_idx_start) > >>>+{ > >>>+ if (used_idx_start + vq->shadow_used_idx < vq->size) { > >>>+ rte_memcpy(&vq->used->ring[used_idx_start], > >>>+ &vq->shadow_used_ring[0], > >>>+ vq->shadow_used_idx * > >>>+ sizeof(struct vring_used_elem)); > >>>+ vhost_log_used_vring(dev, vq, > >>>+ offsetof(struct vring_used, > >>>+ ring[used_idx_start]), > >>>+ vq->shadow_used_idx * > >>>+ sizeof(struct vring_used_elem)); > >>>+ } else { > >>>+ uint32_t part_1 = vq->size - used_idx_start; > >>>+ uint32_t part_2 = vq->shadow_used_idx - part_1; > >>>+ > >>>+ rte_memcpy(&vq->used->ring[used_idx_start], > >>>+ &vq->shadow_used_ring[0], > >>>+ part_1 * > >>>+ sizeof(struct vring_used_elem)); > >>>+ vhost_log_used_vring(dev, vq, > >>>+ offsetof(struct vring_used, > >>>+ ring[used_idx_start]), > >>>+ part_1 * > >>>+ sizeof(struct vring_used_elem)); > >>>+ rte_memcpy(&vq->used->ring[0], > >>>+ &vq->shadow_used_ring[part_1], > >>>+ part_2 * > >>>+ sizeof(struct vring_used_elem)); > >>>+ vhost_log_used_vring(dev, vq, > >>>+ offsetof(struct vring_used, > >>>+ ring[0]), > >>>+ part_2 * > >>>+ sizeof(struct vring_used_elem)); > >>>+ } > >>> } > >>Is expanding the code done for performance purpose? > > > >Hi Maxime, > > > >Yes theoretically this has the least branch number. > >And I think the logic is simpler this way. > Ok, in that case, maybe you could create a function to > do the rte_memcpy and the vhost_log_used on a given range.
Agreed, that will be better; it could avoid repeating similar code block 3 times. > I don't have a strong opinion on this, if Yuanhan is fine > with current code, that's ok for me. >From what I know, that's kind of DPDK prefered way, to expand code when necessary. For example, 9ec201f5d6e7 ("mbuf: provide bulk allocation"). So I'm fine with it. --yliu