On Fri, Sep 21, 2018 at 02:37:32PM +0200, Jens Freimann wrote: > On Fri, Sep 21, 2018 at 08:26:58PM +0800, Tiwei Bie wrote: > > On Fri, Sep 21, 2018 at 12:33:03PM +0200, Jens Freimann wrote: > > [...] > > > > > > static inline int > > > -desc_is_used(struct vring_desc_packed *desc, struct vring *vr) > > > +_desc_is_used(struct vring_desc_packed *desc) > > > { > > > uint16_t used, avail; > > > > > > used = !!(desc->flags & VRING_DESC_F_USED(1)); > > > avail = !!(desc->flags & VRING_DESC_F_AVAIL(1)); > > > > > > - return used == avail && used == vr->used_wrap_counter; > > > + return used == avail; > > > + > > > +} > > > + > > > +static inline int > > > +desc_is_used(struct vring_desc_packed *desc, struct vring *vr) > > > +{ > > > + uint16_t used; > > > + > > > + used = !!(desc->flags & VRING_DESC_F_USED(1)); > > > + > > > + return _desc_is_used(desc) && used == vr->used_wrap_counter; > > > } > > > > > > /* The standard layout for the ring is a continuous chunk of memory which > > > diff --git a/drivers/net/virtio/virtio_rxtx.c > > > b/drivers/net/virtio/virtio_rxtx.c > > > index eb891433e..ea6300563 100644 > > > --- a/drivers/net/virtio/virtio_rxtx.c > > > +++ b/drivers/net/virtio/virtio_rxtx.c > > > @@ -38,6 +38,7 @@ > > > #define VIRTIO_DUMP_PACKET(m, len) do { } while (0) > > > #endif > > > > > > + > > > int > > > virtio_dev_rx_queue_done(void *rxq, uint16_t offset) > > > { > > > @@ -165,6 +166,31 @@ virtqueue_dequeue_rx_inorder(struct virtqueue *vq, > > > #endif > > > > > > /* Cleanup from completed transmits. */ > > > +static void > > > +virtio_xmit_cleanup_packed(struct virtqueue *vq) > > > +{ > > > + uint16_t idx; > > > + uint16_t size = vq->vq_nentries; > > > + struct vring_desc_packed *desc = vq->vq_ring.desc_packed; > > > + struct vq_desc_extra *dxp; > > > + > > > + idx = vq->vq_used_cons_idx; > > > + while (_desc_is_used(&desc[idx]) && > > > > We can't just compare the AVAIL bit and USED bit to > > check whether a desc is used. > > We can't compare with the current wrap counter value as well > because it won't match the flags in descriptors. So check against > used_wrap_counter ^= 1 then?
I haven't looked into this series yet, so I'm not sure what's the best way to get the wrap-counter we need here. But, yes, you need some way to get the wrap-counter we should use here. > > > > > + vq->vq_free_cnt < size) { > > > + dxp = &vq->vq_descx[idx]; > > > > The code is still assuming the descs will be written > > back by device in order. The vq->vq_descx[] needs to > > be managed e.g. as a list to support the out-of-order > > processing. IOW, we can't assume vq->vq_descx[idx] > > is corresponding to desc[idx] when device may write > > back the descs out of order. > > I changed it to not assume this in other spots but missed this one. I > will check more carefully and add code to make vq_descx entries a list. After making it support the out-of-order, we may want to do some performance test for the Tx path only. Because I suspect we may not be able to get the expected performance improvements in packed ring due to this when device is faster than driver. Thanks