On Fri, Sep 21, 2018 at 02:37:32PM +0200, Jens Freimann wrote:
> On Fri, Sep 21, 2018 at 08:26:58PM +0800, Tiwei Bie wrote:
> > On Fri, Sep 21, 2018 at 12:33:03PM +0200, Jens Freimann wrote:
> > [...]
> > > 
> > >  static inline int
> > > -desc_is_used(struct vring_desc_packed *desc, struct vring *vr)
> > > +_desc_is_used(struct vring_desc_packed *desc)
> > >  {
> > >   uint16_t used, avail;
> > > 
> > >   used = !!(desc->flags & VRING_DESC_F_USED(1));
> > >   avail = !!(desc->flags & VRING_DESC_F_AVAIL(1));
> > > 
> > > - return used == avail && used == vr->used_wrap_counter;
> > > + return used == avail;
> > > +
> > > +}
> > > +
> > > +static inline int
> > > +desc_is_used(struct vring_desc_packed *desc, struct vring *vr)
> > > +{
> > > + uint16_t used;
> > > +
> > > + used = !!(desc->flags & VRING_DESC_F_USED(1));
> > > +
> > > + return _desc_is_used(desc) && used == vr->used_wrap_counter;
> > >  }
> > > 
> > >  /* The standard layout for the ring is a continuous chunk of memory which
> > > diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> > > index eb891433e..ea6300563 100644
> > > --- a/drivers/net/virtio/virtio_rxtx.c
> > > +++ b/drivers/net/virtio/virtio_rxtx.c
> > > @@ -38,6 +38,7 @@
> > >  #define  VIRTIO_DUMP_PACKET(m, len) do { } while (0)
> > >  #endif
> > > 
> > > +
> > >  int
> > >  virtio_dev_rx_queue_done(void *rxq, uint16_t offset)
> > >  {
> > > @@ -165,6 +166,31 @@ virtqueue_dequeue_rx_inorder(struct virtqueue *vq,
> > >  #endif
> > > 
> > >  /* Cleanup from completed transmits. */
> > > +static void
> > > +virtio_xmit_cleanup_packed(struct virtqueue *vq)
> > > +{
> > > + uint16_t idx;
> > > + uint16_t size = vq->vq_nentries;
> > > + struct vring_desc_packed *desc = vq->vq_ring.desc_packed;
> > > + struct vq_desc_extra *dxp;
> > > +
> > > + idx = vq->vq_used_cons_idx;
> > > + while (_desc_is_used(&desc[idx]) &&
> > 
> > We can't just compare the AVAIL bit and USED bit to
> > check whether a desc is used.
> 
> We can't compare with the current wrap counter value either,
> because it won't match the flags in the descriptors. So check against
> used_wrap_counter ^= 1 then?

I haven't looked into this series yet, so I'm not sure what the
best way is to get the wrap counter needed here. But yes, you
need some way to get the right wrap counter for this check.
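
To illustrate the point about the wrap counter, here is a rough,
untested sketch, not code from this series. It assumes the cleanup
path keeps its own wrap counter for the position it is reclaiming
from and flips it whenever that index wraps. The flag bit positions
(7 for AVAIL, 15 for USED) are from the virtio 1.1 packed ring
layout; every sketch_* name is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bit positions from the virtio 1.1 packed ring layout. */
#define DESC_F_AVAIL (UINT16_C(1) << 7)
#define DESC_F_USED  (UINT16_C(1) << 15)

/* Hypothetical, reduced stand-ins for the driver structures. */
struct sketch_desc {
	uint16_t flags;
};

struct sketch_vq {
	struct sketch_desc *desc;
	uint16_t size;          /* number of ring entries */
	uint16_t used_cons_idx; /* next descriptor to reclaim */
	uint16_t used_wrap;     /* wrap counter for used_cons_idx, starts at 1 */
};

/* A descriptor is used when AVAIL == USED == the wrap counter for its slot. */
static inline int
sketch_desc_is_used(const struct sketch_desc *d, uint16_t wrap)
{
	uint16_t avail = !!(d->flags & DESC_F_AVAIL);
	uint16_t used = !!(d->flags & DESC_F_USED);

	return avail == used && used == wrap;
}

/* Reclaim used descriptors in ring order; returns how many were reclaimed. */
static uint16_t
sketch_xmit_cleanup(struct sketch_vq *vq)
{
	uint16_t n = 0;

	assert(vq->used_cons_idx < vq->size);
	while (n < vq->size &&
	       sketch_desc_is_used(&vq->desc[vq->used_cons_idx],
				   vq->used_wrap)) {
		n++;
		if (++vq->used_cons_idx >= vq->size) {
			vq->used_cons_idx = 0;
			vq->used_wrap ^= 1; /* crossed the end of the ring */
		}
	}
	return n;
}
```

With a check like this, a freshly initialized ring (all flags zero)
is not reported as used while the counter is 1, which is the case
that comparing only the AVAIL and USED bits gets wrong.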

> > 
> > > +        vq->vq_free_cnt < size) {
> > > +         dxp = &vq->vq_descx[idx];
> > 
> > The code still assumes that the descs will be written
> > back by the device in order. The vq->vq_descx[] needs to
> > be managed, e.g. as a list, to support out-of-order
> > processing. IOW, we can't assume vq->vq_descx[idx]
> > corresponds to desc[idx] when the device may write
> > back the descs out of order.
> 
> I changed it to not assume this in other spots but missed this one.  I
> will check more carefully and add code to make vq_descx entries a list.

After making it support out-of-order completion, we may want to
run some performance tests on the Tx path alone, because I suspect
that, with this change, the packed ring may not deliver the expected
performance improvement when the device is faster than the driver.
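
One possible shape for the out-of-order bookkeeping, again only an
untested sketch with made-up sketch_* names: index the per-buffer
state by buffer id instead of ring position, and chain the free ids
into a list. It assumes the driver puts the id into the descriptor
and reads it back from the used descriptor the device writes, as the
packed ring layout provides for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ID_NONE UINT16_MAX

/* Hypothetical per-buffer state, indexed by buffer id, not ring slot. */
struct sketch_extra {
	void *cookie;    /* e.g. the mbuf to free on completion */
	uint16_t ndescs; /* descriptors consumed by this buffer */
	uint16_t next;   /* next free id, or SKETCH_ID_NONE */
};

struct sketch_txq {
	struct sketch_extra *extra;
	uint16_t size;
	uint16_t free_head; /* head of the free id list */
	uint16_t free_cnt;  /* free descriptors */
};

/* Chain every id into the free list. */
static void
sketch_init(struct sketch_txq *q, struct sketch_extra *extra, uint16_t size)
{
	uint16_t i;

	assert(size > 0 && size < SKETCH_ID_NONE);
	q->extra = extra;
	q->size = size;
	q->free_head = 0;
	q->free_cnt = size;
	for (i = 0; i < size; i++) {
		extra[i].cookie = NULL;
		extra[i].ndescs = 0;
		extra[i].next = (i + 1 < size) ?
				(uint16_t)(i + 1) : SKETCH_ID_NONE;
	}
}

/* Take a free id for a buffer that will occupy ndescs descriptors. */
static uint16_t
sketch_id_get(struct sketch_txq *q, void *cookie, uint16_t ndescs)
{
	uint16_t id = q->free_head;

	if (id == SKETCH_ID_NONE || ndescs > q->free_cnt)
		return SKETCH_ID_NONE;
	q->free_head = q->extra[id].next;
	q->extra[id].cookie = cookie;
	q->extra[id].ndescs = ndescs;
	q->extra[id].next = SKETCH_ID_NONE;
	q->free_cnt -= ndescs;
	return id;
}

/* Release the id read from a used descriptor; ids may arrive in any order. */
static void *
sketch_id_put(struct sketch_txq *q, uint16_t id)
{
	assert(id < q->size);
	struct sketch_extra *e = &q->extra[id];
	void *cookie = e->cookie;

	e->cookie = NULL;
	q->free_cnt += e->ndescs;
	e->ndescs = 0;
	e->next = q->free_head;
	q->free_head = id;
	return cookie;
}
```

The extra lookup by id on every completion is part of the per-packet
cost that a Tx-only test would have to measure.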

Thanks
