> -----Original Message-----
> From: Maxime Coquelin <maxime.coque...@redhat.com>
> Sent: Friday, April 24, 2020 9:36 PM
> To: Liu, Yong <yong....@intel.com>; Ye, Xiaolong <xiaolong...@intel.com>;
> Wang, Zhihong <zhihong.w...@intel.com>
> Cc: dev@dpdk.org; Van Haaren, Harry <harry.van.haa...@intel.com>
> Subject: Re: [PATCH v9 7/9] net/virtio: add vectorized packed ring Tx path
> 
> 
> 
> On 4/24/20 3:33 PM, Liu, Yong wrote:
> >
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coque...@redhat.com>
> >> Sent: Friday, April 24, 2020 8:30 PM
> >> To: Liu, Yong <yong....@intel.com>; Ye, Xiaolong <xiaolong...@intel.com>;
> >> Wang, Zhihong <zhihong.w...@intel.com>
> >> Cc: dev@dpdk.org; Van Haaren, Harry <harry.van.haa...@intel.com>
> >> Subject: Re: [PATCH v9 7/9] net/virtio: add vectorized packed ring Tx path
> >>
> >>
> >>
> >> On 4/24/20 11:24 AM, Marvin Liu wrote:
> >>> Optimize packed ring Tx path alike Rx path. Split Tx path into batch and
> >>
> >> s/alike/like/ ?
> >>
> >>> single Tx functions. Batch function is further optimized by AVX512
> >>> instructions.
> >>>
> >>> Signed-off-by: Marvin Liu <yong....@intel.com>
> >>>
> >>> diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
> >>> index 5c112cac7..b7d52d497 100644
> >>> --- a/drivers/net/virtio/virtio_ethdev.h
> >>> +++ b/drivers/net/virtio/virtio_ethdev.h
> >>> @@ -108,6 +108,9 @@ uint16_t virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> >>>  uint16_t virtio_recv_pkts_packed_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> >>>           uint16_t nb_pkts);
> >>>
> >>> +uint16_t virtio_xmit_pkts_packed_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
> >>> +         uint16_t nb_pkts);
> >>> +
> >>>  int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
> >>>
> >>>  void virtio_interrupt_handler(void *param);
> >>> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> >>> index cf18fe564..f82fe8d64 100644
> >>> --- a/drivers/net/virtio/virtio_rxtx.c
> >>> +++ b/drivers/net/virtio/virtio_rxtx.c
> >>> @@ -2175,3 +2175,11 @@ virtio_recv_pkts_packed_vec(void *rx_queue __rte_unused,
> >>>  {
> >>>   return 0;
> >>>  }
> >>> +
> >>> +__rte_weak uint16_t
> >>> +virtio_xmit_pkts_packed_vec(void *tx_queue __rte_unused,
> >>> +                     struct rte_mbuf **tx_pkts __rte_unused,
> >>> +                     uint16_t nb_pkts __rte_unused)
> >>> +{
> >>> + return 0;
> >>> +}
> >>> diff --git a/drivers/net/virtio/virtio_rxtx_packed_avx.c b/drivers/net/virtio/virtio_rxtx_packed_avx.c
> >>> index 8a7b459eb..c023ace4e 100644
> >>> --- a/drivers/net/virtio/virtio_rxtx_packed_avx.c
> >>> +++ b/drivers/net/virtio/virtio_rxtx_packed_avx.c
> >>> @@ -23,6 +23,24 @@
> >>>  #define PACKED_FLAGS_MASK ((0ULL | VRING_PACKED_DESC_F_AVAIL_USED) << \
> >>>   FLAGS_BITS_OFFSET)
> >>>
> >>> +/* reference count offset in mbuf rearm data */
> >>> +#define REFCNT_BITS_OFFSET ((offsetof(struct rte_mbuf, refcnt) - \
> >>> + offsetof(struct rte_mbuf, rearm_data)) * BYTE_SIZE)
> >>> +/* segment number offset in mbuf rearm data */
> >>> +#define SEG_NUM_BITS_OFFSET ((offsetof(struct rte_mbuf, nb_segs) - \
> >>> + offsetof(struct rte_mbuf, rearm_data)) * BYTE_SIZE)
> >>> +
> >>> +/* default rearm data */
> >>> +#define DEFAULT_REARM_DATA (1ULL << SEG_NUM_BITS_OFFSET | \
> >>> + 1ULL << REFCNT_BITS_OFFSET)
> >>> +
> >>> +/* id bits offset in packed ring desc higher 64bits */
> >>> +#define ID_BITS_OFFSET ((offsetof(struct vring_packed_desc, id) - \
> >>> + offsetof(struct vring_packed_desc, len)) * BYTE_SIZE)
> >>> +
> >>> +/* net hdr short size mask */
> >>> +#define NET_HDR_MASK 0x3F
> >>> +
> >>>  #define PACKED_BATCH_SIZE (RTE_CACHE_LINE_SIZE / \
> >>>   sizeof(struct vring_packed_desc))
> >>>  #define PACKED_BATCH_MASK (PACKED_BATCH_SIZE - 1)
> >>> @@ -47,6 +65,48 @@
> >>>   for (iter = val; iter < num; iter++)
> >>>  #endif
> >>>
> >>> +static inline void
> >>> +virtio_xmit_cleanup_packed_vec(struct virtqueue *vq)
> >>> +{
> >>> + struct vring_packed_desc *desc = vq->vq_packed.ring.desc;
> >>> + struct vq_desc_extra *dxp;
> >>> + uint16_t used_idx, id, curr_id, free_cnt = 0;
> >>> + uint16_t size = vq->vq_nentries;
> >>> + struct rte_mbuf *mbufs[size];
> >>> + uint16_t nb_mbuf = 0, i;
> >>> +
> >>> + used_idx = vq->vq_used_cons_idx;
> >>> +
> >>> + if (!desc_is_used(&desc[used_idx], vq))
> >>> +         return;
> >>> +
> >>> + id = desc[used_idx].id;
> >>> +
> >>> + do {
> >>> +         curr_id = used_idx;
> >>> +         dxp = &vq->vq_descx[used_idx];
> >>> +         used_idx += dxp->ndescs;
> >>> +         free_cnt += dxp->ndescs;
> >>> +
> >>> +         if (dxp->cookie != NULL) {
> >>> +                 mbufs[nb_mbuf] = dxp->cookie;
> >>> +                 dxp->cookie = NULL;
> >>> +                 nb_mbuf++;
> >>> +         }
> >>> +
> >>> +         if (used_idx >= size) {
> >>> +                 used_idx -= size;
> >>> +                 vq->vq_packed.used_wrap_counter ^= 1;
> >>> +         }
> >>> + } while (curr_id != id);
> >>> +
> >>> + for (i = 0; i < nb_mbuf; i++)
> >>> +         rte_pktmbuf_free(mbufs[i]);
> >>> +
> >>> + vq->vq_used_cons_idx = used_idx;
> >>> + vq->vq_free_cnt += free_cnt;
> >>> +}
> >>> +
> >>
> >>
> >> I think you can re-use the inlined non-vectorized cleanup function here,
> >> or use your implementation in the non-vectorized path.
> >> BTW, do you know why we have to pass the num argument in the
> >> non-vectorized case? I don't remember.
> >>
> >
> > Maxime,
> > This is a simplified version of the xmit cleanup function. It is based on
> > the assumption that the backend updates used ids in bursts, which also
> > matches the frontend's requirement.
> 
> And what if the backend doesn't follow that concept?
> Is it just slower, or broken?

It is just slower: more packets may be dropped because there is no free room
in the ring. I will replace the vectorized cleanup with the non-vectorized
version since it has shown good numbers.
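
For reference, a rough sketch of the per-descriptor cleanup shape I have in
mind, modeled on the existing non-vectorized path (the helper name and
details are illustrative, not the final code). Unlike the burst version in
the patch, it stops at the first descriptor that is not yet used, so it
makes no assumption about the backend completing descriptors in bursts:

static inline void
virtio_xmit_cleanup_packed_sketch(struct virtqueue *vq, int num)
{
	struct vring_packed_desc *desc = vq->vq_packed.ring.desc;
	uint16_t used_idx = vq->vq_used_cons_idx;
	uint16_t size = vq->vq_nentries;
	struct vq_desc_extra *dxp;
	uint16_t id;

	/* Check each used descriptor individually; bail out as soon as
	 * one is not yet marked used by the backend. */
	while (num > 0 && desc_is_used(&desc[used_idx], vq)) {
		id = desc[used_idx].id;
		dxp = &vq->vq_descx[id];

		used_idx += dxp->ndescs;
		vq->vq_free_cnt += dxp->ndescs;
		num -= dxp->ndescs;

		/* Wrap around the ring and flip the wrap counter. */
		if (used_idx >= size) {
			used_idx -= size;
			vq->vq_packed.used_wrap_counter ^= 1;
		}

		if (dxp->cookie != NULL) {
			rte_pktmbuf_free(dxp->cookie);
			dxp->cookie = NULL;
		}
	}

	vq->vq_used_cons_idx = used_idx;
}

Which variant wins depends on how the backend batches completions; I will
keep whichever one shows the better numbers in the loopback test.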

> 
> > I just found the original version works better in the loopback case. Will
> > adapt it in the next version.
> >
> > Thanks,
> > Marvin
> >
> >> Maxime
> >
