> -----Original Message-----
> From: Gavin Hu (Arm Technology China) [mailto:gavin...@arm.com]
> Sent: Tuesday, September 10, 2019 5:49 PM
> To: Wang, Yinan <yinan.w...@intel.com>; Maxime Coquelin
> <maxime.coque...@redhat.com>; Joyce Kong (Arm Technology China)
> <joyce.k...@arm.com>; dev@dpdk.org
> Cc: nd <n...@arm.com>; Bie, Tiwei <tiwei....@intel.com>; Wang, Zhihong
> <zhihong.w...@intel.com>; amore...@redhat.com; Wang, Xiao W
> <xiao.w.w...@intel.com>; Liu, Yong <yong....@intel.com>;
> jfreim...@redhat.com; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>;
> Steve Capper <steve.cap...@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v3 1/2] virtio: one way barrier for packed
> vring desc avail flags
> 
> Hi Yinan,
> 
> We have done a comparative analysis and found that, with the old code, the
> if (weak_barriers) and else branches were optimized away on x86 because
> rte_smp_wmb and rte_cio_wmb are identical there.
> http://git.dpdk.org/dpdk/tree/drivers/net/virtio/virtqueue.h#n49
> With the new code, with Joyce's patches applied, the branches are no longer
> optimized away, which requires additional CPU cycles and causes the slight
> degradation on x86.
> 
> The patches improved performance on aarch64 by about 9%, as indicated in
> the cover letter. While I think over a solution to the degradation on
> x86, could you help answer:
> 1. Is rte_cio_wmb sufficient for the non-weak-barrier case (HW
> offloading)?
>  I ask because in the Intel NIC PMDs it is almost never used; rte_wmb is
> more widely used to notify the NIC device. Is there any difference between
> a virtio-ring-compatible smartNIC device (or vDPA?) and i40e-like devices?

Hi Gavin,
The x86 architecture guarantees that a younger store is not observed before
an older store, so rte_cio_wmb is just a compiler memory barrier on x86.
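
To make the point about the folded branches concrete, here is a minimal
sketch. It assumes, as discussed above, that both write barriers reduce to a
compiler barrier on x86. The sketch_* names and the descriptor layout are
hypothetical stand-ins for illustration, not the DPDK definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins, NOT the DPDK macros: on x86 both barriers are
 * assumed to be nothing more than a compiler barrier. */
#define sketch_compiler_barrier() __asm__ volatile("" ::: "memory")
#define sketch_smp_wmb() sketch_compiler_barrier()
#define sketch_cio_wmb() sketch_compiler_barrier()

/* Illustrative descriptor layout only. */
struct sketch_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

/* Both arms expand to the same statement, so an optimizing compiler is
 * free to drop the test of weak_barriers entirely. */
static inline void sketch_wmb(int weak_barriers)
{
    if (weak_barriers)
        sketch_smp_wmb();
    else
        sketch_cio_wmb();
}

/* Write the descriptor body first, then publish it by writing flags. */
static void sketch_publish(struct sketch_desc *d, int weak_barriers,
                           uint64_t addr, uint32_t len, uint16_t flags)
{
    d->addr = addr;
    d->len = len;
    sketch_wmb(weak_barriers);
    d->flags = flags;
}
```

Once one arm becomes something different from the other (for example a C11
release store, or a real fence instruction), the two arms no longer match and
the test of weak_barriers has to stay in the generated code.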

I think a compiler barrier is also enough in the PMD; rte_wmb is used there
because it was inherited from the first implementation :)

Thanks,
Marvin

> 2. If rte_cio_wmb is not sufficient for this case and is replaced by a
> stronger barrier, such as sfence, then the compiler can no longer optimize
> the branches away, and the problem becomes one of correct barrier use
> rather than of the degradation.
> 
> Any comments are welcome!
> 
> Best Regards,
> Gavin
> 
> > -----Original Message-----
> > From: Wang, Yinan <yinan.w...@intel.com>
> > Sent: Tuesday, September 10, 2019 11:54 AM
> > To: Maxime Coquelin <maxime.coque...@redhat.com>; Joyce Kong (Arm
> > Technology China) <joyce.k...@arm.com>; dev@dpdk.org
> > Cc: nd <n...@arm.com>; Bie, Tiwei <tiwei....@intel.com>; Wang, Zhihong
> > <zhihong.w...@intel.com>; amore...@redhat.com; Wang, Xiao W
> > <xiao.w.w...@intel.com>; Liu, Yong <yong....@intel.com>;
> > jfreim...@redhat.com; Honnappa Nagarahalli
> > <honnappa.nagaraha...@arm.com>; Gavin Hu (Arm Technology China)
> > <gavin...@arm.com>
> > Subject: RE: [dpdk-dev] [PATCH v3 1/2] virtio: one way barrier for packed
> > vring desc avail flags
> >
> >
> > Hi Joyce,
> >
> > I just tested the performance impact of your patch set on the code base at
> > commit d03d8622db48918d14bfe805641b1766ecc40088. After applying your v3
> > patch set, seven paths of the vhost/virtio PVP test show a performance
> > drop, as below:
> >
> > PVP vhost/virtio 1c1q test            before patch   after patch
> > test_perf_pvp_inorder_mergeable       7.603          7.474
> > test_perf_pvp_inorder_no_mergeable    7.642          7.525
> > test_perf_pvp_mergeable               7.556          7.431
> > test_perf_pvp_normal                  7.554          7.478
> > test_perf_pvp_vector_rx               7.581          7.469
> > test_perf_pvp_virtio11_mergeable      7.068          6.905
> > test_perf_pvp_virtio11_normal         7.088          6.888
> >
> > Thanks,
> > Yinan
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Maxime Coquelin
> > > > Sent: Monday, September 9, 2019 6:10 PM
> > > To: Joyce Kong <joyce.k...@arm.com>; dev@dpdk.org
> > > Cc: n...@arm.com; Bie, Tiwei <tiwei....@intel.com>; Wang, Zhihong
> > > <zhihong.w...@intel.com>; amore...@redhat.com; Wang, Xiao W
> > > <xiao.w.w...@intel.com>; Liu, Yong <yong....@intel.com>;
> > > > jfreim...@redhat.com; honnappa.nagaraha...@arm.com;
> > > > gavin...@arm.com
> > > > Subject: Re: [dpdk-dev] [PATCH v3 1/2] virtio: one way barrier for
> > > > packed vring desc avail flags
> > >
> > >
> > >
> > > On 9/9/19 11:14 AM, Joyce Kong wrote:
> > > > > If VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, the frontend and
> > > > > backend are assumed to be implemented in software, that is, they
> > > > > can run on identical CPUs in an SMP configuration.
> > > > > Thus a weak form of memory barrier such as rte_smp_r/wmb, rather than
> > > > > rte_cio_r/wmb, is sufficient for this case (vq->hw->weak_barriers == 1)
> > > > > and yields better performance.
> > > > > For the above case, this patch yields even better performance
> > > > > by replacing the two-way barriers with C11 one-way barriers for the
> > > > > avail flags in the packed ring.
> > > >
> > > > > Meanwhile, a read barrier is required to ensure ordering between the
> > > > > reads of a descriptor's flags and of its content[1]. With C11, a
> > > > > load-acquire can enforce this ordering instead of an rmb barrier.
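
As an illustration of the one-way barriers described in the patch description
above, here is a rough sketch using plain C11 <stdatomic.h> rather than any
DPDK API. The struct layout, the SKETCH_AVAIL bit, and the function names are
hypothetical and are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_AVAIL 0x80u /* hypothetical "available" bit */

/* Illustrative descriptor; only flags is accessed atomically. */
struct sketch_packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    _Atomic uint16_t flags;
};

/* Producer: fill the body, then publish with a store-release so the body
 * writes cannot be reordered after the flags write. */
static void sketch_produce(struct sketch_packed_desc *d,
                           uint64_t addr, uint32_t len, uint16_t id)
{
    d->addr = addr;
    d->len = len;
    d->id = id;
    atomic_store_explicit(&d->flags, (uint16_t)SKETCH_AVAIL,
                          memory_order_release);
}

/* Consumer: a load-acquire on flags orders the later body reads after it,
 * which is the job the separate rmb used to do. */
static bool sketch_consume(struct sketch_packed_desc *d,
                           uint64_t *addr, uint32_t *len)
{
    uint16_t flags = atomic_load_explicit(&d->flags, memory_order_acquire);

    if (!(flags & SKETCH_AVAIL))
        return false;
    *addr = d->addr;
    *len = d->len;
    return true;
}
```

The release and acquire each constrain reordering in one direction only,
which is why they can be cheaper than a full two-way barrier on weakly
ordered CPUs such as aarch64.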
> > > >
> > > > [1]https://patchwork.dpdk.org/patch/49109/
> > > >
> > > > Signed-off-by: Joyce Kong <joyce.k...@arm.com>
> > > > Reviewed-by: Gavin Hu <gavin...@arm.com>
> > > > Reviewed-by: Phil Yang <phil.y...@arm.com>
> > > > ---
> > > >  drivers/net/virtio/virtio_rxtx.c                 | 13 +++++++------
> > > >  drivers/net/virtio/virtio_user/virtio_user_dev.c |  6 +++++-
> > > >  drivers/net/virtio/virtqueue.h                   | 11 +++++++++++
> > > >  lib/librte_vhost/vhost.h                         |  2 +-
> > > >  lib/librte_vhost/virtio_net.c                    | 11 +++++------
> > > >  5 files changed, 29 insertions(+), 14 deletions(-)
> > >
> > > Reviewed-by: Maxime Coquelin <maxime.coque...@redhat.com>
> > >
> > > Thanks,
> > > Maxime
