On Wed, Dec 12, 2018 at 05:34:31PM +0100, Maxime Coquelin wrote:
> Hi Ilya,
> 
> On 12/12/18 4:23 PM, Ilya Maximets wrote:
> > On 12.12.2018 11:24, Maxime Coquelin wrote:
> > > Instead of writing back descriptors chains in order, let's
> > > write the first chain flags last in order to improve batching.
> > > 
> > > With Kernel's pktgen benchmark, ~3% performance gain is measured.
> > > 
> > > Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
> > > ---
> > >   lib/librte_vhost/virtio_net.c | 39 +++++++++++++++++++++--------------
> > >   1 file changed, 24 insertions(+), 15 deletions(-)
> > > 
> > 
> > Hi.
> > I made some rough testing on my ARMv8 system with this patch and v1 of it.
> > Here is the performance difference with current master:
> >      v1: +1.1 %
> >      v2: -3.6 %
> > 
> > So, write barriers are quite heavy in practice.
> 
> Thanks for testing it on ARM. Indeed, SMP WMB is heavier on ARM.

Besides your ideas for improving packed rings, maybe we should switch to
load_acquire/store_release?

See
        virtio: use smp_load_acquire/smp_store_release

which worked fine, but since I only tested on x86, it did not result in any gains.
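
To make the suggestion concrete, here is a minimal sketch of the two
publishing styles. It assumes C11 atomics rather than the kernel's
smp_store_release()/smp_load_acquire() or the barrier macros vhost actually
uses. struct desc, DESC_F_USED and the function names are hypothetical and
do not match the real packed ring layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor; NOT the real packed ring layout. */
struct desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	_Atomic uint16_t flags;
};

#define DESC_F_USED 0x1 /* hypothetical "descriptor is used" flag */

/* Style A: standalone release fence (wmb-like), then a relaxed flags store. */
static void publish_with_fence(struct desc *d, uint16_t id, uint32_t len,
			       uint16_t flags)
{
	assert(d != NULL);
	d->id = id;
	d->len = len;
	/* Orders the payload stores above before the flags store below. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->flags, flags, memory_order_relaxed);
}

/* Style B: the ordering is attached to the flags store itself. */
static void publish_with_release(struct desc *d, uint16_t id, uint32_t len,
				 uint16_t flags)
{
	assert(d != NULL);
	d->id = id;
	d->len = len;
	atomic_store_explicit(&d->flags, flags, memory_order_release);
}

/* Consumer side: load-acquire the flags before reading the payload. */
static int consume(struct desc *d, uint16_t *id, uint32_t *len)
{
	uint16_t flags;

	assert(d != NULL && id != NULL && len != NULL);
	flags = atomic_load_explicit(&d->flags, memory_order_acquire);
	if (!(flags & DESC_F_USED))
		return 0; /* not published yet */
	*id = d->id;
	*len = d->len;
	return 1;
}
```

On x86 both styles typically compile down to plain stores plus a compiler
barrier, which would be consistent with seeing no gain there. On ARMv8,
compilers usually emit a store-release instruction for style B instead of a
separate barrier, so that is where a difference might show up; it would have
to be measured.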

-- 
MST
