Hi Honnappa,

Inline comments...

> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Sent: Saturday, September 19, 2020 12:49 AM
> To: Phil Yang <phil.y...@arm.com>; Jakub Grajciar -X (jgrajcia - PANTHEON
> TECH SRO at Cisco) <jgraj...@cisco.com>; dev@dpdk.org
> Cc: Ruifeng Wang <ruifeng.w...@arm.com>; nd <n...@arm.com>; Honnappa
> Nagarahalli <honnappa.nagaraha...@arm.com>; nd <n...@arm.com>
> Subject: RE: [PATCH] net/memif: relax barrier for zero copy path
> 
> Hi Jakub,
>       I am trying to review this patch. I am having difficulty in
> understanding the implementation for the queue/ring, appreciate if you
> could help me understand the logic.

'ring' refers to a ring buffer holding packet descriptors. These descriptors
hold metadata about the packet (packet buffer address, length, etc.).
A 'queue' represents a ring together with its packet buffers (plus some
metadata). In more detail, one ring (S2M) and the packet buffers allocated for
it are represented as a 'tx queue' on the slave and an 'rx queue' on the master.

> 
> 1) The S2M queues - are used to send packets from slave to master. My
> understanding is that, the slave thread would call 'eth_memif_tx_zc' and the
> master thread would call 'eth_memif_rx_zc'. Is this correct?
> 2) The M2S queues - are used to send packets from master to slave. Here the
> slave thread would call 'eth_memif_rx_zc' and the master thread would call
> 'eth_memif_tx_zc'. Is this correct?

This is indeed correct.

> 
> Thank you,
> Honnappa
> 
> > -----Original Message-----
> > From: Phil Yang <phil.y...@arm.com>
> > Sent: Friday, September 11, 2020 12:38 AM
> > To: jgraj...@cisco.com; dev@dpdk.org
> > Cc: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Ruifeng Wang
> > <ruifeng.w...@arm.com>; nd <n...@arm.com>
> > Subject: [PATCH] net/memif: relax barrier for zero copy path
> >
> > Using 'rte_mb' to synchronize the shared ring head/tail between
> > producer and consumer stalls the pipeline and hurts performance
> > on weak memory model platforms such as aarch64.
> >
> > Relaxing the expensive barrier to C11 atomics with explicit memory
> > ordering improves throughput by 3.6%.

My question here is: `rte_mb` is supposed to make sure that the head/tail
pointers are not updated before the packets are written into shared memory. Do
the atomics ensure that the packets are written into shared memory before the
head/tail pointers are updated?
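
To make the question concrete, here is a minimal sketch of the ordering I would
expect, assuming plain C11 atomics from <stdatomic.h>. The names (demo_ring,
demo_produce, demo_consume) are hypothetical; this is not the memif code or the
patch itself, and it omits the ring-full check. As far as I understand C11, a
store-release on head paired with a load-acquire on the consumer side gives the
guarantee asked about above: a consumer that observes the new head also
observes the slot contents written before it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical; must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

struct demo_ring {
	_Atomic uint16_t head;    /* written by producer, read by consumer */
	uint32_t slot[RING_SIZE]; /* stand-in for descriptors/packet data */
};

/* Producer: write the payload first, then publish it via head. */
static void
demo_produce(struct demo_ring *r, uint32_t pkt)
{
	/* Only the producer writes head, so a relaxed load is enough here. */
	uint16_t head = atomic_load_explicit(&r->head, memory_order_relaxed);

	assert((RING_SIZE & RING_MASK) == 0);
	r->slot[head & RING_MASK] = pkt; /* plain store of the payload */

	/* Release: the payload store above cannot be reordered after this. */
	atomic_store_explicit(&r->head, (uint16_t)(head + 1),
			      memory_order_release);
}

/* Consumer: returns 1 and fills *pkt if an entry is available, else 0. */
static int
demo_consume(struct demo_ring *r, uint16_t *tail, uint32_t *pkt)
{
	/* Acquire: pairs with the release store in demo_produce(). */
	uint16_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (*tail == head)
		return 0; /* ring is empty */

	*pkt = r->slot[*tail & RING_MASK];
	(*tail)++;
	return 1;
}
```

If the zero copy path follows this pattern for both head and tail, then the
ordering `rte_mb` provided is preserved without the full barrier.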
