Thanks Morten, I appreciate your comments. A few responses inline.

> -----Original Message-----
> From: Morten Brørup <m...@smartsharesystems.com>
> Sent: Sunday, December 26, 2021 4:25 AM
> To: Feifei Wang <feifei.wa...@arm.com>
> Cc: dev@dpdk.org; nd <n...@arm.com>
> Subject: RE: [RFC PATCH v1 0/4] Direct re-arming of buffers on receive side
> 
> > From: Feifei Wang [mailto:feifei.wa...@arm.com]
> > Sent: Friday, 24 December 2021 17.46
> >
<snip>

> >
> > However, this solution poses several constraints:
> >
> > 1) The receive queue needs to know which transmit queue it should take
> > the buffers from. The application logic decides which transmit port to
> > use to send out the packets. In many use cases the NIC might have a
> > single port ([1], [2], [3]), in which case a given transmit queue is
> > always mapped to a single receive queue (1:1 Rx queue: Tx queue). This
> > is easy to configure.
> >
> > If the NIC has 2 ports (there are several references), then we will
> > have a 1:2 (RX queue: TX queue) mapping, which is still easy to configure.
> > However, if this is generalized to 'N' ports, the configuration can be
> > long. Moreover, the PMD would have to scan a list of transmit queues
> > to pull the buffers from.
> 
> I disagree with the description of this constraint.
> 
> As I understand it, it doesn't matter how many ports or queues are in a NIC or
> system.
> 
> The constraint is more narrow:
> 
> This patch requires that all packets ingressing on some port/queue must
> egress on the specific port/queue that it has been configured to rearm its
> buffers from. I.e. an application cannot route packets between multiple ports
> with this patch.
Agreed, the patch as it stands has this constraint. It does not affect NICs
with a single port. The text above describes some of the issues with
generalizing the solution to N ports. If N is small, the configuration is
small and the scanning should not be expensive.
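
To make the scanning cost concrete, here is a rough sketch of what a per-Rx-queue
mapping and scan could look like. Every name in it (sketch_txq, sketch_rxq_map,
sketch_pick_txq, MAX_TXQ_PER_RXQ) is hypothetical and chosen for illustration;
none of them come from the patch or from the ethdev API.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TXQ_PER_RXQ 4   /* hypothetical limit, not from the patch */

/* Hypothetical stand-in for a Tx queue: it only tracks how many transmitted
 * buffers are ready to be reused. */
struct sketch_txq {
    uint16_t port_id;
    uint16_t queue_id;
    uint16_t nb_done;   /* completed buffers available for re-arming */
};

/* Hypothetical per-Rx-queue mapping: the Tx queues this Rx queue may pull
 * buffers from. */
struct sketch_rxq_map {
    struct sketch_txq *txq[MAX_TXQ_PER_RXQ];
    uint16_t nb_txq;    /* 1 for a single-port NIC, 2 for dual-port, ... */
};

/* Scan the mapped Tx queues and return the first one holding at least
 * 'needed' completed buffers, or NULL if none qualifies.
 * The cost is linear in nb_txq. */
static struct sketch_txq *
sketch_pick_txq(const struct sketch_rxq_map *map, uint16_t needed)
{
    for (uint16_t i = 0; i < map->nb_txq; i++) {
        struct sketch_txq *txq = map->txq[i];

        if (txq != NULL && txq->nb_done >= needed)
            return txq;
    }
    return NULL;
}
```

With nb_txq equal to 1 the loop degenerates to a single check, which is the
single-port case; the scan only grows with the number of Tx queues mapped to
the Rx queue.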

> 
> >

<snip>

> >
> 
> You are missing the fourth constraint:
> 
> 4) The application must transmit all received packets immediately, i.e. QoS
> queueing and similar is prohibited.
I do not understand this, can you please elaborate? Even with QoS queuing,
there would be a steady stream of packets being transmitted. The buffers of
these transmitted packets would refill the RX side.

> 
<snip>

> >
> 
> The patch provides a significant performance improvement, but I am
> wondering if any real world applications exist that would use this. Only a
> "router on a stick" (i.e. a single-port router) comes to my mind, and that is
> probably sufficient to call it useful in the real world. Do you have any other
> examples to support the usefulness of this patch?
SmartNICs are a clear and dominant use case; they typically have a single port
for data plane traffic (dual ports are mostly for redundancy).
This patch avoids a good number of store operations. The smaller CPUs found in
SmartNICs have smaller store buffers, which can become bottlenecks. Avoiding the
lcore cache also saves valuable HW cache space.
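
As a toy illustration of the store savings, the sketch below counts only the
buffer-pointer stores in the two paths: going through the lcore cache versus
copying straight from the Tx ring to the Rx ring. It is a simplified model
under my own assumptions, not the driver code; the names (sketch_copy,
sketch_rearm_via_cache, sketch_rearm_direct, RING_SZ) are hypothetical, and
descriptor writes and mempool bookkeeping are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SZ 32  /* hypothetical burst size for this model */

static size_t nb_stores;  /* pointer stores performed by the copies */

/* Copy 'n' buffer pointers and count each store. */
static void
sketch_copy(void **dst, void *const *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
        nb_stores++;
    }
}

/* Conventional path in this model: Tx ring -> lcore cache -> Rx ring. */
static size_t
sketch_rearm_via_cache(void **rx_ring, void *const *tx_ring, size_t n)
{
    void *lcore_cache[RING_SZ];

    if (n > RING_SZ)
        n = RING_SZ;
    nb_stores = 0;
    sketch_copy(lcore_cache, tx_ring, n);  /* Tx free into the cache */
    sketch_copy(rx_ring, lcore_cache, n);  /* Rx re-arm from the cache */
    return nb_stores;
}

/* Direct path in this model: Tx ring -> Rx ring, no intermediate copy. */
static size_t
sketch_rearm_direct(void **rx_ring, void *const *tx_ring, size_t n)
{
    if (n > RING_SZ)
        n = RING_SZ;
    nb_stores = 0;
    sketch_copy(rx_ring, tx_ring, n);
    return nb_stores;
}
```

In this model the direct path performs half the pointer stores of the cached
path for the same burst, and the lcore_cache array is never touched, so it
does not occupy HW cache lines.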

> 
> Anyway, the patch doesn't do any harm if unused, and the only performance
> cost is the "if (rxq->direct_rxrearm_enable)" branch in the Ethdev driver. So
> I don't oppose it.
> 
