> > On 3/6/2023 1:26 PM, Morten Brørup wrote: > > >> From: Ferruh Yigit [mailto:ferruh.yi...@amd.com] > > >> Sent: Monday, 6 March 2023 13.49 > > >> > > >> On 1/4/2023 8:21 AM, Morten Brørup wrote: > > >>>> From: Feifei Wang [mailto:feifei.wa...@arm.com] > > >>>> Sent: Wednesday, 4 January 2023 08.31 > > >>>> > > >>>> Add 'tx_fill_sw_ring' and 'rx_flush_descriptor' API into direct > > >>>> rearm mode for separate Rx and Tx Operation. And this can support > > >>>> different multiple sources in direct rearm mode. For examples, Rx > > >>>> driver is ixgbe, and Tx driver is i40e. > > >>>> > > >>>> Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > >>>> Suggested-by: Ruifeng Wang <ruifeng.w...@arm.com> > > >>>> Signed-off-by: Feifei Wang <feifei.wa...@arm.com> > > >>>> Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com> > > >>>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > >>>> --- > > >>> > > >>> This feature looks very promising for performance. I am pleased to > > >>> see > > >> progress on it. > > >>> > > >> > > >> Hi Morten, > > >> > > >> Yes it brings some performance, but not to generic use case, only to > > >> specific and constraint use case. > > > > > > I got the impression that the supported use case is a prominent and > > > important > > use case. > > > > > > > Can you please give real life samples for this use case, other than just > > showing > > better performance number in the test bench? This helps to understand the > > reasoning better. > The very first patch started off with a constrained but prominent use case. > Though, DPU based PCIe cards running DPDK applications > with 1 or max 2 ports being used in tons of data centers is not a secret > anymore and not a small use case that can be ignored. > However, the design of the patch has changed significantly from then. Now the > solution can be applied to any generic use case that > uses run-to-completion model of DPDK. i.e. the mapping of the RX and TX ports > can be done dynamically in the data plane threads. > There is no need of static configuration from control plane.
+1 to this statement. I think the authors did a good job to make it generic enough, so it can be used for many different cases, plus AFAIU it doesn't introduce new implicit restrictions to the user. Again, this feature is totally optional, so users are free to ignore it. Personally I do not see any good reason why we shouldn't accept this feature into DPDK. Off-course with more code reviews, extra testing, docs updates, etc.. > > On the test bench, we need to make up our mind. When we see improvements, we > say it is just a test bench. On other occasions > when the test bench does not show any improvements (but improvements are > shown by other metrics), we say the test bench does > not show any improvements. > > > > > > This is the primary argument for considering such a complex non-generic > > feature. > I am not sure what is the complexity here, can you please elaborate? > I see other patches/designs (ex: proactive error recovery) which are way more > complex to understand and comprehend. > > > > > > >> > > >> And changes are relatively invasive comparing the usecase it > > >> supports, like it adds new two inline datapath functions and a new > > >> dev_ops. > > >> > > >> I am worried the unnecessary complexity and possible regressions in > > >> the fundamental and simple parts of the project, with a good > > >> intention to gain a few percentage performance in a specific usecase, > > >> can hurt the project. > I agree that we are touching some fundamental parts of the project. But, we > also need to realize that those fundamental parts were > not developed on architectures that have joined the project way later. > Similarly, the use cases have evolved significantly from the > original intended use cases. We cannot hold on to those fundamental designs > if they affect the performance on other architectures > while addressing prominent new use cases. > Please note that this patch does not break any existing features or affect > their performance in any negative way. The generic and > originally intended use cases can benefit from this feature. > > > >> > > >> > > >> I can see this is compared to MBUF_FAST_FREE feature, but > > >> MBUF_FAST_FREE is just an offload benefiting from existing offload > > >> infrastructure, which requires very small update and logically change > > >> in application and simple to implement in the drivers. So, they are > > >> not same from complexity perspective. > > >> > > >> Briefly, I am not comfortable with this change, I would like to see > > >> an explicit approval and code review from techboard to proceed. > > > > > > I agree that the complexity is very high, and thus requires extra > > > consideration. > > Your suggested techboard review and approval process seems like a good > > solution. > We can add to the agenda for the next Techboard meeting. > > > > > > > And the performance benefit of direct rearm should be compared to the > > performance using the new zero-copy mempool API. > > > > > > -Morten > > >