On Tue, Dec 12, 2023 at 04:16:20PM +0100, Morten Brørup wrote:
> +TO: Bruce, please stop me if I'm completely off track here.
>
> > From: Ferruh Yigit [mailto:ferruh.yi...@amd.com] Sent: Tuesday, 12
> > December 2023 15.38
> >
> > On 12/12/2023 11:40 AM, Morten Brørup wrote:
> > >> From: Vipin Varghese [mailto:vipin.vargh...@amd.com] Sent: Tuesday,
> > >> 12 December 2023 11.38
> > >>
> > >> Replace pktmbuf pool with mempool, this allows increase in MOPS
> > >> especially in lower buffer size. Using Mempool, allows to reduce the
> > >> extra CPU cycles.
> > >
> > > I get the point of this change: It tests the performance of copying
> > raw memory objects using respectively rte_memcpy and DMA, without the
> > mbuf indirection overhead.
> > >
> > > However, I still consider the existing test relevant: The performance
> > of copying packets using respectively rte_memcpy and DMA.
> > >
> >
> > This is DMA performance test application and packets are not used,
> > using pktmbuf just introduces overhead to the main focus of the
> > application.
> >
> > I am not sure if pktmuf selected intentionally for this test
> > application, but I assume it is there because of historical reasons.
>
> I think pktmbuf was selected intentionally, to provide more accurate
> results for application developers trying to determine when to use
> rte_memcpy and when to use DMA. Much like the "copy breakpoint" in Linux
> Ethernet drivers is used to determine which code path to take for each
> received packet.
>
> Most applications will be working with pktmbufs, so these applications
> will also experience the pktmbuf overhead. Performance testing with the
> same overhead as the application will be better to help the application
> developer determine when to use rte_memcpy and when to use DMA when
> working with pktmbufs.
>
> (Furthermore, for the pktmbuf tests, I wonder if copying performance
> could also depend on IOVA mode and RTE_IOVA_IN_MBUF.)
>
> Nonetheless, there may also be use cases where raw mempool objects are
> being copied by rte_memcpy or DMA, so adding tests for these use cases
> are useful.
>
>
> @Bruce, you were also deeply involved in the DMA library, and probably
> have more up-to-date practical experience with it. Am I right that
> pktmbuf overhead in these tests provides more "real life use"-like
> results? Or am I completely off track with my thinking here, i.e. the
> pktmbuf overhead is only noise?
>
I'm actually not that familiar with the dma-test application, so can't
comment on the specific overhead involved here. In the general case, if
we are just talking about the overhead of dereferencing the mbufs then
I would expect the overhead to be negligible. However, if we are looking
to include the cost of allocation and freeing of buffers, I'd try to
avoid that as it is a cost that would have to be paid for both SW copies
and HW copies, so should not count when calculating offload cost.
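
To make that concrete, a rough sketch of what I mean (placeholder names,
not the actual dma-perf code): the buffers are taken from the mempool
before the measurement starts, so only the copy itself is timed.

#include <rte_cycles.h>
#include <rte_memcpy.h>

/*
 * Sketch only: time nb_copies calls to rte_memcpy() over buffers that
 * were obtained from the mempool beforehand (e.g. via
 * rte_mempool_get_bulk()). Allocation and freeing stay outside the
 * timed region, so their cost is not charged to either the SW or the
 * HW copy path.
 */
static uint64_t
time_sw_copies(void **src, void **dst, uint32_t copy_len,
		unsigned int nb_copies)
{
	uint64_t start, end;
	unsigned int i;

	start = rte_rdtsc_precise();
	for (i = 0; i < nb_copies; i++)
		rte_memcpy(dst[i], src[i], copy_len);
	end = rte_rdtsc_precise();

	return end - start;	/* cycles spent on the copies only */
}

The HW path would then time the equivalent rte_dma_copy() /
rte_dma_completed() sequence over the same pre-allocated buffers, so the
two results stay directly comparable.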
/Bruce
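
For reference, a minimal sketch of the two allocation paths discussed in
the thread; the pool names and sizes below are illustrative placeholders,
not values taken from the dma-perf application.

#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

static void
pool_paths_example(void)
{
	/* Raw mempool path (what the patch proposes): the object pointer
	 * itself is the copy source/destination. */
	struct rte_mempool *raw_pool = rte_mempool_create("raw_pool",
			8192, 2048, 256, 0, NULL, NULL, NULL, NULL,
			rte_socket_id(), 0);
	void *obj = NULL;

	if (raw_pool != NULL && rte_mempool_get(raw_pool, &obj) == 0) {
		/* ... copy to/from obj here ... */
		rte_mempool_put(raw_pool, obj);
	}

	/* Existing pktmbuf path: the copy address is reached through the
	 * mbuf header, which is the indirection being discussed. */
	struct rte_mempool *mbuf_pool = rte_pktmbuf_pool_create("mbuf_pool",
			8192, 256, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
			rte_socket_id());
	struct rte_mbuf *m = (mbuf_pool != NULL) ?
			rte_pktmbuf_alloc(mbuf_pool) : NULL;

	if (m != NULL) {
		void *data = rte_pktmbuf_mtod(m, void *);
		/* ... copy to/from data here ... */
		(void)data;
		rte_pktmbuf_free(m);
	}
}

The raw-object path avoids the mbuf allocation/reset and the
rte_pktmbuf_mtod() dereference, which is the overhead Ferruh and Vipin
refer to; the pktmbuf path matches how most DPDK applications handle
packet data, which is Morten's argument for keeping the existing test.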