On 2020/5/8 21:01, Andrew Lunn wrote: > On Fri, May 08, 2020 at 01:08:13PM +0530, Sunil Kovvuri wrote: >> On Fri, May 8, 2020 at 11:00 AM Kevin Hao <haoke...@gmail.com> wrote: >>> >>> On Fri, May 08, 2020 at 10:18:27AM +0530, Sunil Kovvuri wrote: >>>> On Fri, May 8, 2020 at 9:43 AM Kevin Hao <haoke...@gmail.com> wrote: >>>>> >>>>> In the current codes, the octeontx2 uses its own method to allocate >>>>> the pool buffers, but there are some issues in this implementation. >>>>> 1. We have to run the otx2_get_page() for each allocation cycle and >>>>> this is pretty error prone. As I can see there is no invocation >>>>> of the otx2_get_page() in otx2_pool_refill_task(), this will leave >>>>> the allocated pages have the wrong refcount and may be freed wrongly. >>>> >>>> Thanks for pointing, will fix. >>>> >>>>> 2. It wastes memory. For example, if we only receive one packet in a >>>>> NAPI RX cycle, and then allocate a 2K buffer with otx2_alloc_rbuf() >>>>> to refill the pool buffers and leave the remain area of the allocated >>>>> page wasted. On a kernel with 64K page, 62K area is wasted. >>>>> >>>>> IMHO it is really unnecessary to implement our own method for the >>>>> buffers allocate, we can reuse the napi_alloc_frag() to simplify >>>>> our code. >>>>> >>>>> Signed-off-by: Kevin Hao <haoke...@gmail.com> >>>> >>>> Have you measured performance with and without your patch ? >>> >>> I will do performance compare later. But I don't think there will be >>> measurable >>> difference. >>> >>>> I didn't use napi_alloc_frag() as it's too costly, if in one NAPI >>>> instance driver >>>> receives 32 pkts, then 32 calls to napi_alloc_frag() and updates to page >>>> ref >>>> count per fragment etc are costly. >>> >>> No, the page ref only be updated at the page allocation and all the space >>> are >>> used. In general, the invocation of napi_alloc_frag() will not cause the >>> update >>> of the page ref. So in theory, the count of updating page ref should be >>> reduced >>> by using of napi_alloc_frag() compare to the current otx2 implementation. >>> >> >> Okay, it seems i misunderstood it. > > Hi Sunil > > In general, you should not work around issues in the core, you should > improve the core. If your implementation really was more efficient > than the core code, it would of been better if you proposed fixes to > the core, not hide away better code in your own driver.
Hi, Andrew When looking the napi_alloc_frag() api, the mapping/unmapping is done by caller, if the mapping/unmapping is managed in the core, then the mapping/unmapping can be avoided when the page is reused, because the mapping/unmapping operation is costly when IOMMU is on, do you think it makes sense to do the mapping/ummapping in the page_frag_*()? > > Andrew > . >