On 2020/5/8 21:01, Andrew Lunn wrote:
> On Fri, May 08, 2020 at 01:08:13PM +0530, Sunil Kovvuri wrote:
>> On Fri, May 8, 2020 at 11:00 AM Kevin Hao <haoke...@gmail.com> wrote:
>>>
>>> On Fri, May 08, 2020 at 10:18:27AM +0530, Sunil Kovvuri wrote:
>>>> On Fri, May 8, 2020 at 9:43 AM Kevin Hao <haoke...@gmail.com> wrote:
>>>>>
>>>>> In the current code, octeontx2 uses its own method to allocate
>>>>> the pool buffers, but there are some issues with this implementation.
>>>>> 1. We have to run otx2_get_page() for each allocation cycle, and
>>>>>    this is pretty error prone. As far as I can see, there is no
>>>>>    invocation of otx2_get_page() in otx2_pool_refill_task(); this
>>>>>    leaves the allocated pages with the wrong refcount, so they may
>>>>>    be freed wrongly.
>>>>
>>>> Thanks for pointing this out, will fix.
>>>>
>>>>> 2. It wastes memory. For example, if we only receive one packet in a
>>>>>    NAPI RX cycle and then allocate a 2K buffer with otx2_alloc_rbuf()
>>>>>    to refill the pool buffers, the remaining area of the allocated
>>>>>    page is wasted. On a kernel with 64K pages, 62K is wasted.
>>>>>
>>>>> IMHO it is really unnecessary to implement our own method for
>>>>> buffer allocation; we can reuse napi_alloc_frag() to simplify
>>>>> our code.
>>>>>
>>>>> Signed-off-by: Kevin Hao <haoke...@gmail.com>
>>>>
>>>> Have you measured performance with and without your patch ?
>>>
>>> I will do a performance comparison later. But I don't think there
>>> will be a measurable difference.
>>>
>>>> I didn't use napi_alloc_frag() as it's too costly, if in one NAPI
>>>> instance driver receives 32 pkts, then 32 calls to napi_alloc_frag()
>>>> and updates to page ref count per fragment etc are costly.
>>>
>>> No, the page ref is only updated at page allocation and when all the
>>> space is used. In general, an invocation of napi_alloc_frag() will not
>>> cause an update of the page ref. So in theory, the number of page ref
>>> updates should be reduced by using napi_alloc_frag() compared to the
>>> current otx2 implementation.
>>>
>>
>> Okay, it seems I misunderstood it.
> 
> Hi Sunil
> 
> In general, you should not work around issues in the core, you should
> improve the core. If your implementation really was more efficient
> than the core code, it would have been better if you proposed fixes to
> the core, rather than hiding away better code in your own driver.

Hi, Andrew

Looking at the napi_alloc_frag() API, the mapping/unmapping is left to
the caller. If the mapping/unmapping were managed in the core instead, it
could be skipped when a page is reused, which matters because
mapping/unmapping is costly when the IOMMU is on. Do you think it makes
sense to do the mapping/unmapping in page_frag_*()?
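
To make the idea concrete, here is a rough userspace sketch. It is not
kernel code and not a proposed patch. Every name in it (sim_frag_cache,
sim_dma_map(), sim_frag_alloc() and so on) is hypothetical, and the
"mapping" is only a counter standing in for the real IOMMU work. It also
ignores page refcounting, the DMA direction and cache syncing. It only
shows that if the fragment cache owns the mapping, the page is mapped
once and each fragment's address is derived from the page's address plus
its offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SIM_PAGE_SIZE 65536u	/* 64K page, as in the example quoted above */

static unsigned int sim_map_calls;	/* simulated IOMMU map operations */
static unsigned int sim_unmap_calls;	/* simulated IOMMU unmap operations */

/* Hypothetical stand-in for a DMA map call: counts and fakes an address. */
static uint64_t sim_dma_map(void *page)
{
	sim_map_calls++;
	return (uint64_t)(uintptr_t)page;
}

/* Hypothetical stand-in for a DMA unmap call: only counts. */
static void sim_dma_unmap(uint64_t dma)
{
	(void)dma;
	sim_unmap_calls++;
}

/* Fragment cache that owns both the page and its single mapping. */
struct sim_frag_cache {
	void *page;
	uint64_t dma;		/* mapping of the whole page */
	size_t offset;		/* next free byte in the page */
};

/* Unmap and free the current page, if any. */
static void sim_frag_cache_drain(struct sim_frag_cache *c)
{
	if (!c->page)
		return;
	sim_dma_unmap(c->dma);
	free(c->page);
	c->page = NULL;
	c->offset = 0;
}

/*
 * Hand out one fragment. A new page is allocated and mapped only when the
 * current one is exhausted; otherwise the existing mapping is reused.
 * In this sketch the old page is released immediately, so fragments must
 * not be used after their page has been replaced.
 */
static int sim_frag_alloc(struct sim_frag_cache *c, size_t size,
			  void **va, uint64_t *dma)
{
	if (size == 0 || size > SIM_PAGE_SIZE)
		return -1;
	if (!c->page || c->offset + size > SIM_PAGE_SIZE) {
		sim_frag_cache_drain(c);
		c->page = malloc(SIM_PAGE_SIZE);
		if (!c->page)
			return -1;
		c->dma = sim_dma_map(c->page);
		c->offset = 0;
	}
	*va = (char *)c->page + c->offset;
	*dma = c->dma + c->offset;	/* no per-fragment map call */
	c->offset += size;
	return 0;
}

/* Allocate nfrags fragments of the given size; return the map call count. */
static unsigned int sim_count_maps(unsigned int nfrags, size_t size)
{
	struct sim_frag_cache cache = { 0 };
	void *va;
	uint64_t dma;
	unsigned int i;

	sim_map_calls = 0;
	sim_unmap_calls = 0;
	for (i = 0; i < nfrags; i++) {
		if (sim_frag_alloc(&cache, size, &va, &dma))
			break;
		assert(dma == (uint64_t)(uintptr_t)va);
	}
	sim_frag_cache_drain(&cache);
	assert(sim_map_calls == sim_unmap_calls);
	return sim_map_calls;
}
```

With 2K fragments on a 64K page, sim_count_maps(64, 2048) performs two
map operations in this sketch, where mapping in the caller per fragment
would perform 64.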

> 
>       Andrew
> .
> 
