On 1/12/2024 1:19 AM, Chaoyong He wrote:
>> On 1/11/2024 2:02 AM, Chaoyong He wrote:
>>>> On 1/9/2024 7:56 AM, Chaoyong He wrote:
>>>>>> On 12/18/2023 1:50 AM, Chaoyong He wrote:
>>>>>>>> On 12/14/2023 10:24 AM, Chaoyong He wrote:
>>>>>>>>> From: Long Wu <long...@corigine.com>
>>>>>>>>>
>>>>>>>>> Set the representor array to NULL to avoid that close interface
>>>>>>>>> does not free some resource.
>>>>>>>>>
>>>>>>>>> Fixes: a135bc1644d6 ("net/nfp: fix resource leak for flower
>>>>>>>>> firmware")
>>>>>>>>> Cc: chaoyong...@corigine.com
>>>>>>>>> Cc: sta...@dpdk.org
>>>>>>>>>
>>>>>>>>> Signed-off-by: Long Wu <long...@corigine.com>
>>>>>>>>> Reviewed-by: Chaoyong He <chaoyong...@corigine.com>
>>>>>>>>> Reviewed-by: Peng Zhang <peng.zh...@corigine.com>
>>>>>>>>> ---
>>>>>>>>>  drivers/net/nfp/flower/nfp_flower_representor.c | 15
>>>>>>>>> ++++++++++++++-
>>>>>>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>>>> b/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>>>> index 27ea3891bd..5f7c1fa737 100644
>>>>>>>>> --- a/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>>>> +++ b/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>>>> @@ -294,17 +294,30 @@ nfp_flower_repr_tx_burst(void
>> *tx_queue,
>>>>>>>>> static int  nfp_flower_repr_uninit(struct rte_eth_dev *eth_dev)
>>>>>>>>> {
>>>>>>>>> +     uint16_t index;
>>>>>>>>>       struct nfp_flower_representor *repr;
>>>>>>>>>
>>>>>>>>>       repr = eth_dev->data->dev_private;
>>>>>>>>>       rte_ring_free(repr->ring);
>>>>>>>>>
>>>>>>>>> +     if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) {
>>>>>>>>> +             index =
>> NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(repr-
>>>>>>>>> port_id);
>>>>>>>>> +             repr->app_fw_flower->phy_reprs[index] = NULL;
>>>>>>>>> +     } else {
>>>>>>>>> +             index = repr->vf_id;
>>>>>>>>> +             repr->app_fw_flower->vf_reprs[index] = NULL;
>>>>>>>>> +     }
>>>>>>>>> +
>>>>>>>>>       return 0;
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  static int
>>>>>>>>> -nfp_flower_pf_repr_uninit(__rte_unused struct rte_eth_dev
>>>>>>>>> *eth_dev)
>>>>>>>>> +nfp_flower_pf_repr_uninit(struct rte_eth_dev *eth_dev)
>>>>>>>>>  {
>>>>>>>>> +     struct nfp_flower_representor *repr =
>>>>>>>>> +eth_dev->data->dev_private;
>>>>>>>>> +
>>>>>>>>> +     repr->app_fw_flower->pf_repr = NULL;
>>>>>>>>>
>>>>>>>>
>>>>>>>> Here it is assigned to NULL but is it freed? If freed, why not
>>>>>>>> set to NULL where it is freed?
>>>>>>>>
>>>>>>>> Same for above phy_reprs & vf_reprs.
>>>>>>>
>>>>>>> The whole invoke view:
>>>>>>> rte_eth_dev_close()
>>>>>>>     --> nfp_flower_repr_dev_close()
>>>>>>>         --> nfp_flower_repr_free()
>>>>>>>             --> nfp_flower_pf_repr_uninit()
>>>>>>>             --> nfp_flower_repr_uninit()
>>>>>>>            // In these two functions, we just assigned to NULL but
>>>>>>> not freed
>>>> yet.
>>>>>>>            // It is still refer by the `eth_dev->data->dev_private`.
>>>>>>>     --> rte_eth_dev_release_port()
>>>>>>>         --> rte_free(eth_dev->data->dev_private);
>>>>>>>         // And here it is really freed (by the rte framework).
>>>>>>>
>>>>>>
>>>>>> 'rte_eth_dev_release_port()' frees the device private data, but not
>>>>>> all pointers, like 'repr->app_fw_flower->pf_repr', in the struct
>>>>>> are freed, it is dev_close() or
>>>>>> unint() functions responsibility.
>>>>>>
>>>>>> Can you please double check if
>>>>>> 'eth_dev->data->dev_private->app_fw_flower->pf_repr' freed or not?
>>>>>
>>>>> (gdb) b nfp_flower_repr_dev_close
>>>>> Breakpoint 1 at 0x7f839a4ad37f:
>>>> file ../drivers/net/nfp/flower/nfp_flower_representor.c, line 356.
>>>>> (gdb) c
>>>>> Continuing.
>>>>>
>>>>> Thread 1 "dpdk-testpmd" hit Breakpoint 1, nfp_flower_repr_dev_close
>>>> (dev=0x7f839aed2340 <rte_eth_devices>)
>>>>>     at ../drivers/net/nfp/flower/nfp_flower_representor.c:356
>>>>> 356             if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>>>>> (gdb) n
>>>>> 359             repr = dev->data->dev_private;
>>>>> (gdb)
>>>>> 360             app_fw_flower = repr->app_fw_flower;
>>>>> (gdb)
>>>>> 361             hw = app_fw_flower->pf_hw;
>>>>> (gdb)
>>>>> 362             pf_dev = hw->pf_dev;
>>>>> (gdb)
>>>>> 368             nfp_net_disable_queues(dev);
>>>>> (gdb) p repr
>>>>> $1 = (struct nfp_flower_representor *) 0x17c49c800
>>>>> (gdb) p dev->data->dev_private
>>>>> $2 = (void *) 0x17c49c800
>>>>> (gdb) p repr->app_fw_flower->pf_repr
>>>>> $3 = (struct nfp_flower_representor *) 0x17c49c800
>>>>>
>>>>> As we can see, these three pointers point the same block of memory.
>>>>>
>>>>
>>>> Ahh, I missed that 'repr->app_fw_flower->pf_repr' points to
>>>> 'dev_private', so your code makes sense.
>>>>
>>>> But if it is 'dev_private', why free it in 'nfp_pf_uninit()' as it
>>>> will be freed by 'rte_eth_dev_release_port()'?
>>>
>>> Sorry, I'm not understanding this.
>>> The 'dev_private' is a 'struct nfp_flower_representor *', and it will be 
>>> freed in
>> 'rte_eth_dev_release_port()'.
>>> What I freed in 'nfp_pf_uninit()' is a 'struct nfp_pf_dev *', so I'm not 
>>> catch
>> your point about this.
>>>
>>>> Won't removing 'rte_free(pf_dev);' from 'nfp_pf_uninit()' will have
>>>> the same effect, instead of setting it NULL in advance?
>>>>
>>>
>>> If I remove the 'rte_free(pf_dev);' from 'nfp_pf_uninit()', there will be a
>> memory leak as no one will free it, and actually I'm not 'setting it NULL in
>> advance'.
>>>
>>> 359             repr = dev->data->dev_private;
>>> 360             app_fw_flower = repr->app_fw_flower;
>>> 361             hw = app_fw_flower->pf_hw;
>>> 362             pf_dev = hw->pf_dev;
>>>
>>> Maybe you just confuse the 'pf_repr' and 'pf_dev'? Just a guess.
>>>
>>
>> Yes I did confuse those two, sorry about that.
>>
>> 'repr->app_fw_flower->pf_repr' is 'dev_private', and I assumed you are 
>> setting
>> it NULL to escape from double free (and was checking where that double free
>> happens), but I guess that is not the case.
>>
>> 'rte_eth_dev_destroy()' calls 'rte_eth_dev_release_port()' and frees
>> 'dev_private' but 'repr->app_fw_flower->pf_repr' remains as dangling pointer
>> and perhaps prevents 'nfp_flower_repr_dev_close()' move forward (because
>> of "if (app_fw_flower->pf_repr != NULL)" check), and you are fixing it, is 
>> it the
>> case?
> 
> Correct, that's what we want to do by this patch and where the problem is, 
> your description is very clear and brief.
> 

Got it, I will proceed with the set.
More details in the commit log helps reviewing the patches.

Reply via email to