On 1/12/2024 1:19 AM, Chaoyong He wrote:
>> On 1/11/2024 2:02 AM, Chaoyong He wrote:
>>>> On 1/9/2024 7:56 AM, Chaoyong He wrote:
>>>>>> On 12/18/2023 1:50 AM, Chaoyong He wrote:
>>>>>>>> On 12/14/2023 10:24 AM, Chaoyong He wrote:
>>>>>>>>> From: Long Wu <long...@corigine.com>
>>>>>>>>>
>>>>>>>>> Set the representor array to NULL to avoid that close interface
>>>>>>>>> does not free some resource.
>>>>>>>>>
>>>>>>>>> Fixes: a135bc1644d6 ("net/nfp: fix resource leak for flower firmware")
>>>>>>>>> Cc: chaoyong...@corigine.com
>>>>>>>>> Cc: sta...@dpdk.org
>>>>>>>>>
>>>>>>>>> Signed-off-by: Long Wu <long...@corigine.com>
>>>>>>>>> Reviewed-by: Chaoyong He <chaoyong...@corigine.com>
>>>>>>>>> Reviewed-by: Peng Zhang <peng.zh...@corigine.com>
>>>>>>>>> ---
>>>>>>>>>  drivers/net/nfp/flower/nfp_flower_representor.c | 15 ++++++++++++++-
>>>>>>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c b/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>>>> index 27ea3891bd..5f7c1fa737 100644
>>>>>>>>> --- a/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>>>> +++ b/drivers/net/nfp/flower/nfp_flower_representor.c
>>>>>>>>> @@ -294,17 +294,30 @@ nfp_flower_repr_tx_burst(void *tx_queue,
>>>>>>>>>  static int
>>>>>>>>>  nfp_flower_repr_uninit(struct rte_eth_dev *eth_dev)
>>>>>>>>>  {
>>>>>>>>> +	uint16_t index;
>>>>>>>>>  	struct nfp_flower_representor *repr;
>>>>>>>>>
>>>>>>>>>  	repr = eth_dev->data->dev_private;
>>>>>>>>>  	rte_ring_free(repr->ring);
>>>>>>>>>
>>>>>>>>> +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) {
>>>>>>>>> +		index = NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(repr->port_id);
>>>>>>>>> +		repr->app_fw_flower->phy_reprs[index] = NULL;
>>>>>>>>> +	} else {
>>>>>>>>> +		index = repr->vf_id;
>>>>>>>>> +		repr->app_fw_flower->vf_reprs[index] = NULL;
>>>>>>>>> +	}
>>>>>>>>> +
>>>>>>>>>  	return 0;
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  static int
>>>>>>>>> -nfp_flower_pf_repr_uninit(__rte_unused struct rte_eth_dev *eth_dev)
>>>>>>>>> +nfp_flower_pf_repr_uninit(struct rte_eth_dev *eth_dev)
>>>>>>>>>  {
>>>>>>>>> +	struct nfp_flower_representor *repr = eth_dev->data->dev_private;
>>>>>>>>> +
>>>>>>>>> +	repr->app_fw_flower->pf_repr = NULL;
>>>>>>>>>
>>>>>>>>
>>>>>>>> Here it is assigned to NULL but is it freed? If freed, why not
>>>>>>>> set to NULL where it is freed?
>>>>>>>>
>>>>>>>> Same for above phy_reprs & vf_reprs.
>>>>>>>
>>>>>>> The whole invoke view:
>>>>>>> rte_eth_dev_close()
>>>>>>> --> nfp_flower_repr_dev_close()
>>>>>>> --> nfp_flower_repr_free()
>>>>>>> --> nfp_flower_pf_repr_uninit()
>>>>>>> --> nfp_flower_repr_uninit()
>>>>>>> // In these two functions, we just assigned to NULL but not freed yet.
>>>>>>> // It is still refer by the `eth_dev->data->dev_private`.
>>>>>>> --> rte_eth_dev_release_port()
>>>>>>> --> rte_free(eth_dev->data->dev_private);
>>>>>>> // And here it is really freed (by the rte framework).
>>>>>>>
>>>>>>
>>>>>> 'rte_eth_dev_release_port()' frees the device private data, but not
>>>>>> all pointers, like 'repr->app_fw_flower->pf_repr', in the struct
>>>>>> are freed, it is dev_close() or
>>>>>> unint() functions responsibility.
>>>>>>
>>>>>> Can you please double check if
>>>>>> 'eth_dev->data->dev_private->app_fw_flower->pf_repr' freed or not?
>>>>>
>>>>> (gdb) b nfp_flower_repr_dev_close
>>>>> Breakpoint 1 at 0x7f839a4ad37f: file ../drivers/net/nfp/flower/nfp_flower_representor.c, line 356.
>>>>> (gdb) c
>>>>> Continuing.
>>>>>
>>>>> Thread 1 "dpdk-testpmd" hit Breakpoint 1, nfp_flower_repr_dev_close (dev=0x7f839aed2340 <rte_eth_devices>)
>>>>>     at ../drivers/net/nfp/flower/nfp_flower_representor.c:356
>>>>> 356	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>>>>> (gdb) n
>>>>> 359	repr = dev->data->dev_private;
>>>>> (gdb)
>>>>> 360	app_fw_flower = repr->app_fw_flower;
>>>>> (gdb)
>>>>> 361	hw = app_fw_flower->pf_hw;
>>>>> (gdb)
>>>>> 362	pf_dev = hw->pf_dev;
>>>>> (gdb)
>>>>> 368	nfp_net_disable_queues(dev);
>>>>> (gdb) p repr
>>>>> $1 = (struct nfp_flower_representor *) 0x17c49c800
>>>>> (gdb) p dev->data->dev_private
>>>>> $2 = (void *) 0x17c49c800
>>>>> (gdb) p repr->app_fw_flower->pf_repr
>>>>> $3 = (struct nfp_flower_representor *) 0x17c49c800
>>>>>
>>>>> As we can see, these three pointers point the same block of memory.
>>>>>
>>>>
>>>> Ahh, I missed that 'repr->app_fw_flower->pf_repr' points to
>>>> 'dev_private', so your code makes sense.
>>>>
>>>> But if it is 'dev_private', why free it in 'nfp_pf_uninit()' as it
>>>> will be freed by 'rte_eth_dev_release_port()'?
>>>
>>> Sorry, I'm not understanding this.
>>> The 'dev_private' is a 'struct nfp_flower_representor *', and it will be
>>> freed in 'rte_eth_dev_release_port()'.
>>> What I freed in 'nfp_pf_uninit()' is a 'struct nfp_pf_dev *', so I'm not
>>> catch your point about this.
>>>
>>>> Won't removing 'rte_free(pf_dev);' from 'nfp_pf_uninit()' will have
>>>> the same effect, instead of setting it NULL in advance?
>>>>
>>>
>>> If I remove the 'rte_free(pf_dev);' from 'nfp_pf_uninit()', there will be a
>>> memory leak as no one will free it, and actually I'm not 'setting it NULL in
>>> advance'.
>>>
>>> 359	repr = dev->data->dev_private;
>>> 360	app_fw_flower = repr->app_fw_flower;
>>> 361	hw = app_fw_flower->pf_hw;
>>> 362	pf_dev = hw->pf_dev;
>>>
>>> Maybe you just confuse the 'pf_repr' and 'pf_dev'? Just a guess.
>>>
>>
>> Yes I did confuse those two, sorry about that.
>>
>> 'repr->app_fw_flower->pf_repr' is 'dev_private', and I assumed you are
>> setting it NULL to escape from double free (and was checking where that
>> double free happens), but I guess that is not the case.
>>
>> 'rte_eth_dev_destroy()' calls 'rte_eth_dev_release_port()' and frees
>> 'dev_private' but 'repr->app_fw_flower->pf_repr' remains as dangling pointer
>> and perhaps prevents 'nfp_flower_repr_dev_close()' move forward (because
>> of "if (app_fw_flower->pf_repr != NULL)" check), and you are fixing it, is
>> it the case?
>
> Correct, that's what we want to do by this patch and where the problem is,
> your description is very clear and brief.
>

Got it, I will proceed with the set. More details in the commit log would help when reviewing the patches.
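
For readers following the thread, the ownership pattern discussed above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the types and functions (`struct app_fw`, `struct representor`, `struct port`, `port_create()`, `repr_uninit()`, `port_release()`, `port_close()`) are hypothetical stand-ins, not the actual nfp driver or ethdev code. The point it tries to show is that one block of memory is reachable through two pointers, the framework frees it through one of them, and the driver therefore clears the other in its uninit hook so that a later `!= NULL` check does not see a stale address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the structures in the thread. */
struct app_fw;

struct representor {
	struct app_fw *app;		/* back-pointer to shared app state */
};

struct app_fw {
	struct representor *pf_repr;	/* alias of the port's dev_private */
};

struct port {
	void *dev_private;		/* owned and freed by the "framework" */
};

/* Allocate the private data once and store it behind both pointers. */
static int
port_create(struct app_fw *app, struct port *port)
{
	struct representor *repr = calloc(1, sizeof(*repr));

	if (repr == NULL)
		return -1;

	repr->app = app;
	port->dev_private = repr;
	app->pf_repr = repr;		/* second pointer to the same block */

	return 0;
}

/* Driver uninit hook: drop the alias, but do not free the memory here. */
static void
repr_uninit(struct port *port)
{
	struct representor *repr = port->dev_private;

	assert(repr != NULL);
	repr->app->pf_repr = NULL;
}

/* Framework release step: the only place that frees dev_private. */
static void
port_release(struct port *port)
{
	free(port->dev_private);
	port->dev_private = NULL;
}

/*
 * Close path: uninit first, then release. Returns 1 when no alias is
 * left, i.e. an app-level "pf_repr != NULL" check would not block.
 */
static int
port_close(struct app_fw *app, struct port *port)
{
	repr_uninit(port);
	port_release(port);

	return app->pf_repr == NULL;
}
```

In this sketch, skipping `repr_uninit()` would leave `app->pf_repr` holding the address of memory that `port_release()` has already freed, which is the dangling-pointer case described in the thread.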