On 1/11/2024 2:02 AM, Chaoyong He wrote: >> On 1/9/2024 7:56 AM, Chaoyong He wrote: >>>> On 12/18/2023 1:50 AM, Chaoyong He wrote: >>>>>> On 12/14/2023 10:24 AM, Chaoyong He wrote: >>>>>>> From: Long Wu <long...@corigine.com> >>>>>>> >>>>>>> Set the representor array to NULL to avoid that close interface >>>>>>> does not free some resource. >>>>>>> >>>>>>> Fixes: a135bc1644d6 ("net/nfp: fix resource leak for flower >>>>>>> firmware") >>>>>>> Cc: chaoyong...@corigine.com >>>>>>> Cc: sta...@dpdk.org >>>>>>> >>>>>>> Signed-off-by: Long Wu <long...@corigine.com> >>>>>>> Reviewed-by: Chaoyong He <chaoyong...@corigine.com> >>>>>>> Reviewed-by: Peng Zhang <peng.zh...@corigine.com> >>>>>>> --- >>>>>>> drivers/net/nfp/flower/nfp_flower_representor.c | 15 >>>>>>> ++++++++++++++- >>>>>>> 1 file changed, 14 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c >>>>>>> b/drivers/net/nfp/flower/nfp_flower_representor.c >>>>>>> index 27ea3891bd..5f7c1fa737 100644 >>>>>>> --- a/drivers/net/nfp/flower/nfp_flower_representor.c >>>>>>> +++ b/drivers/net/nfp/flower/nfp_flower_representor.c >>>>>>> @@ -294,17 +294,30 @@ nfp_flower_repr_tx_burst(void *tx_queue, >>>>>>> static int nfp_flower_repr_uninit(struct rte_eth_dev *eth_dev) { >>>>>>> + uint16_t index; >>>>>>> struct nfp_flower_representor *repr; >>>>>>> >>>>>>> repr = eth_dev->data->dev_private; >>>>>>> rte_ring_free(repr->ring); >>>>>>> >>>>>>> + if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) { >>>>>>> + index = NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(repr- >>>>>>> port_id); >>>>>>> + repr->app_fw_flower->phy_reprs[index] = NULL; >>>>>>> + } else { >>>>>>> + index = repr->vf_id; >>>>>>> + repr->app_fw_flower->vf_reprs[index] = NULL; >>>>>>> + } >>>>>>> + >>>>>>> return 0; >>>>>>> } >>>>>>> >>>>>>> static int >>>>>>> -nfp_flower_pf_repr_uninit(__rte_unused struct rte_eth_dev >>>>>>> *eth_dev) >>>>>>> +nfp_flower_pf_repr_uninit(struct rte_eth_dev *eth_dev) >>>>>>> { >>>>>>> + struct nfp_flower_representor *repr = >>>>>>> +eth_dev->data->dev_private; >>>>>>> + >>>>>>> + repr->app_fw_flower->pf_repr = NULL; >>>>>>> >>>>>> >>>>>> Here it is assigned to NULL but is it freed? If freed, why not set >>>>>> to NULL where it is freed? >>>>>> >>>>>> Same for above phy_reprs & vf_reprs. >>>>> >>>>> The whole invoke view: >>>>> rte_eth_dev_close() >>>>> --> nfp_flower_repr_dev_close() >>>>> --> nfp_flower_repr_free() >>>>> --> nfp_flower_pf_repr_uninit() >>>>> --> nfp_flower_repr_uninit() >>>>> // In these two functions, we just assigned to NULL but not >>>>> freed >> yet. >>>>> // It is still refer by the `eth_dev->data->dev_private`. >>>>> --> rte_eth_dev_release_port() >>>>> --> rte_free(eth_dev->data->dev_private); >>>>> // And here it is really freed (by the rte framework). >>>>> >>>> >>>> 'rte_eth_dev_release_port()' frees the device private data, but not >>>> all pointers, like 'repr->app_fw_flower->pf_repr', in the struct are >>>> freed, it is dev_close() or >>>> unint() functions responsibility. >>>> >>>> Can you please double check if >>>> 'eth_dev->data->dev_private->app_fw_flower->pf_repr' freed or not? >>> >>> (gdb) b nfp_flower_repr_dev_close >>> Breakpoint 1 at 0x7f839a4ad37f: >> file ../drivers/net/nfp/flower/nfp_flower_representor.c, line 356. >>> (gdb) c >>> Continuing. >>> >>> Thread 1 "dpdk-testpmd" hit Breakpoint 1, nfp_flower_repr_dev_close >> (dev=0x7f839aed2340 <rte_eth_devices>) >>> at ../drivers/net/nfp/flower/nfp_flower_representor.c:356 >>> 356 if (rte_eal_process_type() != RTE_PROC_PRIMARY) >>> (gdb) n >>> 359 repr = dev->data->dev_private; >>> (gdb) >>> 360 app_fw_flower = repr->app_fw_flower; >>> (gdb) >>> 361 hw = app_fw_flower->pf_hw; >>> (gdb) >>> 362 pf_dev = hw->pf_dev; >>> (gdb) >>> 368 nfp_net_disable_queues(dev); >>> (gdb) p repr >>> $1 = (struct nfp_flower_representor *) 0x17c49c800 >>> (gdb) p dev->data->dev_private >>> $2 = (void *) 0x17c49c800 >>> (gdb) p repr->app_fw_flower->pf_repr >>> $3 = (struct nfp_flower_representor *) 0x17c49c800 >>> >>> As we can see, these three pointers point the same block of memory. >>> >> >> Ahh, I missed that 'repr->app_fw_flower->pf_repr' points to 'dev_private', so >> your code makes sense. >> >> But if it is 'dev_private', why free it in 'nfp_pf_uninit()' as it will be >> freed by >> 'rte_eth_dev_release_port()'? > > Sorry, I'm not understanding this. > The 'dev_private' is a 'struct nfp_flower_representor *', and it will be > freed in 'rte_eth_dev_release_port()'. > What I freed in 'nfp_pf_uninit()' is a 'struct nfp_pf_dev *', so I'm not > catch your point about this. > >> Won't removing 'rte_free(pf_dev);' from 'nfp_pf_uninit()' will have the same >> effect, instead of setting it NULL in advance? >> > > If I remove the 'rte_free(pf_dev);' from 'nfp_pf_uninit()', there will be a > memory leak as no one will free it, and actually I'm not 'setting it NULL in > advance'. > > 359 repr = dev->data->dev_private; > 360 app_fw_flower = repr->app_fw_flower; > 361 hw = app_fw_flower->pf_hw; > 362 pf_dev = hw->pf_dev; > > Maybe you just confuse the 'pf_repr' and 'pf_dev'? Just a guess. >
Yes I did confuse those two, sorry about that. 'repr->app_fw_flower->pf_repr' is 'dev_private', and I assumed you are setting it NULL to escape from double free (and was checking where that double free happens), but I guess that is not the case. 'rte_eth_dev_destroy()' calls 'rte_eth_dev_release_port()' and frees 'dev_private' but 'repr->app_fw_flower->pf_repr' remains as dangling pointer and perhaps prevents 'nfp_flower_repr_dev_close()' move forward (because of "if (app_fw_flower->pf_repr != NULL)" check), and you are fixing it, is it the case?