> On 1/11/2024 2:02 AM, Chaoyong He wrote: > >> On 1/9/2024 7:56 AM, Chaoyong He wrote: > >>>> On 12/18/2023 1:50 AM, Chaoyong He wrote: > >>>>>> On 12/14/2023 10:24 AM, Chaoyong He wrote: > >>>>>>> From: Long Wu <long...@corigine.com> > >>>>>>> > >>>>>>> Set the representor array to NULL to avoid that close interface > >>>>>>> does not free some resource. > >>>>>>> > >>>>>>> Fixes: a135bc1644d6 ("net/nfp: fix resource leak for flower > >>>>>>> firmware") > >>>>>>> Cc: chaoyong...@corigine.com > >>>>>>> Cc: sta...@dpdk.org > >>>>>>> > >>>>>>> Signed-off-by: Long Wu <long...@corigine.com> > >>>>>>> Reviewed-by: Chaoyong He <chaoyong...@corigine.com> > >>>>>>> Reviewed-by: Peng Zhang <peng.zh...@corigine.com> > >>>>>>> --- > >>>>>>> drivers/net/nfp/flower/nfp_flower_representor.c | 15 > >>>>>>> ++++++++++++++- > >>>>>>> 1 file changed, 14 insertions(+), 1 deletion(-) > >>>>>>> > >>>>>>> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c > >>>>>>> b/drivers/net/nfp/flower/nfp_flower_representor.c > >>>>>>> index 27ea3891bd..5f7c1fa737 100644 > >>>>>>> --- a/drivers/net/nfp/flower/nfp_flower_representor.c > >>>>>>> +++ b/drivers/net/nfp/flower/nfp_flower_representor.c > >>>>>>> @@ -294,17 +294,30 @@ nfp_flower_repr_tx_burst(void > *tx_queue, > >>>>>>> static int nfp_flower_repr_uninit(struct rte_eth_dev *eth_dev) > >>>>>>> { > >>>>>>> + uint16_t index; > >>>>>>> struct nfp_flower_representor *repr; > >>>>>>> > >>>>>>> repr = eth_dev->data->dev_private; > >>>>>>> rte_ring_free(repr->ring); > >>>>>>> > >>>>>>> + if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) { > >>>>>>> + index = > NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(repr- > >>>>>>> port_id); > >>>>>>> + repr->app_fw_flower->phy_reprs[index] = NULL; > >>>>>>> + } else { > >>>>>>> + index = repr->vf_id; > >>>>>>> + repr->app_fw_flower->vf_reprs[index] = NULL; > >>>>>>> + } > >>>>>>> + > >>>>>>> return 0; > >>>>>>> } > >>>>>>> > >>>>>>> static int > >>>>>>> -nfp_flower_pf_repr_uninit(__rte_unused struct rte_eth_dev > >>>>>>> *eth_dev) > >>>>>>> +nfp_flower_pf_repr_uninit(struct rte_eth_dev *eth_dev) > >>>>>>> { > >>>>>>> + struct nfp_flower_representor *repr = > >>>>>>> +eth_dev->data->dev_private; > >>>>>>> + > >>>>>>> + repr->app_fw_flower->pf_repr = NULL; > >>>>>>> > >>>>>> > >>>>>> Here it is assigned to NULL but is it freed? If freed, why not > >>>>>> set to NULL where it is freed? > >>>>>> > >>>>>> Same for above phy_reprs & vf_reprs. > >>>>> > >>>>> The whole invoke view: > >>>>> rte_eth_dev_close() > >>>>> --> nfp_flower_repr_dev_close() > >>>>> --> nfp_flower_repr_free() > >>>>> --> nfp_flower_pf_repr_uninit() > >>>>> --> nfp_flower_repr_uninit() > >>>>> // In these two functions, we just assigned to NULL but > >>>>> not freed > >> yet. > >>>>> // It is still refer by the `eth_dev->data->dev_private`. > >>>>> --> rte_eth_dev_release_port() > >>>>> --> rte_free(eth_dev->data->dev_private); > >>>>> // And here it is really freed (by the rte framework). > >>>>> > >>>> > >>>> 'rte_eth_dev_release_port()' frees the device private data, but not > >>>> all pointers, like 'repr->app_fw_flower->pf_repr', in the struct > >>>> are freed, it is dev_close() or > >>>> unint() functions responsibility. > >>>> > >>>> Can you please double check if > >>>> 'eth_dev->data->dev_private->app_fw_flower->pf_repr' freed or not? > >>> > >>> (gdb) b nfp_flower_repr_dev_close > >>> Breakpoint 1 at 0x7f839a4ad37f: > >> file ../drivers/net/nfp/flower/nfp_flower_representor.c, line 356. > >>> (gdb) c > >>> Continuing. > >>> > >>> Thread 1 "dpdk-testpmd" hit Breakpoint 1, nfp_flower_repr_dev_close > >> (dev=0x7f839aed2340 <rte_eth_devices>) > >>> at ../drivers/net/nfp/flower/nfp_flower_representor.c:356 > >>> 356 if (rte_eal_process_type() != RTE_PROC_PRIMARY) > >>> (gdb) n > >>> 359 repr = dev->data->dev_private; > >>> (gdb) > >>> 360 app_fw_flower = repr->app_fw_flower; > >>> (gdb) > >>> 361 hw = app_fw_flower->pf_hw; > >>> (gdb) > >>> 362 pf_dev = hw->pf_dev; > >>> (gdb) > >>> 368 nfp_net_disable_queues(dev); > >>> (gdb) p repr > >>> $1 = (struct nfp_flower_representor *) 0x17c49c800 > >>> (gdb) p dev->data->dev_private > >>> $2 = (void *) 0x17c49c800 > >>> (gdb) p repr->app_fw_flower->pf_repr > >>> $3 = (struct nfp_flower_representor *) 0x17c49c800 > >>> > >>> As we can see, these three pointers point the same block of memory. > >>> > >> > >> Ahh, I missed that 'repr->app_fw_flower->pf_repr' points to > >> 'dev_private', so your code makes sense. > >> > >> But if it is 'dev_private', why free it in 'nfp_pf_uninit()' as it > >> will be freed by 'rte_eth_dev_release_port()'? > > > > Sorry, I'm not understanding this. > > The 'dev_private' is a 'struct nfp_flower_representor *', and it will be > > freed in > 'rte_eth_dev_release_port()'. > > What I freed in 'nfp_pf_uninit()' is a 'struct nfp_pf_dev *', so I'm not > > catch > your point about this. > > > >> Won't removing 'rte_free(pf_dev);' from 'nfp_pf_uninit()' will have > >> the same effect, instead of setting it NULL in advance? > >> > > > > If I remove the 'rte_free(pf_dev);' from 'nfp_pf_uninit()', there will be a > memory leak as no one will free it, and actually I'm not 'setting it NULL in > advance'. > > > > 359 repr = dev->data->dev_private; > > 360 app_fw_flower = repr->app_fw_flower; > > 361 hw = app_fw_flower->pf_hw; > > 362 pf_dev = hw->pf_dev; > > > > Maybe you just confuse the 'pf_repr' and 'pf_dev'? Just a guess. > > > > Yes I did confuse those two, sorry about that. > > 'repr->app_fw_flower->pf_repr' is 'dev_private', and I assumed you are setting > it NULL to escape from double free (and was checking where that double free > happens), but I guess that is not the case. > > 'rte_eth_dev_destroy()' calls 'rte_eth_dev_release_port()' and frees > 'dev_private' but 'repr->app_fw_flower->pf_repr' remains as dangling pointer > and perhaps prevents 'nfp_flower_repr_dev_close()' move forward (because > of "if (app_fw_flower->pf_repr != NULL)" check), and you are fixing it, is it > the > case?
Correct, that's what we want to do by this patch and where the problem is, your description is very clear and brief.