> -----Original Message----- > From: Wu, WenxuanX > Sent: 2022年3月9日 11:07 > To: Yigit, Ferruh <ferruh.yi...@intel.com>; Li, Xiaoyun > <xiaoyun...@intel.com>; > dev@dpdk.org > Cc: sta...@dpdk.org > Subject: RE: [PATCH v2 2/2] app/testpmd:fix testpmd quit failure > > > > > -----Original Message----- > > From: Yigit, Ferruh <ferruh.yi...@intel.com> > > Sent: 2022年3月5日 0:16 > > To: Wu, WenxuanX <wenxuanx...@intel.com>; Li, Xiaoyun > > <xiaoyun...@intel.com>; dev@dpdk.org > > Cc: sta...@dpdk.org > > Subject: Re: [PATCH v2 2/2] app/testpmd:fix testpmd quit failure > > > > On 3/3/2022 1:22 PM, Wu, WenxuanX wrote: > > > > moved down, please don't top post > > > > >> -----Original Message----- > > >> From: Wu, WenxuanX <wenxuanx...@intel.com> > > >> Sent: 2022年2月23日 19:33 > > >> To: Li, Xiaoyun <xiaoyun...@intel.com>; Yigit, Ferruh > > >> <ferruh.yi...@intel.com>; dev@dpdk.org > > >> Cc: Wu, WenxuanX <wenxuanx...@intel.com>; sta...@dpdk.org > > >> Subject: [PATCH v2 2/2] app/testpmd:fix testpmd quit failure > > >> > > >> From: wenxuan wu <wenxuanx...@intel.com> > > >> > > >> When testpmd start ed with 1 pf and 2 vfs, testpmd quited while vfs > > >> were still alive would result in failure. Root cause is that pf had > > >> been released already but vfs were still accessing by func > > >> rte_eth_dev_info_get, which would result in heap-free-after-use error. > > >> > > >> By quitting our ports in reverse order to avoid this.And the order > > >> is guaranteed that vf are created after pfs. > > >> > > >> Fixes: d3a274ce9dee ("app/testpmd: handle SIGINT and SIGTERM") > > >> Cc: sta...@dpdk.org > > >> > > >> Signed-off-by: wenxuan wu <wenxuanx...@intel.com> > > >> --- > > >> app/test-pmd/testpmd.c | 4 ++-- > > >> 1 file changed, 2 insertions(+), 2 deletions(-) > > >> > > >> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index > > >> e1da961311..698b6d8cc4 100644 > > >> --- a/app/test-pmd/testpmd.c > > >> +++ b/app/test-pmd/testpmd.c > > >> @@ -3384,12 +3384,12 @@ pmd_test_exit(void) #endif > > >> if (ports != NULL) { > > >> no_link_check = 1; > > >> - RTE_ETH_FOREACH_DEV(pt_id) { > > >> + RTE_ETH_FOREACH_DEV_REVERSE(pt_id) { > > >> printf("\nStopping port %d...\n", pt_id); > > >> fflush(stdout); > > >> stop_port(pt_id); > > >> } > > >> - RTE_ETH_FOREACH_DEV(pt_id) { > > >> + RTE_ETH_FOREACH_DEV_REVERSE(pt_id) { > > >> printf("\nShutting down port %d...\n", pt_id); > > >> fflush(stdout); > > >> close_port(pt_id); > > >> -- > > >> 2.25.1 > > > > > > > > > I found this meaning in DPDK testplan. > > > Note that currently hot-plugging of representor ports is not > > > supported so all > > the required representors must be specified on the creation of the PF > > or the trusted VF. > > > When testpmd is started with pf and vf representors, the order of > > representor is determined on creation. So it is guaranteed that ,pf is > > beneath the vf representors, we implemented in a reverse way is > > acceptable just at present, depends on when the hot-plugging of > representor is supported. > > > > > > > The patch mentions from PF and VFs, and now you are referring to port > > representor. > > Is the problem related to VF or port representor. > > > > For both, VF and port reporesentor should be closed before PF, that > > part is OK. My comment is if reversing port id traverse will fix the > > issue or do we need more complex solution. > > Like have APIs to get VF and representor ports from a given port id, > > and free them first. The problem is that when I attempted to use a func like get_representor_info(pid,info); I didn't found this func implemented by driver , so I can not get the type of port(VF or PF ) directly by get_representor_info(pid,info),especially when the connected driver is i40e, representor_info_get occurred below.
./drivers/net/mlx5/mlx5.c: .representor_info_get = mlx5_representor_info_get, ./drivers/net/mlx5/mlx5.c: .representor_info_get = mlx5_representor_info_get, ./drivers/net/mlx5/mlx5.c: .representor_info_get = mlx5_representor_info_get, ./drivers/net/mlx5/mlx5.h:int mlx5_representor_info_get(struct rte_eth_dev *dev, ./drivers/net/mlx5/mlx5_ethdev.c:mlx5_representor_info_get(struct rte_eth_dev *dev, ./drivers/net/sfc/sfc_ethdev.c:sfc_representor_info_get(struct rte_eth_dev *dev, ./drivers/net/sfc/sfc_ethdev.c: .representor_info_get = sfc_representor_info_get, ./app/test-pmd/cmdline.c: ret = rte_eth_representor_info_get(res->cmd_pid, NULL); ./app/test-pmd/cmdline.c: ret = rte_eth_representor_info_get(res->cmd_pid, info); ./app/test-pmd/testpmd.c: ret = rte_eth_representor_info_get(pi,&info); ./lib/ethdev/version.map: rte_eth_representor_info_get; ./lib/ethdev/rte_ethdev.h:int rte_eth_representor_info_get(uint16_t port_id, ./lib/ethdev/ethdev_driver.h:typedef int (*eth_representor_info_get_t)(struct rte_eth_dev *dev, ./lib/ethdev/ethdev_driver.h: eth_representor_info_get_t representor_info_get; ./lib/ethdev/ethdev_driver.h: * The mapping is retrieved from rte_eth_representor_info_get(). ./lib/ethdev/rte_ethdev.c: ret = rte_eth_representor_info_get(port_id, NULL); ./lib/ethdev/rte_ethdev.c: ret = rte_eth_representor_info_get(port_id, info); ./lib/ethdev/rte_ethdev.c:rte_eth_representor_info_get(uint16_t port_id, ./lib/ethdev/rte_ethdev.c: RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->representor_info_get, -ENOTSUP); ./lib/ethdev/rte_ethdev.c: return eth_err(port_id, (*dev->dev_ops->representor_info_get)(dev, info)); We should focus on changing the logic of is_bonding_slave to avoid this bug ,right ? The procedure of port_close is like this : FOR_EACH_DEV(pi): If Is_bonding_slave(pi) Continue; .... Etc. In is_bonding_slave(pi): This func get_dev_info(pid,dev_info) would be called to get dev.dev_flag &SLAVE ,this would be result in error ,when port sequence is like port 0 PF ,port 1 VF_REPR, port 2 VF_REPR, there is no obstacles in closing port 0 ,then pid iterate to port 1. But port 1 is a VF_REPR which based on port 0 ,when in func is_bonding_slave(port 1), it would call get_dev_info(1,dev_info) ,error occurred. Because we use get_dev_info(especially when PF is released ) not ports[1] which had been pre allocated . would result in this error. Two solutions : 1. Reverse order to close like I mentioned before (PF and VF_repr order is determined at creation time with no representor hot_plug). 2.change func get_dev_info(pid,info) to ports[pid) to get dev_flag Both can resolve this bug ,which one do you prefer?