Hello Gaetan,
On Tue, Feb 4, 2020 at 3:19 PM Gaetan Rivet <gr...@u256.net> wrote: > > On 04/02/2020 10:15, Somnath Kotur wrote: > > As per the comments in this code section, "since there is a matching device, > > it is now its responsibility to manage the devargs we've just inserted." > > But the matching device ptr's devargs is still uninitialized or not pointing > > to the newest dev_args that were passed as a parameter to local_dev_probe(). > > This is needed particularly in the case when *probe is called again* on an > > already probed device(the parent device for the representor) as part of > > adding > > a representor port to an OVS switch(OVS-DPDK) like so: > > ovs-vsctl add-port ovsbr0 vfrep1 -- set Interface vfrep1 type=dpdk \ > > options:dpdk-devargs=0000:06:02.0,representor=[1] > > > > Fixes: 7e8b26650146 ("eal: fix hotplug add / remove") > > > > Signed-off-by: Somnath Kotur <somnath.ko...@broadcom.com> > > --- > > lib/librte_eal/common/eal_common_dev.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/lib/librte_eal/common/eal_common_dev.c > > b/lib/librte_eal/common/eal_common_dev.c > > index 9e4f09d..311eef5 100644 > > --- a/lib/librte_eal/common/eal_common_dev.c > > +++ b/lib/librte_eal/common/eal_common_dev.c > > @@ -171,6 +171,7 @@ static int cmp_dev_name(const struct rte_device *dev, > > const void *_name) > > * those devargs shouldn't be removed manually anymore. > > */ > > > > + dev->devargs = da; > > ret = dev->bus->plug(dev); > > if (ret > 0) > > ret = -ENOTSUP; > > > > Hello Somnath, > > On a surface level, the fix does not seem correct. > The comment > > "Since there is a matching device, it is now its responsibility > to manage the devargs we've just inserted. From this point, > those devargs should'nt be removed manually anymore." > > means that the err_devarg label is not correct, on further error in the > function, returning the error code and cleaning up the device is sufficient. > > Setting the devargs for a device is the responsibility of the bus scan > function. > In the PCI bus for example, this is done in pci_name_set(), called once a > device name is fully qualified > after scanning the system and thus being able to match a devargs to the > device name. > > Can you please give more information about the device bus, and maybe trace > the path taken > by the line "ret = da->bus->scan();" a few lines above your edit? If your > dev->devargs is not set > afterward, it seems the bug would be there. > Sure, here is the stack trace from the pci_name_set() pci_name_set (dev=0x570eab0) at /root/dpdk-int_nxt/drivers/bus/pci/pci_common.c:65 65 dev->name, sizeof(dev->name)); (gdb) p /x *dev $1 = {next = {tqe_next = 0x0, tqe_prev = 0x0}, device = {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x0, driver = 0x0, bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain = 0x0, bus = 0x6, devid = 0x2, function = 0x1}, id = {class_id = 0x20000, vendor_id = 0x14e4, device_id = 0x16dc, subsystem_vendor_id = 0x14e4, subsystem_device_id = 0x16d7}......... (gdb) bt #0 pci_name_set (dev=0x570eab0) at /root/dpdk19.11/drivers/bus/pci/pci_common.c:65 #1 0x0000000000457b69 in pci_scan_one (dirname=0x7ffdb46ef230 "/sys/bus/pci/devices/0000:06:02.1", addr=0x7ffdb46ef220) at /root/dpdk-19.11/drivers/bus/pci/linux/pci.c:305 #2 0x0000000000458188 in rte_pci_scan () at /root/dpdk-int_nxt/drivers/bus/pci/linux/pci.c:488 #3 0x00000000004b9530 in local_dev_probe (devargs=0x527cc10 "0000:06:02.0,representor=[1]", new_dev=0x7ffdb46f02b8) at /root/dpdk-19.11/lib/librte_eal/common/eal_common_dev.c:158 #4 0x00000000004b9726 in rte_dev_probe (devargs=0x527cc10 "0000:06:02.0,representor=[1]") at /root/dpdk-int_nxt/lib/librte_eal/common/eal_common_dev.c:227 #5 0x00000000033d196a in netdev_dpdk_process_devargs (dev=0x2000e4800, devargs=0x527cc10 "0000:06:02.0,representor=[1]", errp=0x7ffdb46f0478) at lib/netdev-dpdk.c:1798 #6 0x00000000033d1d94 in netdev_dpdk_set_config (netdev=0x2000e4880, args=0x56edf10, errp=0x7ffdb46f0478) at lib/netdev-dpdk.c:1921 #7 0x00000000032c2e48 in netdev_set_config (netdev=0x2000e4880, args=0x56edf10, errp=0x7ffdb46f0598) at lib/netdev.c:494 #8 0x00000000031e9a4e in iface_set_netdev_config (iface_cfg=0x56edc90, netdev=0x2000e4880, errp=0x7ffdb46f0598) at vswitchd/bridge.c:2015 #9 0x00000000031e9bd1 in iface_do_create (br=0x52801d0, iface_cfg=0x56edc90, ofp_portp=0x7ffdb46f05a4, netdevp=0x7ffdb46f05a8, errp=0x7ffdb46f0598) at vswitchd/bridge.c:2049 #10 0x00000000031e9da6 in iface_create (br=0x52801d0, iface_cfg=0x56edc90, port_cfg=0x56bae30) at vswitchd/bridge.c:2100 #11 0x00000000031e7494 in bridge_add_ports__ (br=0x52801d0, wanted_ports=0x52802b0, with_requested_port=false) at vswitchd/bridge.c:1164 #12 0x00000000031e7525 in bridge_add_ports (br=0x52801d0, wanted_ports=0x52802b0) at vswitchd/bridge.c:1180 #13 0x00000000031e6a4f in bridge_reconfigure (ovs_cfg=0x52b2c40) at vswitchd/bridge.c:893 #14 0x00000000031ed37c in bridge_run () at vswitchd/bridge.c:3324 #15 0x00000000031f2977 in main (argc=11, argv=0x7ffdb46f0878) at vswitchd/ovs-vswitchd.c:127 > It could be the bus that is not properly implemented, the devargs name not > fully qualified, a special path taken by a vdev (I'm not sure what the ovsbr0 > device is for example), or a specific representor pluging function, difficult > to pinpoint exactly right now. As you can see , the DBDF along with the 'representor=[1]' is passed all the way down to local_dev_probe(). However when i traced further inside pci_name_set() =>pci_devargs_lookup() static struct rte_devargs *pci_devargs_lookup(struct rte_pci_device *dev) { struct rte_devargs *devargs; struct rte_pci_addr addr; RTE_EAL_DEVARGS_FOREACH("pci", devargs) { devargs->bus->parse(devargs->name, &addr); ======================================> pci_parse() [1] if (!rte_pci_addr_cmp(&dev->addr, &addr)) return devargs; } return NULL; } [1]: This is the value of dev_args at this point in time - p *devargs $2 = {next = {tqe_next = 0x0, tqe_prev = 0x3a67970 <devargs_list>}, type = RTE_DEVTYPE_WHITELISTED_PCI, policy = RTE_DEV_WHITELISTED, name = "0000:06:02.0", '\000' <repeats 51 times>, {args = 0x56eec00 "representor=[1]", drv_str = 0x56eec00 "representor=[1]"}, bus = 0x3a59a00 <rte_pci_bus>, cls = 0x0, bus_str = 0x0, cls_str = 0x0, data = 0x0} It has the 'representor' keyword still intact, but what gets passed to pci_parse() is pci_parse (name=0x56d76b8 "0000:06:02.0", addr=0x7ffdb46ed180) i.e just the original DBDF of the function that was already probed once during OVS-DPDK init... I'm not sure if that is the reason , but eventually it walks the entire for_each loop and returns NULL ...i.e pci_devargs_lookup() returns NULL ... So this is where the 'devargs' value passed is getting lost Hope that gives what you were looking for ? Thanks Som