On 05/02/2020 09:52, Somnath Kotur wrote:
Hello Gaetan,


On Tue, Feb 4, 2020 at 3:19 PM Gaetan Rivet <gr...@u256.net> wrote:

On 04/02/2020 10:15, Somnath Kotur wrote:
As per the comments in this code section, "since there is a matching device,
it is now its responsibility to manage the devargs we've just inserted."
But the matching device ptr's devargs is still uninitialized or not pointing
to the newest dev_args that were passed as a parameter to local_dev_probe().
This is needed particularly in the case when *probe is called again* on an
already probed device(the parent device for the representor) as part of adding
a representor port to an OVS switch(OVS-DPDK) like so:
ovs-vsctl add-port ovsbr0 vfrep1 -- set Interface vfrep1 type=dpdk \
options:dpdk-devargs=0000:06:02.0,representor=[1]

Fixes: 7e8b26650146 ("eal: fix hotplug add / remove")

Signed-off-by: Somnath Kotur <somnath.ko...@broadcom.com>
---
   lib/librte_eal/common/eal_common_dev.c | 1 +
   1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index 9e4f09d..311eef5 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -171,6 +171,7 @@ static int cmp_dev_name(const struct rte_device *dev, const 
void *_name)
        * those devargs shouldn't be removed manually anymore.
        */

+     dev->devargs = da;
       ret = dev->bus->plug(dev);
       if (ret > 0)
               ret = -ENOTSUP;


Hello Somnath,

On a surface level, the fix does not seem correct.
The comment

     "Since there is a matching device, it is now its responsibility
      to manage the devargs we've just inserted. From this point,
      those devargs should'nt be removed manually anymore."

means that the err_devarg label is not correct, on further error in the
function, returning the error code and cleaning up the device is sufficient.

Setting the devargs for a device is the responsibility of the bus scan function.
In the PCI bus for example, this is done in pci_name_set(), called once a 
device name is fully qualified
after scanning the system and thus being able to match a devargs to the device 
name.

Can you please give more information about the device bus, and maybe trace the 
path taken
by the line "ret = da->bus->scan();" a few lines above your edit? If your 
dev->devargs is not set
afterward, it seems the bug would be there.

Sure, here is the stack trace from the pci_name_set()
pci_name_set (dev=0x570eab0) at
/root/dpdk-int_nxt/drivers/bus/pci/pci_common.c:65
65                              dev->name, sizeof(dev->name));
(gdb) p /x *dev
$1 = {next = {tqe_next = 0x0, tqe_prev = 0x0}, device = {next =
{tqe_next = 0x0, tqe_prev = 0x0}, name = 0x0, driver = 0x0, bus =
0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain = 0x0, bus
= 0x6, devid = 0x2, function = 0x1}, id = {class_id = 0x20000,
vendor_id = 0x14e4,
     device_id = 0x16dc, subsystem_vendor_id = 0x14e4,
subsystem_device_id = 0x16d7}.........
(gdb) bt
#0  pci_name_set (dev=0x570eab0) at
/root/dpdk19.11/drivers/bus/pci/pci_common.c:65
#1  0x0000000000457b69 in pci_scan_one (dirname=0x7ffdb46ef230
"/sys/bus/pci/devices/0000:06:02.1", addr=0x7ffdb46ef220) at
/root/dpdk-19.11/drivers/bus/pci/linux/pci.c:305
#2  0x0000000000458188 in rte_pci_scan () at
/root/dpdk-int_nxt/drivers/bus/pci/linux/pci.c:488
#3  0x00000000004b9530 in local_dev_probe (devargs=0x527cc10
"0000:06:02.0,representor=[1]", new_dev=0x7ffdb46f02b8) at
/root/dpdk-19.11/lib/librte_eal/common/eal_common_dev.c:158
#4  0x00000000004b9726 in rte_dev_probe (devargs=0x527cc10
"0000:06:02.0,representor=[1]") at
/root/dpdk-int_nxt/lib/librte_eal/common/eal_common_dev.c:227
#5  0x00000000033d196a in netdev_dpdk_process_devargs
(dev=0x2000e4800, devargs=0x527cc10 "0000:06:02.0,representor=[1]",
errp=0x7ffdb46f0478) at lib/netdev-dpdk.c:1798
#6  0x00000000033d1d94 in netdev_dpdk_set_config (netdev=0x2000e4880,
args=0x56edf10, errp=0x7ffdb46f0478) at lib/netdev-dpdk.c:1921
#7  0x00000000032c2e48 in netdev_set_config (netdev=0x2000e4880,
args=0x56edf10, errp=0x7ffdb46f0598) at lib/netdev.c:494
#8  0x00000000031e9a4e in iface_set_netdev_config
(iface_cfg=0x56edc90, netdev=0x2000e4880, errp=0x7ffdb46f0598) at
vswitchd/bridge.c:2015
#9  0x00000000031e9bd1 in iface_do_create (br=0x52801d0,
iface_cfg=0x56edc90, ofp_portp=0x7ffdb46f05a4, netdevp=0x7ffdb46f05a8,
errp=0x7ffdb46f0598) at vswitchd/bridge.c:2049
#10 0x00000000031e9da6 in iface_create (br=0x52801d0,
iface_cfg=0x56edc90, port_cfg=0x56bae30) at vswitchd/bridge.c:2100
#11 0x00000000031e7494 in bridge_add_ports__ (br=0x52801d0,
wanted_ports=0x52802b0, with_requested_port=false) at
vswitchd/bridge.c:1164
#12 0x00000000031e7525 in bridge_add_ports (br=0x52801d0,
wanted_ports=0x52802b0) at vswitchd/bridge.c:1180
#13 0x00000000031e6a4f in bridge_reconfigure (ovs_cfg=0x52b2c40) at
vswitchd/bridge.c:893
#14 0x00000000031ed37c in bridge_run () at vswitchd/bridge.c:3324
#15 0x00000000031f2977 in main (argc=11, argv=0x7ffdb46f0878) at
vswitchd/ovs-vswitchd.c:127

It could be the bus that is not properly implemented, the devargs name not 
fully qualified, a special path taken by a vdev (I'm not sure what the ovsbr0 
device is for example), or a specific representor pluging function, difficult 
to pinpoint exactly right now.
As you can see , the DBDF along with the 'representor=[1]' is passed
all the way down to local_dev_probe(). However when i traced further
inside pci_name_set() =>pci_devargs_lookup()

static struct rte_devargs *pci_devargs_lookup(struct rte_pci_device *dev)
{
         struct rte_devargs *devargs;
         struct rte_pci_addr addr;

         RTE_EAL_DEVARGS_FOREACH("pci", devargs) {
                 devargs->bus->parse(devargs->name, &addr);
======================================> pci_parse() [1]

                 if (!rte_pci_addr_cmp(&dev->addr, &addr))
                         return devargs;
         }
         return NULL;
}

[1]: This is the value of dev_args at this point in time -
p *devargs
$2 = {next = {tqe_next = 0x0, tqe_prev = 0x3a67970 <devargs_list>},
type = RTE_DEVTYPE_WHITELISTED_PCI, policy = RTE_DEV_WHITELISTED, name
= "0000:06:02.0", '\000' <repeats 51 times>, {args = 0x56eec00
"representor=[1]", drv_str = 0x56eec00 "representor=[1]"},
   bus = 0x3a59a00 <rte_pci_bus>, cls = 0x0, bus_str = 0x0, cls_str =
0x0, data = 0x0}

It has the 'representor' keyword still intact, but what gets passed to
pci_parse() is pci_parse (name=0x56d76b8 "0000:06:02.0",
addr=0x7ffdb46ed180) i.e just the original DBDF of the function
that was already probed once during OVS-DPDK init... I'm not sure if
that is the reason , but eventually it walks the entire for_each loop
and returns NULL ...i.e pci_devargs_lookup() returns NULL ... So this
is where the 'devargs' value passed is getting lost
Hope that gives what you were looking for ?


Thanks
Som


Hello,

Yes, thank you for the thorough response.
Your second device has the BDF 06:02.01 while its parent has 06:02.00. The 
parent device already matched the devargs and its rte_device should already 
point to it.

I think it is not correct to force link a parent device devargs to virtual 
functions spawned from it.
The devargs is NULL, because there are none that were used to spawn the virtual 
functions.

Now, can you describe your overarching issue more precisely? You are able to 
call rte_dev_probe twice on the same virtual function, because the check 
against it does not work without the devargs being linked? If that's correct, 
this is the real bug IMO, and there could be other ways to mitigate this.

Thanks
Gaetan

Reply via email to