[...]
(gdb) p dev2
$5 = (struct rte_pci_device *) 0x54de5e0
(gdb) p /x *dev2
$6 = {next = {tqe_next = 0x5307460, tqe_prev = 0x5539f80}, device =
{next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x54e4f00, driver =
0x3a6b7f0,
      bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain =
0x0, bus = 0x6, devid = 0x2, function = 0x0}

we hit the pci_device ( 0x54de5e0) that was created during the first
time probe of the parent device ?
Since it is already probed, it goes to line 381 where it does frees
the just allocated 'pci_device'  inside pci_scan_one() via 'free(dev)'

As you can see this pci_device does not have it's devargs set (rightly
so as initially there were no devargs for this device)

But as shown in the stack trace in my previous reply, when
pci_find_device() walks the rte_pci_devices list , it finds this
earlier probed device (without devargs)


Alright, I think you are right, this is what is happening.
The issue IMO, is that the PCI scan is thus hitting an already existing device, 
but we have
missed the case where the new device has more info that the previous one 
(linked devargs).

That is correct

pci_find_device (start=0x0, cmp=0x4b92d8 <cmp_dev_name>,
data=0x55a4af8) at /root/dpdk-19.11/drivers/bus/pci/pci_common.c:426
426                             return &pdev->device;
(gdb) p pdev
$15 = (struct rte_pci_device *) 0x54de5e0

(gdb) p /x *pdev
$16 = {next = {tqe_next = 0x5307460, tqe_prev = 0x5539f80}, device =
{next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x54e4f00, driver =
0x3a6b7f0,
      bus = 0x3a59a00, numa_node = 0x0, devargs = 0x0}, addr = {domain =
0x0, bus = 0x6, devid = 0x2, function = 0x0},

Hope this explains why the pci_device has NULL devargs at this point
and how my fix to set it at this point helps solve the problem?

Please let me know your thoughts


I think the proper fix is instead to have a clean update during the PCI scan.
The proper way is to keep the old device, but override its fields as new info 
was found.

Agreed

Something like calling pci_name_set(dev2); line 359 maybe. BSD should also be 
updated for consistency.

My issue with your patch is that I think this issue is very specific to the PCI 
bus and the capabilities
of some devices on it. It would be better to have a fully compliant scan + 
probe ops considering
the supported capabilities, instead of forcing this on all buses.
Sure, so i'm guessing you meant something like this ?

dex 740a2cd..92a41c2 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -377,6 +377,8 @@
                                                  */
                                                 RTE_LOG(ERR, EAL,
"Unexpected device scan at %s!\n",
                                                         filename);
+                                       else if (dev2->device.devargs
!= dev->device.devargs)
+                                               pci_name_set(dev2);
                                 }
                                 free(dev);
                         }

Which is basically checking for devargs mismatch between the 'new'
device and a match with an
already probed device and  return the old device while adjusting the
new info (devargs)
Is that OK?


I'm wondering whether this can happen with an already existing devargs? If so, 
is it more correct to keep
the new one, or ignore the probe, free the new devargs and report an error? If 
the former,
please clean up properly the devargs using rte_devargs_remove() (before calling 
pci_name_set() of course).

Hello Somnath,

Yes, this is basically it, however you also need to take care of a potential 
already existing devargs here, because the PCI bus does not guarantee that all 
devargs of a device will be named the same.

Reply via email to