On 1/9/2017 7:50 AM, Yuanhan Liu wrote: > Assume we have two virtio ports, 00:03.0 and 00:04.0. The first one is > managed by the kernel driver, while the later one is managed by DPDK. > > Now we start the primary process. 00:03.0 will be skipped by DPDK virtio > PMD driver (since it's being used by the kernel). 00:04.0 would be > successfully initiated by DPDK virtio PMD (if nothing abnormal happens). > After that, we would get a port id 0, and all the related info needed > by virtio (virtio_hw) is stored at rte_eth_dev_data[0]. > > Then we start the secondary process. As usual, 00:03.0 will be firstly > probed. It firstly tries to get a local eth_dev structure for it (by > rte_eth_dev_allocate): > > port_id = rte_eth_dev_find_free_port(); > ... > > eth_dev = &rte_eth_devices[port_id]; > eth_dev->data = &rte_eth_dev_data[port_id]; > ... > > return eth_dev; > > Since it's a first PCI device, port_id will be 0. eth_dev->data would > then point to rte_eth_dev_data[0]. And here things start going wrong, > as rte_eth_dev_data[0] actually stores the virtio_hw for 00:04.0. > > That said, in the secondary process, DPDK will continue to drive PCI > device 00.03.0 (despite the fact it's been managed by kernel), with > the info from PCI device 00:04.0. Which is wrong. > > The fix is to attach the port already registered by the primary process: > iterate the rte_eth_dev_data[], and get the port id who's PCI ID matches > the current PCI device. > > This would let us maintain same port ID for the same PCI device, keeping > the chance of referencing to wrong data minimal. > > Fixes: af75078fece3 ("first public release") > > Cc: sta...@dpdk.org > Cc: Thomas Monjalon <thomas.monja...@6wind.com> > Cc: Bruce Richardson <bruce.richard...@intel.com> > Cc: Ferruh Yigit <ferruh.yi...@intel.com> > Signed-off-by: Yuanhan Liu <yuanhan....@linux.intel.com> > --- > > v4: - assign eth_dev in the common eth_dev init help function > it also renamed to eth_dev_get, to not confuse with the > eth_dev_init callback. > - move primoary process specific assignments to rte_eth_dev_allocate > - drop the virtio example in comment > - combine two code block for primary into one > > v3: - do not move rte_eth_dev_data_alloc to pci_probe > - rename eth_dev_attach to eth_dev_attach_secondary > - introduce eth_dev_init() for common eth_dev struct initiation > - move comment block inside the "if" block > --- > lib/librte_ether/rte_ethdev.c | 71 > +++++++++++++++++++++++++++++++++++++------ > 1 file changed, 62 insertions(+), 9 deletions(-) > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > index fde8112..1453df1 100644 > --- a/lib/librte_ether/rte_ethdev.c > +++ b/lib/librte_ether/rte_ethdev.c > @@ -189,6 +189,19 @@ struct rte_eth_dev * > return RTE_MAX_ETHPORTS; > } > > +static struct rte_eth_dev * > +eth_dev_get(uint8_t port_id) > +{ > + struct rte_eth_dev *eth_dev = &rte_eth_devices[port_id]; > + > + eth_dev->data = &rte_eth_dev_data[port_id]; > + eth_dev->attached = DEV_ATTACHED; > + eth_dev_last_created_port = port_id; > + nb_ports++; > + > + return eth_dev; > +} > + > struct rte_eth_dev * > rte_eth_dev_allocate(const char *name) > { > @@ -210,13 +223,41 @@ struct rte_eth_dev * > return NULL; > } > > - eth_dev = &rte_eth_devices[port_id]; > - eth_dev->data = &rte_eth_dev_data[port_id]; > + eth_dev = eth_dev_get(port_id);
There can be a merge issue here, please help me understand. In repo, different from seen here, this patch does this here: - eth_dev = &rte_eth_devices[port_id]; - eth_dev->data = &rte_eth_dev_data[port_id]; - memset(eth_dev->data, 0, sizeof(*eth_dev->data)); + memset(&rte_eth_devices[port_id], 0, sizeof(*eth_dev->data)); + eth_dev = eth_dev_get(port_id); Which no more resets the eth_dev->data, but rte_eth_devices[port_id] (with sizeof(*eth_dev->data)) memset(eth_dev->data) added by Jan Blunck on comment: 7f95f78a8aea ("ethdev: clear data when allocating device") most probably it should stay as "memset(eth_dev->data)", but if not, please aware that commit 7f95f78a8aea removed some assignment from drivers relying this memset, they needs to be added back. > snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name); > eth_dev->data->port_id = port_id; > - eth_dev->attached = DEV_ATTACHED; > - eth_dev_last_created_port = port_id; > - nb_ports++; > + > + return eth_dev; > +} > + > +/* > + * Attach to a port already registered by the primary process, which > + * makes sure that the same device would have the same port id both > + * in the primary and secondary process. > + */ > +static struct rte_eth_dev * > +eth_dev_attach_secondary(const char *name) > +{ > + uint8_t i; > + struct rte_eth_dev *eth_dev; > + > + if (rte_eth_dev_data == NULL) > + rte_eth_dev_data_alloc(); > + > + for (i = 0; i < RTE_MAX_ETHPORTS; i++) { > + if (strcmp(rte_eth_dev_data[i].name, name) == 0) > + break; > + } > + if (i == RTE_MAX_ETHPORTS) { > + RTE_PMD_DEBUG_TRACE( > + "device %s is not driven by the primary process\n", > + name); > + return NULL; > + } > + > + eth_dev = eth_dev_get(i); > + RTE_ASSERT(eth_dev->data->port_id == i); > + > return eth_dev; > } > > @@ -246,16 +287,28 @@ struct rte_eth_dev * > rte_eal_pci_device_name(&pci_dev->addr, ethdev_name, > sizeof(ethdev_name)); > > - eth_dev = rte_eth_dev_allocate(ethdev_name); > - if (eth_dev == NULL) > - return -ENOMEM; > - > if (rte_eal_process_type() == RTE_PROC_PRIMARY) { > + eth_dev = rte_eth_dev_allocate(ethdev_name); > + if (eth_dev == NULL) > + return -ENOMEM; > + > eth_dev->data->dev_private = rte_zmalloc("ethdev private > structure", > eth_drv->dev_private_size, > RTE_CACHE_LINE_SIZE); > if (eth_dev->data->dev_private == NULL) > rte_panic("Cannot allocate memzone for private port > data\n"); > + } else { > + eth_dev = eth_dev_attach_secondary(ethdev_name); > + if (eth_dev == NULL) { > + /* > + * if we failed to attach a device, it means the > + * device is skipped in primary process, due to > + * some errors. If so, we return a positive value, > + * to let EAL skip it for the secondary process > + * as well. > + */ > + return 1; > + } > } > eth_dev->pci_dev = pci_dev; > eth_dev->driver = eth_drv; >