When allocating private data for an ethdev, the rte_zmalloc_socket call
used only allocates memory on the NUMA node/socket local to the device.
This means that even if the user wanted to, they could never use a
remote NIC without also having memory on that NIC's socket.

For example, if we change examples/skeleton/basicfwd.c to have
SOCKET_ID_ANY as the socket_id parameter for Rx and Tx rings, we should
be able to run the app cross-NUMA, e.g. as below, where the two PCI
devices are on socket 1, and core 1 is on socket 0:

 ./build/examples/dpdk-skeleton -l 1 --legacy-mem --socket-mem=1024,0 \
   -a a8:00.0 -a b8:00.0

This fails however, with the error:

 ETHDEV: failed to allocate private data
 PCI_BUS: Requested device 0000:a8:00.0 cannot be used

We can remove this restriction by doing a fallback call to general
rte_zmalloc after a call to rte_zmalloc_socket fails. This should be
safe to do because the later ethdev calls to setup Rx/Tx queues all take
a socket_id parameter, which can be used by applications to enforce the
requirement for local-only memory for a device, if so desired. [If
device-local memory is present it will be used as before, while if not
present the rte_eth_dev_configure call will now pass, but the subsequent
queue setup calls requesting local memory will fail].

Fixes: e489007a411c ("ethdev: add generic create/destroy ethdev APIs")
Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")
Cc: sta...@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richard...@intel.com>
Signed-off-by: Padraig Connolly <padraig.j.conno...@intel.com>
---
V2:
* Add warning printout in the case where we don't get device-local
  memory, but we do get memory on another socket.
---
 lib/ethdev/ethdev_driver.c | 20 +++++++++++++++-----
 lib/ethdev/ethdev_pci.h    | 20 +++++++++++++++++---
 2 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index f48c0eb8bc..c335a25a82 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -303,15 +303,25 @@ rte_eth_dev_create(struct rte_device *device, const char *name,
 			return -ENODEV;
 
 		if (priv_data_size) {
+			/* try alloc private data on device-local node. */
 			ethdev->data->dev_private = rte_zmalloc_socket(
 				name, priv_data_size, RTE_CACHE_LINE_SIZE,
 				device->numa_node);
 
-			if (!ethdev->data->dev_private) {
-				RTE_ETHDEV_LOG_LINE(ERR,
-					"failed to allocate private data");
-				retval = -ENOMEM;
-				goto probe_failed;
+			/* fall back to alloc on any socket on failure */
+			if (ethdev->data->dev_private == NULL) {
+				ethdev->data->dev_private = rte_zmalloc(name,
+						priv_data_size, RTE_CACHE_LINE_SIZE);
+
+				if (ethdev->data->dev_private == NULL) {
+					RTE_ETHDEV_LOG_LINE(ERR, "failed to allocate private data");
+					retval = -ENOMEM;
+					goto probe_failed;
+				}
+				/* got memory, but not local, so issue warning */
+				RTE_ETHDEV_LOG_LINE(WARNING,
+						"Private data for ethdev '%s' not allocated on local NUMA node %d",
+						device->name, device->numa_node);
 			}
 		}
 	} else {
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 737fff1833..ec4f731270 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -93,12 +93,26 @@ rte_eth_dev_pci_allocate(struct rte_pci_device *dev, size_t private_data_size)
 			return NULL;
 
 		if (private_data_size) {
+			/* Try and alloc the private-data structure on socket local to the device */
 			eth_dev->data->dev_private = rte_zmalloc_socket(name,
 				private_data_size, RTE_CACHE_LINE_SIZE,
 				dev->device.numa_node);
-			if (!eth_dev->data->dev_private) {
-				rte_eth_dev_release_port(eth_dev);
-				return NULL;
+
+			/* if cannot allocate memory on the socket local to the device
+			 * use rte_malloc to allocate memory on some other socket, if available.
+			 */
+			if (eth_dev->data->dev_private == NULL) {
+				eth_dev->data->dev_private = rte_zmalloc(name,
+						private_data_size, RTE_CACHE_LINE_SIZE);
+
+				if (eth_dev->data->dev_private == NULL) {
+					rte_eth_dev_release_port(eth_dev);
+					return NULL;
+				}
+				/* got memory, but not local, so issue warning */
+				RTE_ETHDEV_LOG_LINE(WARNING,
+						"Private data for ethdev '%s' not allocated on local NUMA node %d",
+						dev->device.name, dev->device.numa_node);
 			}
 		}
 	} else {
-- 
2.43.0
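
For readers who want to try the allocate-local-then-fall-back pattern
outside of DPDK, it can be sketched roughly as below. This is only a
standalone illustration, not the patch code: alloc_on_node(),
alloc_any() and alloc_private_data() are hypothetical stand-ins for
rte_zmalloc_socket() and rte_zmalloc(), and the sketch simply assumes
that only node 0 has memory available.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for rte_zmalloc_socket(): for illustration we
 * assume that only node 0 has memory, so any other node fails.
 */
static void *
alloc_on_node(size_t size, int node)
{
	if (node != 0)
		return NULL;
	return calloc(1, size);
}

/* Hypothetical stand-in for rte_zmalloc(): zeroed memory from any node. */
static void *
alloc_any(size_t size)
{
	return calloc(1, size);
}

/*
 * Try the device-local node first and fall back to any node on failure.
 * *is_local is set to 1 if the local allocation succeeded, 0 otherwise,
 * so the caller can decide whether to print a warning.
 * Returns NULL only if both attempts fail.
 */
static void *
alloc_private_data(size_t size, int dev_node, int *is_local)
{
	void *p;

	assert(is_local != NULL);

	p = alloc_on_node(size, dev_node);
	*is_local = (p != NULL);
	if (p == NULL)
		p = alloc_any(size);
	return p;
}
```

A caller would treat a NULL return as the -ENOMEM case, and would log a
warning when is_local comes back as 0, which corresponds to the
RTE_ETHDEV_LOG_LINE(WARNING, ...) call added in the patch.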