> From: Bruce Richardson [mailto:bruce.richard...@intel.com] > Sent: Tuesday, 17 January 2023 14.04 > > On Tue, Jan 17, 2023 at 12:32:14PM +0100, Morten Brørup wrote: > > > From: Didier Pallard [mailto:didier.pall...@6wind.com] > > > Sent: Tuesday, 17 January 2023 11.17 > > > > > > Since DPDK 22.11 and below commit: > > > > https://git.dpdk.org/dpdk/commit/?id=7dcd73e37965ba0bfa430efeac362fe183 > > > ed0ae2 > > > rte_cryptodev_socket_id() could return an incorrect value of 255. > > > Problem has been seen during configuration of the qat device > > > on an Atom C3000 architecture. On this arch, PCI is not depending > on > > > any numa socket, causing device numa_node to be equal to > SOCKET_ID_ANY. > > > > Disclaimer: I'm not up to speed with this topic or patch, so feel > free to ignore my comments here! I'm only speaking up because I fear we > are increasing the risk of bugs here. But again, please bear with me, > if I have totally misunderstood this! > > > > I was under the impression that single-socket systems used socket_id > 0 as default. How can the PCI bus (or QAT device) not depend on any > socket? It must be connected somewhere. > > > > Doesn't assigning socket_id = -1 for devices (QAT or anything else) > introduce a big risk of bugs, e.g. in comparisons? The special > socket_id value -1 should have only two meanings: 1) return value > "error", or 2) input value "any". Now it also can mean 3) "unknown"? > How do comparison functions work for that... is "any" == "unknown"? And > does searching for "0" match "unknown"? It might, or might not, but > searching for "any" does match "0". And how about searching for > "unknown", if such a value is propagate around in the system. > > > > And if we started considering socket_id == -1 valid with that patch, > should the return type of rte_socket_id(void) be signed instead of > unsigned? > > > The issue here is that not all PCI endpoints connect directly to a > socket, > some connect to the chipset instead, and so do not have any numa > affinity. > That was the original meaning of the "-1" value, and it came about from > an > era before we had on-die PCI endpoints.
Thank you for elaborating, Bruce. Now I get it! Then it does make sense instantiating devices with socket_id = -1. A minor detail: SOCKET_ID_ANY is defined in lib/eal/include/rte_memory.h [1]. Since it is being used for other purposes than memory allocation, it could move to a more central location. No good ideas from me, but perhaps memory and devices have an appropriate header file in common. [1]: https://elixir.bootlin.com/dpdk/latest/source/lib/eal/include/rte_memory.h#L38 And the multiple purposes of SOCKET_ID_ANY (e.g. what you just described) could be mentioned as a comment with its definition. And I'm still worried about the risk of comparison bugs, e.g. when requesting allocation of a device resource, you cannot specify a preference for a device that is connected to the chipset, because the SOCKET_ID_ANY in the allocation request would be interpreted as "any socket" instead of "no socket". Although that might just be me worrying too much. ;-) -Morten