> From: Didier Pallard [mailto:didier.pall...@6wind.com] > Sent: Tuesday, 17 January 2023 11.17 > > Since DPDK 22.11 and below commit: > https://git.dpdk.org/dpdk/commit/?id=7dcd73e37965ba0bfa430efeac362fe183 > ed0ae2 > rte_cryptodev_socket_id() could return an incorrect value of 255. > Problem has been seen during configuration of the qat device > on an Atom C3000 architecture. On this arch, PCI is not depending on > any numa socket, causing device numa_node to be equal to SOCKET_ID_ANY.
Disclaimer: I'm not up to speed with this topic or patch, so feel free to ignore my comments here! I'm only speaking up because I fear we are increasing the risk of bugs here. But again, please bear with me, if I have totally misunderstood this! I was under the impression that single-socket systems used socket_id 0 as default. How can the PCI bus (or QAT device) not depend on any socket? It must be connected somewhere. Doesn't assigning socket_id = -1 for devices (QAT or anything else) introduce a big risk of bugs, e.g. in comparisons? The special socket_id value -1 should have only two meanings: 1) return value "error", or 2) input value "any". Now it also can mean 3) "unknown"? How do comparison functions work for that... is "any" == "unknown"? And does searching for "0" match "unknown"? It might, or might not, but searching for "any" does match "0". And how about searching for "unknown", if such a value is propagate around in the system. And if we started considering socket_id == -1 valid with that patch, should the return type of rte_socket_id(void) be signed instead of unsigned? > Due to incorrect cast to uint8_t, this value is stored as 255 > in cryptodev structure and returned as such by > rte_cryptodev_socket_id() > function. > > Below patch proposes one way to fix the issue: casting to a signed > int8_t > instead of the uint8_t. (it could also be casted to an int, that is the > usual type for numa_node, but this may break the ABI). This makes the > SOCKET_ID_ANY being propagated up to the user. > Another solution could be to always store a valid numa_node in this > field > instead of just copying the numa_node field of the device, but this > requires to fix most crypto PMDs, that are currently just copying the > device value. > > What is the preferred solution? I prefer that garbage data is not propagated, even if it requires fixing many places. But as I indicated above, I wonder if part of the root cause stems from considering socket_id == -1 valid data in some structures. (It could be allowed temporarily, e.g. in a template to indicate that the field is uninitialized. But it should not propagate outside the templates when instantiating objects based on the templates.) > > --- > cryptodev: fix numa_node type > > Since below commit, numa_node can be set to SOCKET_ID_ANY. > Do not cast numa_node to an unsigned uint8, else SOCKET_ID_ANY > is converted to 255, causing rte_cryptodev_socket_id to return > an incorrect value. > > Fixes: 7dcd73e37965 ("drivers/bus: set device NUMA node to unknown by > default") > Signed-off-by: Didier Pallard <didier.pall...@6wind.com> > --- > lib/cryptodev/cryptodev_pmd.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/lib/cryptodev/cryptodev_pmd.h > b/lib/cryptodev/cryptodev_pmd.h > index 0020102eb7db..db4745d620f4 100644 > --- a/lib/cryptodev/cryptodev_pmd.h > +++ b/lib/cryptodev/cryptodev_pmd.h > @@ -64,8 +64,8 @@ struct rte_cryptodev_pmd_init_params { > struct rte_cryptodev_data { > /** Device ID for this instance */ > uint8_t dev_id; > - /** Socket ID where memory is allocated */ > - uint8_t socket_id; > + /** Socket ID of the device */ > + int8_t socket_id; > /** Unique identifier name */ > char name[RTE_CRYPTODEV_NAME_MAX_LEN]; > > -- > 2.30.2 >