Hi,

> -----Original Message-----
> From: Slava Ovsiienko <viachesl...@nvidia.com>
> Sent: Tuesday, May 30, 2023 6:13 PM
> To: dev@dpdk.org
> Cc: Ori Kam <or...@nvidia.com>; Raslan Darawsheh <rasl...@nvidia.com>;
> Matan Azrad <ma...@nvidia.com>; sta...@dpdk.org
> Subject: [PATCH 1/1] net/mlx5: fix device removal event handling
>
> On the device removal kernel notifies user space application with queueing the
> IBV_DEVICE_FATAL_EVENT and triggering appropriate file descriptor. Mellanox
> kernel driver stack emits this event twice from different layers (mlx5 and
> uverbs). The IB port index is not applicable in the event structure and should
> be ignored for IBV_DEVICE_FATAL_EVENT events.
>
> Also, on the older kernels (at least from OFED 4.9) there might be race
> conditions causing the event queue close before application fetches the
> IBV_DEVICE_FATAL_EVENT message with ibv_get_async_event() API.
>
> To provide the reliable device removal event detection the patch:
>
> - ignores the IB port index for the IBV_DEVICE_FATAL_EVENT
> - introduces the flag to notify PMD about removal only once
> - acks event with ibv_ack_async_event after actual handling
> - checks for EIO error, making sure queue is not closed yet
>
> Fixes: 40d9f906f4e2 ("net/mlx5: fix device removal handler for multiport")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Viacheslav Ovsiienko <viachesl...@nvidia.com>
> ---
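
For readers following the thread, the four points listed in the patch description can be sketched roughly as below. This is not the mlx5 PMD code and it does not call libibverbs: the event types, the get/ack callbacks and the fake queue are all hypothetical stand-ins for ibv_get_async_event() and ibv_ack_async_event(), so the logic can be compiled on its own. Treating EIO from the fetch call as "device already removed" is my reading of the last bullet and should be taken as an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the verbs async event types (not the real API). */
enum ev_type { EV_PORT_ACTIVE, EV_DEVICE_FATAL };

struct async_event {
    enum ev_type type;
    int port;                 /* not meaningful for EV_DEVICE_FATAL */
};

/* Hypothetical event source: 0 on success, -1 with errno set on failure. */
typedef int (*get_event_fn)(void *ctx, struct async_event *ev);
typedef void (*ack_event_fn)(void *ctx, struct async_event *ev);

struct dev_state {
    bool removal_notified;    /* the "notify only once" flag */
    int removal_callbacks;    /* how many times removal was reported */
    int port_events;          /* ordinary per-port events handled */
};

/* Report removal to the upper layer at most once. */
static void notify_removal_once(struct dev_state *st)
{
    if (st->removal_notified)
        return;
    st->removal_notified = true;
    st->removal_callbacks++;
}

static void drain_async_events(void *ctx, get_event_fn get, ack_event_fn ack,
                               struct dev_state *st)
{
    struct async_event ev;

    for (;;) {
        if (get(ctx, &ev) != 0) {
            /* Assumption: EIO means the queue was closed under us. */
            if (errno == EIO)
                notify_removal_once(st);
            break;
        }
        if (ev.type == EV_DEVICE_FATAL)
            notify_removal_once(st);   /* ev.port deliberately ignored */
        else
            st->port_events++;         /* real code would dispatch on ev.port */
        ack(ctx, &ev);                 /* ack only after handling */
    }
}

/* Fake queue used only to exercise the loop above. */
struct fake_queue {
    const struct async_event *events;
    size_t count, next, acked;
};

static int fake_get(void *ctx, struct async_event *ev)
{
    struct fake_queue *q = ctx;

    if (q->next == q->count) {
        errno = EIO;                   /* pretend the queue is already closed */
        return -1;
    }
    *ev = q->events[q->next++];
    return 0;
}

static void fake_ack(void *ctx, struct async_event *ev)
{
    struct fake_queue *q = ctx;

    (void)ev;
    assert(q->acked < q->next);        /* an event is acked only after a fetch */
    q->acked++;
}

static struct dev_state run_demo(void)
{
    static const struct async_event evs[] = {
        { EV_PORT_ACTIVE, 1 },
        { EV_DEVICE_FATAL, 0 },        /* emitted by one layer */
        { EV_DEVICE_FATAL, 7 },        /* duplicate from another layer */
    };
    struct fake_queue q = { evs, 3, 0, 0 };
    struct dev_state st = { false, 0, 0 };

    drain_async_events(&q, fake_get, fake_ack, &st);
    return st;
}
```

In run_demo() the two fatal events and the trailing EIO all collapse into a single removal notification, while the port event is still counted separately.
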

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh