On 1/6/21 7:43 AM, Matan Azrad wrote:
> When the vDPA device is closed, the driver polling thread is canceled.
> The polling thread locks the configuration mutex while it polls the CQs.
>
> When the cancellation happens, it may terminate the thread inside the
> critical section, which leaves the configuration mutex locked.
>
> After device close, the driver may be configured again. In that case,
> for example when the first queue state is updated, the driver tries to
> lock the mutex again and a deadlock occurs.
>
> Initialize the mutex after the polling thread cancellation.
>
> Fixes: 99abbd62c272 ("vdpa/mlx5: fix queue update synchronization")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Matan Azrad <ma...@nvidia.com>
> Acked-by: Xueming Li <xuemi...@nvidia.com>
> ---
> drivers/vdpa/mlx5/mlx5_vdpa.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
> index b64f364..0b2f1ab 100644
> --- a/drivers/vdpa/mlx5/mlx5_vdpa.c
> +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> @@ -295,6 +295,8 @@
>  	}
>  	priv->configured = 0;
>  	priv->vid = 0;
> +	/* The mutex may stay locked after event thread cancel - initiate it. */
> +	pthread_mutex_init(&priv->vq_config_lock, NULL);
>  	DRV_LOG(INFO, "vDPA device %d was closed.", vid);
>  	return ret;
>  }
>
I wonder if it would be possible and cleaner to disable cancellation on
the thread while the mutex is held?
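
To illustrate what I mean, here is a minimal standalone sketch, not the
driver code. It assumes the default deferred cancellation type, and the
names config_lock, poll_thread and run_cancel_demo are hypothetical
stand-ins (config_lock plays the role of priv->vq_config_lock). The
polling loop disables cancellation around the critical section, so a
pending cancel request is only acted upon once the mutex is released:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-in for priv->vq_config_lock. */
static pthread_mutex_t config_lock = PTHREAD_MUTEX_INITIALIZER;
/* Counts loop iterations so the caller knows the thread is running. */
static atomic_int polls_done;

static void *poll_thread(void *arg)
{
	const struct timespec pause = { 0, 1000000 }; /* 1 ms */
	int old_state, unused;

	(void)arg;
	for (;;) {
		/* A cancel request arriving from here on stays pending. */
		pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
		pthread_mutex_lock(&config_lock);
		atomic_fetch_add(&polls_done, 1); /* placeholder for CQ polling */
		pthread_mutex_unlock(&config_lock);
		/* Restore the previous state; the lock is free again. */
		pthread_setcancelstate(old_state, &unused);
		/* nanosleep() is a cancellation point, reached without the lock. */
		nanosleep(&pause, NULL);
	}
	return NULL;
}

/* Returns 0 if the mutex can still be taken after the thread is canceled. */
static int run_cancel_demo(void)
{
	const struct timespec pause = { 0, 1000000 }; /* 1 ms */
	pthread_t tid;
	int rc;

	rc = pthread_create(&tid, NULL, poll_thread, NULL);
	assert(rc == 0);
	/* Wait until the thread has polled at least once. */
	while (atomic_load(&polls_done) == 0)
		nanosleep(&pause, NULL);
	pthread_cancel(tid);
	pthread_join(tid, NULL);
	/* With cancellation deferred, nobody can be left holding the lock. */
	rc = pthread_mutex_trylock(&config_lock);
	if (rc == 0)
		pthread_mutex_unlock(&config_lock);
	return rc;
}
```

Calling run_cancel_demo() from a small main() and checking for a zero
return value is enough to try it out. With this approach the close path
would not need to reinitialize the mutex, at the cost of two extra calls
per polling iteration.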

Regards,
Maxime