On Wed, 2021-10-13 at 12:06 +0200, Maxime Coquelin wrote:
>
> On 9/23/21 10:17, Xueming Li wrote:
> > VAR is the device memory space for the virtio queues doorbells, qemu
> > could mmap it to directly to speed up doorbell push.
> >
> > On a busy system, Qemu takes time to release VAR resources during driver
> > shutdown. If vdpa restarted quickly, the VAR allocation failed with
> > error 28 since the VAR is singleton resource per device.
> >
> > This patch adds retry mechanism for VAR allocation.
> >
> > Signed-off-by: Xueming Li <xuemi...@nvidia.com>
> > Reviewed-by: Matan Azrad <ma...@nvidia.com>
> > ---
> >  drivers/vdpa/mlx5/mlx5_vdpa.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
> > index 6d17d7a6f3..991739e984 100644
> > --- a/drivers/vdpa/mlx5/mlx5_vdpa.c
> > +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
> > @@ -693,7 +693,14 @@ mlx5_vdpa_dev_probe(struct rte_device *dev)
> >  	if (attr.num_lag_ports == 0)
> >  		priv->num_lag_ports = 1;
> >  	priv->ctx = ctx;
> > -	priv->var = mlx5_glue->dv_alloc_var(ctx, 0);
> > +	for (retry = 0; retry < 7; retry++) {
> > +		priv->var = mlx5_glue->dv_alloc_var(ctx, 0);
> > +		if (priv->var != NULL)
> > +			break;
> > +		DRV_LOG(WARNING, "Failed to allocate VAR, retry %d.\n", retry);
> > +		/* Wait Qemu release VAR during vdpa restart, 0.1 sec based. */
> > +		usleep(100000U << retry);
> > +	}
> >  	if (!priv->var) {
> >  		DRV_LOG(ERR, "Failed to allocate VAR %u.", errno);
> >  		goto error;
> >
>
> That looks fragile, but at least we have a warning we can rely on.
> Shouldn't we have a way to wait for Qemu to release the resources at
> vdpa driver shutdown time?

If dpdk-vdpa gets killed and restarted, QEMU shuts down the device and
unmaps the resources independently.

>
> Also as on patch 1, please add a Fixes tag if you want it to be
> backported.

Agreed on backporting, but this is not a fix. I'll add Cc: sta...@dpdk.org
so that the maintainer notices the patch. Thanks for the suggestion!

>
> Regards,
> Maxime
>
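
For reference, the wait schedule implied by the retry loop in the quoted
patch can be worked out with a small sketch. The helper and macro names
below are hypothetical and not part of the mlx5 vdpa driver; only the
constants (7 attempts, 100000 us base, doubling per retry) come from the
patch. Because the loop sleeps after every failed attempt, including the
last one, the worst case before reaching the error path is roughly 12.7
seconds.

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from the quoted patch; the names are hypothetical. */
#define VAR_RETRY_MAX     7        /* attempts made by the loop */
#define VAR_RETRY_BASE_US 100000U  /* 0.1 s base delay */

/* Delay slept after failed attempt `retry` (0-based): base << retry. */
static inline uint64_t
var_retry_delay_us(unsigned int retry)
{
	/* Keep the shift well inside the 64-bit range. */
	assert(retry < 32);
	return (uint64_t)VAR_RETRY_BASE_US << retry;
}

/* Worst-case total sleep when all `attempts` allocations fail. */
static inline uint64_t
var_retry_total_us(unsigned int attempts)
{
	uint64_t total = 0;
	unsigned int i;

	for (i = 0; i < attempts; i++)
		total += var_retry_delay_us(i);
	return total;
}
```

Calling var_retry_total_us(VAR_RETRY_MAX) gives the upper bound on how
long the probe can block while waiting for the VAR to be released.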