From: Herbert Xu <herb...@gondor.apana.org.au> Sent: Friday, January 10, 2025 1:28 AM
> 
> On Thu, Jan 09, 2025 at 02:15:17AM -0800, Breno Leitao wrote:
> >
> > I would suggest we revert this patch until we investigate further. I'll
> > prepare and send a revert patch shortly.
> 
> Sorry, I think it was my addition that broke things.  The condition
> for checking whether an entry is inserted is incorrect, thus resulting
> in an underflow of the number of entries after entry removal.
> 
> Please test this patch:
> 
> ---8<---
> The function rhashtable_insert_one returns NULL iff the
> insertion was successful, so that alone should be tested before
> incrementing nelems.  Testing the variable data is redundant, and
> buggy because we will have overwritten the original value of data
> by this point.
> 
> Reported-by: Michael Kelley <mhkli...@outlook.com>
> Fixes: e1d3422c95f0 ("rhashtable: Fix potential deadlock by moving schedule_work outside lock")
> Signed-off-by: Herbert Xu <herb...@gondor.apana.org.au>
> 
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index bf956b85455a..e196b6f0e35a 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -621,7 +621,7 @@ static void *rhashtable_try_insert(struct rhashtable *ht, const void *key,
> 
>                       rht_unlock(tbl, bkt, flags);
> 
> -                     if (PTR_ERR(data) == -ENOENT && !new_tbl) {
> +                     if (!new_tbl) {
>                               atomic_inc(&ht->nelems);
>                               if (rht_grow_above_75(ht, tbl))
>                                       schedule_work(&ht->run_work);
> --

This patch fixes the problem I saw with VMs in the Azure cloud.  Thanks!
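
For readers following the thread, the failure mode described in the commit
message can be sketched as a small user-space model. This is not the kernel
code: the model_* names, the struct, and the stand-in ERR_PTR/PTR_ERR helpers
are hypothetical, and the step that overwrites data is an assumption about the
surrounding code, which the diff does not show. It only illustrates why a check
on data can miss a successful insert, leaving the element count to drop below
zero on removal.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's error-pointer helpers. */
static void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static long PTR_ERR(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical model of a table that only tracks its element count. */
struct model_table { int nelems; };

/* Stand-in for rhashtable_insert_one: NULL iff the insertion succeeded. */
static void *model_insert_one(int key_exists)
{
	return key_exists ? ERR_PTR(-EEXIST) : NULL;
}

/* fixed == 0 uses the old condition, fixed != 0 uses the patched one. */
static void model_try_insert(struct model_table *t, int key_exists, int fixed)
{
	static int existing_obj;
	/* Lookup result: -ENOENT means the key is not present yet. */
	void *data = key_exists ? (void *)&existing_obj : ERR_PTR(-ENOENT);
	void *new_tbl = model_insert_one(key_exists);

	/* Assumption: data is overwritten with the insert result here,
	 * so the original -ENOENT is lost before the check below. */
	if (PTR_ERR(new_tbl) != -EEXIST)
		data = new_tbl;

	if (fixed ? !new_tbl : (PTR_ERR(data) == -ENOENT && !new_tbl))
		t->nelems++;
}

/* Removal always decrements, as in the reported underflow. */
static void model_remove(struct model_table *t)
{
	t->nelems--;
}

/* Insert one new key, remove it, and return the resulting count. */
static int model_insert_then_remove(int fixed)
{
	struct model_table t = { 0 };

	model_try_insert(&t, 0, fixed);
	model_remove(&t);
	return t.nelems;
}
```

In this model, model_insert_then_remove(0) leaves the count at -1 because the
successful insert was never counted, while model_insert_then_remove(1) returns
it to 0.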

Michael Kelley
