On 9/30/2016 10:27 AM, Mateusz Guzik wrote:
> Author: mjg
> Date: Fri Sep 30 17:27:17 2016
> New Revision: 306512
> URL: https://svnweb.freebsd.org/changeset/base/306512
> 
> Log:
>   vfs: batch free vnodes in per-mnt lists
>   
>   Previously free vnodes would always be directly returned to the global
>   LRU list. With this change up to mnt_free_list_batch vnodes are collected
>   first.
>   
>   syncer runs always return the batch regardless of its size.
>   
>   While vnodes on per-mnt lists are not counted as free, they can be
>   returned in case of vnode shortage.
>   
>   Reviewed by:        kib
>   Tested by:  pho
> 
> Modified:
>   head/sys/kern/vfs_mount.c
>   head/sys/kern/vfs_subr.c
>   head/sys/sys/mount.h
>   head/sys/sys/vnode.h
> 
...
> @@ -2753,17 +2824,25 @@ _vhold(struct vnode *vp, bool locked)
>        * Remove a vnode from the free list, mark it as in use,
>        * and put it on the active list.
>        */
> -     mtx_lock(&vnode_free_list_mtx);
> -     TAILQ_REMOVE(&vnode_free_list, vp, v_actfreelist);
> -     freevnodes--;
> -     vp->v_iflag &= ~VI_FREE;
> +     mp = vp->v_mount;
> +     mtx_lock(&mp->mnt_listmtx);
                  ^^

> +     if ((vp->v_mflag & VMP_TMPMNTFREELIST) != 0) {
> +             TAILQ_REMOVE(&mp->mnt_tmpfreevnodelist, vp, v_actfreelist);
> +             mp->mnt_tmpfreevnodelistsize--;
> +             vp->v_mflag &= ~VMP_TMPMNTFREELIST;
> +     } else {
> +             mtx_lock(&vnode_free_list_mtx);
> +             TAILQ_REMOVE(&vnode_free_list, vp, v_actfreelist);
> +             freevnodes--;
> +             mtx_unlock(&vnode_free_list_mtx);
> +     }
>       KASSERT((vp->v_iflag & VI_ACTIVE) == 0,
>           ("Activating already active vnode"));
> +     vp->v_iflag &= ~VI_FREE;
>       vp->v_iflag |= VI_ACTIVE;
> -     mp = vp->v_mount;
>       TAILQ_INSERT_HEAD(&mp->mnt_activevnodelist, vp, v_actfreelist);
>       mp->mnt_activevnodelistsize++;
> -     mtx_unlock(&vnode_free_list_mtx);
> +     mtx_unlock(&mp->mnt_listmtx);
>       refcount_acquire(&vp->v_holdcnt);
>       if (!locked)
>               VI_UNLOCK(vp);
> @@ -2819,21 +2898,25 @@ _vdrop(struct vnode *vp, bool locked)
>               if ((vp->v_iflag & VI_OWEINACT) == 0) {
>                       vp->v_iflag &= ~VI_ACTIVE;
>                       mp = vp->v_mount;
> -                     mtx_lock(&vnode_free_list_mtx);
> +                     mtx_lock(&mp->mnt_listmtx);
                                  ^^

If code runs getnewvnode() and then immediately runs vhold() or vrele(),
without first running insmntque(vp, mp), then vp->v_mount is NULL here
and the lock/freelist dereference will just panic.
Locking the vnode and then using vgone() and vput() on it avoids the
issue, since that marks the vnode VI_DOOMED and _vdrop() then frees it
rather than trying to re-add it to its NULL per-mount free list.

I'm not sure what the right thing here is.  Is it a requirement to
insmntque() a new vnode immediately, or to vgone() it before releasing
it if it was just returned from getnewvnode()?

Perhaps these cases should assert that v_mount is not NULL, or handle
the odd case by just freeing the vnode instead.  Or can the vnode still
simply be added to the global vnode_free_list when mp is NULL?

The old code handled the case fine since the freelist was global and not
per-mount.  'mp' was only dereferenced below if the vnode had been
active, which was not the case for my example.


>                       if (active) {
>                               TAILQ_REMOVE(&mp->mnt_activevnodelist, vp,
>                                   v_actfreelist);
>                               mp->mnt_activevnodelistsize--;
>                       }
> -                     TAILQ_INSERT_TAIL(&vnode_free_list, vp,
> +                     TAILQ_INSERT_TAIL(&mp->mnt_tmpfreevnodelist, vp,
>                           v_actfreelist);
> -                     freevnodes++;
> +                     mp->mnt_tmpfreevnodelistsize++;
>                       vp->v_iflag |= VI_FREE;
> -                     mtx_unlock(&vnode_free_list_mtx);
> +                     vp->v_mflag |= VMP_TMPMNTFREELIST;
> +                     VI_UNLOCK(vp);
> +                     if (mp->mnt_tmpfreevnodelistsize >= mnt_free_list_batch)
> +                             vnlru_return_batch_locked(mp);
> +                     mtx_unlock(&mp->mnt_listmtx);
>               } else {
> +                     VI_UNLOCK(vp);
>                       atomic_add_long(&free_owe_inact, 1);
>               }
> -             VI_UNLOCK(vp);
>               return;
>       }
>       /*


-- 
Regards,
Bryan Drewery
