On Fri, 07 Mar 2025 10:47:33 +0100
Christian Schoenebeck <qemu_...@crudebyte.com> wrote:

> On Friday, March 7, 2025 10:23:02 AM CET Christian Schoenebeck wrote:
> > This patch fixes two different bugs in v9fs_reclaim_fd():
> > 
> > 1. Reduce latency:
> > 
> > This function calls v9fs_co_close() and v9fs_co_closedir() in a loop. Each
> > one of the calls adds two thread hops (between main thread and a fs driver
> > background thread). Each thread hop adds latency, which sums up in
> > function's loop to a significant duration.
> > 
> > Reduce overall latency by open coding what v9fs_co_close() and
> > v9fs_co_closedir() do, executing those and the loop itself altogether in
> > only one background thread block, hence reducing the total amount of
> > thread hops to only two.
> > 
> > 2. Fix file descriptor leak:
> > 
> > The existing code called v9fs_co_close() and v9fs_co_closedir() to close
> > file descriptors. Both functions check right at the beginning if the 9p
> > request was cancelled:
> > 
> >     if (v9fs_request_cancelled(pdu)) {
> >         return -EINTR;
> >     }
> > 
> > So if client sent a 'Tflush' message, v9fs_co_close() / v9fs_co_closedir()
> > returned without having closed the file descriptor and v9fs_reclaim_fd()
> > subsequently freed the FID without its file descriptor being closed, hence
> > leaking those file descriptors.
> > 
> > This 2nd bug is fixed by this patch as well by open coding v9fs_co_close()
> > and v9fs_co_closedir() inside of v9fs_reclaim_fd() and not performing the
> > v9fs_request_cancelled(pdu) check there.
> > 
> > Fixes: 7a46274529c ('hw/9pfs: Add file descriptor reclaim support')
> > Fixes: bccacf6c792 ('hw/9pfs: Implement TFLUSH operation')
> > Signed-off-by: Christian Schoenebeck <qemu_...@crudebyte.com>
> > ---
> >  hw/9pfs/9p.c | 29 ++++++++++++++++++++---------
> >  1 file changed, 20 insertions(+), 9 deletions(-)
> > 
> > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > index 4f9c2dde9c..80b190ff5b 100644
> > --- a/hw/9pfs/9p.c
> > +++ b/hw/9pfs/9p.c
> > @@ -434,6 +434,8 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu)
> >      V9fsFidState *f;
> >      GHashTableIter iter;
> >      gpointer fid;
> > +    int err;
> > +    int nclosed = 0;
> >  
> >      /* prevent multiple coroutines running this function simultaniously */
> >      if (s->reclaiming) {
> > @@ -446,10 +448,10 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu)
> >      QSLIST_HEAD(, V9fsFidState) reclaim_list =
> >          QSLIST_HEAD_INITIALIZER(reclaim_list);
> >  
> > +    /* Pick FIDs to be closed, collect them on reclaim_list. */
> >      while (g_hash_table_iter_next(&iter, &fid, (gpointer *) &f)) {
> >          /*
> > -         * Unlink fids cannot be reclaimed. Check
> > -         * for them and skip them. Also skip fids
> > +         * Unlinked fids cannot be reclaimed, skip those, and also skip 
> > fids
> >           * currently being operated on.
> >           */
> >          if (f->ref || f->flags & FID_NON_RECLAIMABLE) {
> > @@ -499,17 +501,26 @@ void coroutine_fn v9fs_reclaim_fd(V9fsPDU *pdu)
> >          }
> >      }
> >      /*
> > -     * Now close the fid in reclaim list. Free them if they
> > -     * are already clunked.
> > +     * Close the picked FIDs altogether on a background I/O driver thread. 
> > Do
> > +     * this all at once to keep latency (i.e. amount of thread hops 
> > between main
> > +     * thread <-> fs driver background thread) as low as possible.
> >       */
> > +    v9fs_co_run_in_worker({
> > +        QSLIST_FOREACH(f, &reclaim_list, reclaim_next) {
> > +            err = (f->fid_type == P9_FID_DIR) ?
> > +                s->ops->closedir(&s->ctx, &f->fs_reclaim) :
> > +                s->ops->close(&s->ctx, &f->fs_reclaim);
> > +            if (!err) {
> > +                /* total_open_fd must only be mutated on main thread */
> > +                nclosed++;
> > +            }
> > +        }
> > +    });
> > +    total_open_fd -= nclosed;
> 
> So here is another thing: looking at 'man 2 close' I would say that
> decrementing 'total_open_fd' conditionally based on what close() returned is
> wrong. The man page suggest that the return value of close() should only be
> used for diagnostic purposes, as an error on close() often indicates just an
> error on a previous write() and hence the return value should only be used for
> catching a data loss related to writes.
> 
> So this should probably changed here, as well as in v9fs_co_close() /
> v9fs_co_closedir(), part of a separate patch though, so I haven't addressed it
> here yet.
> 
> Does this make sense?
> 

The handling of the return value of `close()` seems to be non-trivial
indeed. It makes sense to address this in some other series.

Cheers,

--
Greg

> /Christian
> 
> > +    /* Free the closed FIDs. */
> >      while (!QSLIST_EMPTY(&reclaim_list)) {
> >          f = QSLIST_FIRST(&reclaim_list);
> >          QSLIST_REMOVE(&reclaim_list, f, V9fsFidState, reclaim_next);
> > -        if (f->fid_type == P9_FID_FILE) {
> > -            v9fs_co_close(pdu, &f->fs_reclaim);
> > -        } else if (f->fid_type == P9_FID_DIR) {
> > -            v9fs_co_closedir(pdu, &f->fs_reclaim);
> > -        }
> >          /*
> >           * Now drop the fid reference, free it
> >           * if clunked.
> > 
> 
> 



-- 
Greg

Reply via email to