Hi Chenbo & David,

On 2/5/25 8:27 AM, Chenbo Xia wrote:
Hi David,

On Feb 4, 2025, at 21:18, David Marchand <david.march...@redhat.com> wrote:

Hello vhost maintainers,

On Tue, Dec 24, 2024 at 4:50 PM Maxime Coquelin
<maxime.coque...@redhat.com> wrote:

The vhost FD manager provides a way for the read/write
callbacks to request removal of their associated FD from
the epoll FD set. The problem is that it lacks a cleanup
callback, so a read/write callback requesting the removal
has to perform its cleanup before the FD is removed from the
FD set. That includes closing the FD before it is removed
from the epoll FD set.

This series introduces a new cleanup callback which, if
implemented, is called right after the FD is removed from
the FD set.
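
For readers following the thread, the ordering described above can be sketched as a small standalone program. This is only an illustration under stated assumptions: the real callback signatures of the series are not shown in this thread, so every name here (demo_fdentry, demo_cleanup_cb, demo_handle_event, and so on) is hypothetical, and the "FD set" is reduced to a single in_set flag instead of a real epoll set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types, not the series' actual signatures. */
typedef void (*demo_event_cb)(int fd, void *dat, int *remove_fd);
typedef void (*demo_cleanup_cb)(int fd, void *dat);

struct demo_fdentry {
    int fd;
    int in_set;              /* 1 while the fd is in the (simulated) FD set */
    void *dat;
    demo_event_cb rcb;
    demo_cleanup_cb cleanup; /* optional, may be NULL */
};

struct demo_ctx {
    struct demo_fdentry *entry;
    int cleaned_after_removal;
};

/* Event callback: only requests removal, performs no cleanup itself. */
static void
demo_rcb(int fd, void *dat, int *remove_fd)
{
    (void)fd;
    (void)dat;
    *remove_fd = 1;
}

/* Cleanup callback: records whether the fd had already left the set. */
static void
demo_cleanup(int fd, void *dat)
{
    struct demo_ctx *ctx = dat;

    (void)fd;
    ctx->cleaned_after_removal = (ctx->entry->in_set == 0);
}

/* Run the event callback, then remove the entry, then call cleanup. */
static void
demo_handle_event(struct demo_fdentry *e)
{
    int remove_fd = 0;

    assert(e != NULL && e->rcb != NULL);
    e->rcb(e->fd, e->dat, &remove_fd);
    if (!remove_fd)
        return;

    e->in_set = 0; /* stands in for removing the fd from the epoll set */
    if (e->cleanup != NULL)
        e->cleanup(e->fd, e->dat);
}

/* Returns 1 if cleanup ran only after the entry left the set. */
static int
demo_cleanup_order(void)
{
    struct demo_fdentry e = {
        .fd = 3, .in_set = 1, .rcb = demo_rcb, .cleanup = demo_cleanup,
    };
    struct demo_ctx ctx = { .entry = &e, .cleaned_after_removal = 0 };

    e.dat = &ctx;
    demo_handle_event(&e);
    return ctx.cleaned_after_removal;
}
```

Calling demo_cleanup_order() exercises one event and reports whether the cleanup step saw the entry already removed from the set.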

Maxime Coquelin (3):
  vhost: add cleanup callback to FD entries
  vhost: fix vhost-user socket cleanup order
  vhost: improve VDUSE reconnect handler cleanup

lib/vhost/fd_man.c | 16 ++++++++++++----
lib/vhost/fd_man.h |  3 ++-
lib/vhost/socket.c | 46 ++++++++++++++++++++++++++--------------------
lib/vhost/vduse.c  | 16 +++++++++++-----
4 files changed, 51 insertions(+), 30 deletions(-)

I tried this series, and it fixes the error log I reported.

On the other hand, I wonder if we could do something simpler.

The fd is only used by the registered handlers.
If a handler reports that it does not want to watch this fd anymore,
then there is no remaining user in the vhost library for this fd.

So my proposal would be to rename the "remove" flag as a "close" flag:

@@ -12,7 +12,7 @@ struct fdset;

#define MAX_FDS 1024

-typedef void (*fd_cb)(int fd, void *dat, int *remove);
+typedef void (*fd_cb)(int fd, void *dat, int *close);

struct fdset *fdset_init(const char *name);

And defer closing to fd_man.
Something like:

@@ -367,9 +367,9 @@ fdset_event_dispatch(void *arg)
                        pthread_mutex_unlock(&pfdset->fd_mutex);

                        if (rcb && events[i].events & (EPOLLIN | EPOLLERR | EPOLLHUP))
-                               rcb(fd, dat, &remove1);
+                               rcb(fd, dat, &close1);
                        if (wcb && events[i].events & (EPOLLOUT | EPOLLERR | EPOLLHUP))
-                               wcb(fd, dat, &remove2);
+                               wcb(fd, dat, &close2);
                        pfdentry->busy = 0;
                        /*
                         * fdset_del needs to check busy flag.
@@ -381,8 +381,10 @@ fdset_event_dispatch(void *arg)
                         * fdentry not to be busy, so we can't call
                         * fdset_del_locked().
                         */
-                       if (remove1 || remove2)
+                       if (close1 || close2) {
                                fdset_del(pfdset, fd);
+                               close(fd);
+                       }
                }

                if (pfdset->destroy)
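
As a hedged illustration of the ordering in the hunk above (callback requests the close, dispatcher removes the fd from the epoll set, and only then closes it), here is a minimal standalone sketch. It is not the fd_man code: demo_fd_cb, demo_read_cb, demo_dispatch_once and demo_run are hypothetical names, it watches a single pipe, and it omits the mutex and busy-flag handling of the real dispatcher. The flag is named close_fd rather than close because, in C, a parameter named close hides the close() function inside that callback.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical callback type mirroring the proposed fd_cb shape. */
typedef void (*demo_fd_cb)(int fd, void *dat, int *close_fd);

/* Read handler: on EOF or error, only request the close; do not close here. */
static void
demo_read_cb(int fd, void *dat, int *close_fd)
{
    char buf[64];
    ssize_t n = read(fd, buf, sizeof(buf));

    (void)dat;
    if (n <= 0)
        *close_fd = 1;
}

/*
 * Wait for one event and run the callback. If the callback asks for it,
 * remove the fd from the epoll set first, then close it.
 * Returns 1 if the fd was closed, 0 if it is still watched, -1 on error.
 */
static int
demo_dispatch_once(int epfd, demo_fd_cb cb, void *dat)
{
    struct epoll_event ev;
    int close_fd = 0;
    int fd;

    assert(cb != NULL);
    if (epoll_wait(epfd, &ev, 1, 1000) != 1)
        return -1;

    fd = ev.data.fd;
    if (ev.events & (EPOLLIN | EPOLLERR | EPOLLHUP))
        cb(fd, dat, &close_fd);
    if (!close_fd)
        return 0;

    if (epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL) < 0)
        return -1;
    close(fd);
    return 1;
}

/* Watch the read end of a pipe with no writer and dispatch one event. */
static int
demo_run(void)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int pipefd[2];
    int epfd, ret;

    if (pipe(pipefd) < 0)
        return -1;
    close(pipefd[1]); /* no writer left: the read end reports hang-up/EOF */

    epfd = epoll_create1(0);
    if (epfd < 0) {
        close(pipefd[0]);
        return -1;
    }

    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) < 0)
        ret = -1;
    else
        ret = demo_dispatch_once(epfd, demo_read_cb, NULL);

    if (ret == 1) {
        /* The dispatcher closed the fd, so it must now be invalid. */
        if (fcntl(pipefd[0], F_GETFD) != -1 || errno != EBADF)
            ret = -1;
    } else {
        close(pipefd[0]);
    }
    close(epfd);
    return ret;
}
```

demo_run() returns 1 when the handler requested the close and the dispatcher performed the removal and the close in that order.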


And the only thing to move out of the socket and vduse handlers is the
close(fd) call.

Like:

@@ -303,7 +303,7 @@ vhost_user_server_new_connection(int fd, void *dat, int *remove __rte_unused)
}

static void
-vhost_user_read_cb(int connfd, void *dat, int *remove)
+vhost_user_read_cb(int connfd, void *dat, int *close)
{
        struct vhost_user_connection *conn = dat;
        struct vhost_user_socket *vsocket = conn->vsocket;
@@ -313,8 +313,7 @@ vhost_user_read_cb(int connfd, void *dat, int *remove)
        if (ret < 0) {
                struct virtio_net *dev = get_device(conn->vid);

-               close(connfd);
-               *remove = 1;
+               *close = 1;

I have one concern: compared with this RFC, the proposal changes the timing
of closing connfd, which means that on the QEMU side, resource cleanup will
happen later.

Currently I can't think of any issue this change could introduce (maybe you
and Maxime could remind me of something :)

That's a good point.
I just tested David's suggestion with Vhost-user with OVS and QEMU:
- guest shutdown + reconnect
- live-migration
- OVS restart

It seems to behave very well.

Besides this, this proposal is definitely cleaner.

I agree, I will send a new revision re-using David's proposal.

Thanks,
Maxime


Thanks,
Chenbo


                if (dev)
                        vhost_destroy_device_notify(dev);


Maxime, Chenbo, opinions?


--
David Marchand


