In this patch series we would like to introduce our approach for putting a virtio-net backend in an external userspace process. Our eventual target is to run the network backend in the Snabbswitch ethernet switch, while receiving traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net implementation.
For this, we are working on extending vhost to allow equivalent
functionality for userspace. Vhost already passes control of the data
plane of virtio-net to the host kernel; we want to realize a similar
model, but for userspace.

In this patch series the concept of a vhost-backend is introduced.

We define two vhost backend types: vhost-kernel and vhost-user. The
former is the interface to the current kernel module implementation. Its
control plane is ioctl based. The data plane is the kernel directly
accessing the QEMU-allocated guest memory.

In the new vhost-user backend, the control plane is based on
communication between QEMU and another userspace process using a unix
domain socket. This makes it possible to implement a virtio backend for
a guest running in QEMU inside the other userspace process.

We change -mem-path to QemuOpts and add prealloc, share and unlink as
properties to it. The HugeTLBFS requirements of -mem-path are relaxed,
so any valid path can now be used. The new properties allow more
fine-grained control over the guest RAM backing store.

The data path is realized by directly accessing the vrings and the
buffer data in the guest's memory.

Currently the only user of vhost-user is vhost-net. We add a new netdev
backend that is intended to initialize vhost-net with the vhost-user
backend.

Example usage:

    qemu -m 1024 -mem-path /hugetlbfs,prealloc=on,share=on \
         -netdev type=vhost-user,id=net0,path=/path/to/sock,poll_time=2500 \
         -device virtio-net-pci,netdev=net0

Changes from v5:
 - Split -mem-path unlink option to a separate patch
 - Fds are passed only in the ancillary data
 - Stricter message size checks on receive/send
 - Netdev vhost-user now includes path and poll_time options
 - The connection probing interval is configurable

Changes from v4:
 - Use error_report for errors
 - VhostUserMsg has new field `size` indicating the following payload
   length. Field `flags` now has version and reply bits. The structure
   is packed.
 - Send data is of variable length (`size` field in message)
 - Receive in 2 steps, header and payload
 - Add new message type VHOST_USER_ECHO, to check connection status

Changes from v3:
 - Convert -mem-path to QemuOpts with prealloc, share and unlink
   properties
 - Set 1 sec timeout when read/write to the unix domain socket
 - Fix file descriptor leak

Changes from v2:
 - Reconnect when the backend disappears

Changes from v1:
 - Implementation of vhost-user netdev backend
 - Code improvements

Antonios Motakis (8):
  Convert -mem-path to QemuOpts and add prealloc and share properties
  New -mem-path option - unlink.
  Decouple vhost from kernel interface
  Add vhost-user skeleton
  Add domain socket communication for vhost-user backend
  Add vhost-user calls implementation
  Add new vhost-user netdev backend
  Add vhost-user reconnection

 exec.c                            |  57 +++-
 hmp-commands.hx                   |   4 +-
 hw/net/vhost_net.c                | 144 +++++++---
 hw/net/virtio-net.c               |  42 ++-
 hw/scsi/vhost-scsi.c              |  13 +-
 hw/virtio/Makefile.objs           |   2 +-
 hw/virtio/vhost-backend.c         | 556 ++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  46 ++--
 include/exec/cpu-all.h            |   3 -
 include/hw/virtio/vhost-backend.h |  40 +++
 include/hw/virtio/vhost.h         |   4 +-
 include/net/vhost-user.h          |  17 ++
 include/net/vhost_net.h           |  15 +-
 net/Makefile.objs                 |   2 +-
 net/clients.h                     |   3 +
 net/hub.c                         |   1 +
 net/net.c                         |   2 +
 net/tap.c                         |  16 +-
 net/vhost-user.c                  | 177 ++++++++++++
 qapi-schema.json                  |  21 +-
 qemu-options.hx                   |  24 +-
 vl.c                              |  41 ++-
 22 files changed, 1106 insertions(+), 124 deletions(-)
 create mode 100644 hw/virtio/vhost-backend.c
 create mode 100644 include/hw/virtio/vhost-backend.h
 create mode 100644 include/net/vhost-user.h
 create mode 100644 net/vhost-user.c

--
1.8.3.2