On Wed, Jan 15, 2014 at 01:50:47PM +0100, Antonios Motakis wrote:
>
> On Wed, Jan 15, 2014 at 10:07 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
>
>     On Tue, Jan 14, 2014 at 07:13:43PM +0100, Antonios Motakis wrote:
>     >
>     > On Tue, Jan 14, 2014 at 12:33 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>     >
>     >     On Mon, Jan 13, 2014 at 03:25:11PM +0100, Antonios Motakis wrote:
>     >     > In this patch series we would like to introduce our approach for
>     >     > putting a virtio-net backend in an external userspace process. Our
>     >     > eventual target is to run the network backend in the Snabbswitch
>     >     > ethernet switch, while receiving traffic from a guest inside
>     >     > QEMU/KVM which runs an unmodified virtio-net implementation.
>     >     >
>     >     > For this, we are working into extending vhost to allow equivalent
>     >     > functionality for userspace. Vhost already passes control of the
>     >     > data plane of virtio-net to the host kernel; we want to realize a
>     >     > similar model, but for userspace.
>     >     >
>     >     > In this patch series the concept of a vhost-backend is introduced.
>     >     >
>     >     > We define two vhost backend types - vhost-kernel and vhost-user.
>     >     > The former is the interface to the current kernel module
>     >     > implementation. Its control plane is ioctl based. The data plane is
>     >     > the kernel directly accessing the QEMU allocated, guest memory.
>     >     >
>     >     > In the new vhost-user backend, the control plane is based on
>     >     > communication between QEMU and another userspace process using a
>     >     > unix domain socket. This allows to implement a virtio backend for a
>     >     > guest running in QEMU, inside the other userspace process.
>     >     >
>     >     > We change -mem-path to QemuOpts and add prealloc, share and unlink
>     >     > as properties to it. HugeTLBFS requirements of -mem-path are
>     >     > relaxed, so any valid path can be used now. The new properties
>     >     > allow more fine grained control over the guest RAM backing store.
>     >     >
>     >     > The data path is realized by directly accessing the vrings and the
>     >     > buffer data off the guest's memory.
>     >     >
>     >     > The current user of vhost-user is only vhost-net. We add new netdev
>     >     > backend that is intended to initialize vhost-net with vhost-user
>     >     > backend.
>     >
>     >     Some meta comments.
>     >
>     >     Something that makes this patch harder to review is how it's
>     >     split up. Generally IMHO it's not a good idea to repeatedly
>     >     edit same part of file adding stuff in patch after patch,
>     >     it's only making things harder to read if you add stubs, then
>     >     fill them up.
>     >     (we do this sometimes when we are changing existing code, but
>     >     it is generally not needed when adding new code)
>     >
>     >     Instead, split it like this:
>     >
>     >     1. general refactoring, split out linux specific and generic parts
>     >        and add the ops indirection
>     >     2. add new files for vhost-user with complete implementation.
>     >        without command line to support it, there will be no way to
>     >        use it, but should build fine.
>     >     3. tie it all up with option parsing
>     >
>     >
>     >     Generic vhost and vhost net files should be kept separate.
>     >     Don't let vhost net stuff seep back into generic files,
>     >     we have vhost-scsi too.
>     >     I would also prefer that userspace vhost has its own files.
>     >
>     >
>     > Ok, we'll keep this into account.
>     >
>     >
>     >     We need a small test server qemu can talk to, to verify things
>     >     actually work.
>     >
>     >
>     > We have implemented such a test app:
>     > https://github.com/virtualopensystems/vapp
>     >
>     > We use it for testing, and also as a reference implementation. A
>     > client is also included.
>     >
>
>     Sounds good. Can we include this in qemu and tie
>     it into the qtest framework?
>     From a brief look, it merely needs to be tweaked for portability,
>     unless
>
>     >
>     >     Already commented on: reuse the chardev syntax and preferably code.
>     >     We already support a bunch of options there for
>     >     domain sockets that will be useful here, they should
>     >     work here as well.
>     >
>     >
>     > We adapted the syntax for this to be consistent with chardev. What we
>     > didn't use, it is not obvious at all to us on how they should be used;
>     > a lot of the chardev options just don't apply to us.
>     >
>
>     Well server option should work at least.
>     nowait can work too?
>
>     Also, if reconnect is useful it should be for chardevs too, so if we don't
>     share code, need to code it in two places to stay consistent.
>
>     Overall sharing some code might be better ...
>
>
> What you have in mind is to use the functions chardev uses from qemu-sockets.c
> right? Chardev itself doesn't look to have anything else that can be shared.

Yes.

> The problem with reconnect is that it is implemented at the protocol level; we
> are not just transparently reconnecting the socket. So the same approach would
> most likely not apply for chardev.

Chardev could mostly just use transparent reconnect.
vhost-user could use that and get a callback to reconfigure
everything after reconnect.

Once you write up the protocol in some text file we can
discuss this in more detail.
For example, I wonder how feature negotiation would work
with reconnect: a new connection could be from another
application that does not support the same features, but
virtio assumes that device features never change.

>
>     >
>     >     In particular you shouldn't require filesystem access by qemu,
>     >     passing fd for domain socket should work.
>     >
>     >
>     > We can add an option to pass an fd for the domain socket if needed.
>     > However as far as we understand, chardev doesn't do that either (at
>     > least from looking at the man page). Maybe we misunderstand what you
>     > mean.
>
>     Sorry. I got confused with e.g. tap which has this. This might be
>     useful but does not have to block this patch.
>
>     >
>     >     >
>     >     > Example usage:
>     >     >
>     >     > qemu -m 1024 -mem-path /hugetlbfs,prealloc=on,share=on \
>     >     > -netdev type=vhost-user,id=net0,path=/path/to/sock,poll_time=2500 \
>     >     > -device virtio-net-pci,netdev=net0
>     >
>     >     It's not clear which parts of -mem-path are required for vhost-user.
>     >     It should be documented somewhere, made clear in -help
>     >     and should fail gracefully if misconfigured.
>     >
>     >
>     > Ok.
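
To make the feature negotiation concern above concrete, here is a
minimal sketch of one possible check on reconnect, written in Python
for brevity. Nothing here is taken from the patches: the function name
and the feature bits are hypothetical, and refusing the new connection
is only one policy that could be chosen.

```python
def can_accept_reconnect(acked_features: int, new_backend_features: int) -> bool:
    """Return True if every feature bit the guest already acked is still offered.

    Both arguments are plain bitmasks. This is an illustrative helper,
    not a function from QEMU or from the vhost-user patches.
    """
    missing = acked_features & ~new_backend_features
    return missing == 0


# Hypothetical feature bits, for illustration only.
FEATURE_A = 1 << 0
FEATURE_B = 1 << 1

acked = FEATURE_A | FEATURE_B         # negotiated before the disconnect
same_backend = FEATURE_A | FEATURE_B  # the same application reconnects
lesser_backend = FEATURE_A            # another application without FEATURE_B

print(can_accept_reconnect(acked, same_backend))    # True
print(can_accept_reconnect(acked, lesser_backend))  # False
```

A backend that offers more features than were acked would still be
accepted by this check, since the guest never sees the extra bits.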
>     >
>     >     >
>     >     > Changes from v5:
>     >     >  - Split -mem-path unlink option to a separate patch
>     >     >  - Fds are passed only in the ancillary data
>     >     >  - Stricter message size checks on receive/send
>     >     >  - Netdev vhost-user now includes path and poll_time options
>     >     >  - The connection probing interval is configurable
>     >     >
>     >     > Changes from v4:
>     >     >  - Use error_report for errors
>     >     >  - VhostUserMsg has new field `size` indicating the following
>     >     >    payload length. Field `flags` now has version and reply bits.
>     >     >    The structure is packed.
>     >     >  - Send data is of variable length (`size` field in message)
>     >     >  - Receive in 2 steps, header and payload
>     >     >  - Add new message type VHOST_USER_ECHO, to check connection status
>     >     >
>     >     > Changes from v3:
>     >     >  - Convert -mem-path to QemuOpts with prealloc, share and unlink
>     >     >    properties
>     >     >  - Set 1 sec timeout when read/write to the unix domain socket
>     >     >  - Fix file descriptor leak
>     >     >
>     >     > Changes from v2:
>     >     >  - Reconnect when the backend disappears
>     >     >
>     >     > Changes from v1:
>     >     >  - Implementation of vhost-user netdev backend
>     >     >  - Code improvements
>     >     >
>     >     > Antonios Motakis (8):
>     >     >   Convert -mem-path to QemuOpts and add prealloc and share properties
>     >     >   New -mem-path option - unlink.
>     >     >   Decouple vhost from kernel interface
>     >     >   Add vhost-user skeleton
>     >     >   Add domain socket communication for vhost-user backend
>     >     >   Add vhost-user calls implementation
>     >     >   Add new vhost-user netdev backend
>     >     >   Add vhost-user reconnection
>     >     >
>     >     >  exec.c                            |  57 +++-
>     >     >  hmp-commands.hx                   |   4 +-
>     >     >  hw/net/vhost_net.c                | 144 +++++++---
>     >     >  hw/net/virtio-net.c               |  42 ++-
>     >     >  hw/scsi/vhost-scsi.c              |  13 +-
>     >     >  hw/virtio/Makefile.objs           |   2 +-
>     >     >  hw/virtio/vhost-backend.c         | 556 ++++++++++++++++++++++++++++++++++++++
>     >     >  hw/virtio/vhost.c                 |  46 ++--
>     >     >  include/exec/cpu-all.h            |   3 -
>     >     >  include/hw/virtio/vhost-backend.h |  40 +++
>     >     >  include/hw/virtio/vhost.h         |   4 +-
>     >     >  include/net/vhost-user.h          |  17 ++
>     >     >  include/net/vhost_net.h           |  15 +-
>     >     >  net/Makefile.objs                 |   2 +-
>     >     >  net/clients.h                     |   3 +
>     >     >  net/hub.c                         |   1 +
>     >     >  net/net.c                         |   2 +
>     >     >  net/tap.c                         |  16 +-
>     >     >  net/vhost-user.c                  | 177 ++++++++++++
>     >     >  qapi-schema.json                  |  21 +-
>     >     >  qemu-options.hx                   |  24 +-
>     >     >  vl.c                              |  41 ++-
>     >     >  22 files changed, 1106 insertions(+), 124 deletions(-)
>     >     >  create mode 100644 hw/virtio/vhost-backend.c
>     >     >  create mode 100644 include/hw/virtio/vhost-backend.h
>     >     >  create mode 100644 include/net/vhost-user.h
>     >     >  create mode 100644 net/vhost-user.c
>     >     >
>     >     > --
>     >     > 1.8.3.2
>     >     >
>     >
>     >
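
For discussion while the protocol write-up is pending, the v4
changelog items "Receive in 2 steps, header and payload" and the
`size` field could be sketched roughly as below, in Python for
brevity. The header layout (three packed little-endian 32-bit fields:
request, flags, size), the MAX_PAYLOAD limit and all helper names are
assumptions made for illustration, not the layout defined by the
patches. Fd passing in ancillary data is left out.

```python
import socket
import struct

# Assumed header layout, NOT taken from the patches:
# packed little-endian u32 request, u32 flags, u32 size.
HEADER = struct.Struct("<III")
MAX_PAYLOAD = 4096  # hypothetical limit for the message size check


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def send_msg(sock, request, flags, payload=b""):
    """Send one message: fixed header followed by a variable-length payload."""
    sock.sendall(HEADER.pack(request, flags, len(payload)) + payload)


def recv_msg(sock):
    """Receive one message in two steps: header first, then payload."""
    # Step 1: the fixed-size header tells us how much payload follows.
    request, flags, size = HEADER.unpack(recv_exact(sock, HEADER.size))
    if size > MAX_PAYLOAD:
        raise ValueError(f"payload size {size} exceeds limit {MAX_PAYLOAD}")
    # Step 2: read exactly `size` bytes of payload.
    payload = recv_exact(sock, size)
    return request, flags, payload


def demo():
    """Round-trip one message over a connected socket pair."""
    a, b = socket.socketpair()
    with a, b:
        send_msg(a, 1, 0x1, b"hello")
        return recv_msg(b)
```

Checking `size` before reading the payload keeps a misbehaving peer
from making the receiver wait for, or allocate, an arbitrary amount of
data.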