On Mon, Jan 25, 2010 at 01:58:08PM -0600, Anthony Liguori wrote: > On 01/11/2010 11:23 AM, Michael S. Tsirkin wrote: >> start/stop backend on driver start/stop >> >> Signed-off-by: Michael S. Tsirkin<m...@redhat.com> >> --- >> hw/virtio-net.c | 40 ++++++++++++++++++++++++++++++++++++++++ >> 1 files changed, 40 insertions(+), 0 deletions(-) >> >> diff --git a/hw/virtio-net.c b/hw/virtio-net.c >> index c2a389f..99169e1 100644 >> --- a/hw/virtio-net.c >> +++ b/hw/virtio-net.c >> @@ -17,6 +17,7 @@ >> #include "net/tap.h" >> #include "qemu-timer.h" >> #include "virtio-net.h" >> +#include "vhost_net.h" >> >> #define VIRTIO_NET_VM_VERSION 11 >> >> @@ -47,6 +48,7 @@ typedef struct VirtIONet >> uint8_t nomulti; >> uint8_t nouni; >> uint8_t nobcast; >> + uint8_t vhost_started; >> struct { >> int in_use; >> int first_multi; >> @@ -114,6 +116,10 @@ static void virtio_net_reset(VirtIODevice *vdev) >> n->nomulti = 0; >> n->nouni = 0; >> n->nobcast = 0; >> + if (n->vhost_started) { >> + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev); >> + n->vhost_started = 0; >> + } >> >> /* Flush any MAC and VLAN filter table state */ >> n->mac_table.in_use = 0; >> @@ -820,6 +826,36 @@ static NetClientInfo net_virtio_info = { >> .link_status_changed = virtio_net_set_link_status, >> }; >> >> +static void virtio_net_set_status(struct VirtIODevice *vdev) >> +{ >> + VirtIONet *n = to_virtio_net(vdev); >> + if (!n->nic->nc.peer) { >> + return; >> + } >> + if (n->nic->nc.peer->info->type != NET_CLIENT_TYPE_TAP) { >> + return; >> + } >> + >> + if (!tap_get_vhost_net(n->nic->nc.peer)) { >> + return; >> + } >> + if (!!n->vhost_started == !!(vdev->status& VIRTIO_CONFIG_S_DRIVER_OK)) >> { >> + return; >> + } >> + if (vdev->status& VIRTIO_CONFIG_S_DRIVER_OK) { >> + int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), vdev); >> + if (r< 0) { >> + fprintf(stderr, "unable to start vhost net: " >> + "falling back on userspace virtio\n"); >> + } else { >> + n->vhost_started = 1; >> + } >> + } else { >> + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev); >> + n->vhost_started = 0; >> + } >> +} >> + >> > > This function does a number of bad things. It makes virtio-net have > specific knowledge of backends (like tap) and then has virtio-net pass > device specific state (vdev) to a network backend. > > Ultimately, the following things need to happen: > > 1) when a driver is ready to begin operating: > a) virtio-net needs to tell vhost the location of the ring in physical > memory > b) virtio-net needs to tell vhost about any notification it receives > (allowing kvm to shortcut userspace) > c) virtio-net needs to tell vhost about which irq is being used > (allowing kvm to shortcut userspace) > > What this suggests is that we need an API for the network backends to do > the following: > > 1) probe whether ring passthrough is supported > 2) set the *virtual* address of the ring elements > 3) hand the backend some sort of notification object for sending and > receiving notifications > 4) stop ring passthrough
Look at how vnet_hdr is setup: frontend probes backend, and has 'if (backend has vnet header) blabla' vhost is really very similar, so I would like to do it in the same way. Generally I do not believe in abstractions that only have one implementation behind it: you only know how to abstract interface after you have more than one implementation. So whoever writes another frontend that can use vhost will be in a better position to add infrastructure to abstract both that new thing and virtio. > vhost should not need any direct knowledge of the device. This > interface should be enough to communicating the required data. I think > the only bit that is a little fuzzy and perhaps non-obvious for the > current patches is the notification object. I would expect it look > something like: > > typedef struct IOEvent { > int type; > void (*notify)(IOEvent *); > void (*on_notify)(IOEvent *, void (*cb)(IOEvent *, void *)); > } IOEvent; > And then we would have > > typedef struct KVMIOEvent { > IOEvent event = {.type = KVM_IO_EVENT}; > int fd; > } KVMIOEvent; > > if (kvm_enabled()) in virtio-net, we would allocate a KVMIOEvent for the > PIO notification and hand that to vhost. vhost would check event.type > and if it's KVM_IOEVENT, downcast and use that to get at the file > descriptor. Since we don't have any other IOEvents, I just put the fd in the generic structure directly. If other types surface we'll see how to generalize it. > The KVMIOEvent should be created, not in the set status callback, but in > the PCI map callback. And in the PCI map callback, cpu_single_env > should be passed to a kvm specific function to create this KVMIOEvent > object that contains the created eventfd() that's handed to kvm via > ioctl. So this specific thing does not work very well because with irqchip, we want to bypass notification and send irq directly from kernel. So I created a structure but it does not have callbacks, only the fd. > It doesn't have to be exactly like this, but the point of all of this is > that these KVM specific mechanism (which are really implementation > details) should not be pervasive throughout the QEMU interfaces. > There > should also be strong separation between the vhost-net code and the > virtio-net device. > > Regards, > > Anthony Liguori > I don't see the point of this last idea. vhost is virtio accelerator, not a generic network backend. Whoever wants to use vhost for non-virtio frontends will have to add infrastructure for this separation, but I do not believe it's practical or desirable. > >> VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf) >> { >> VirtIONet *n; >> @@ -835,6 +871,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf >> *conf) >> n->vdev.set_features = virtio_net_set_features; >> n->vdev.bad_features = virtio_net_bad_features; >> n->vdev.reset = virtio_net_reset; >> + n->vdev.set_status = virtio_net_set_status; >> n->rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx); >> n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx); >> n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl); >> @@ -864,6 +901,9 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf >> *conf) >> void virtio_net_exit(VirtIODevice *vdev) >> { >> VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev); >> + if (n->vhost_started) { >> + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev); >> + } >> >> qemu_purge_queued_packets(&n->nic->nc); >> >>