On Mon, Jan 22, 2018 at 12:17:51PM +0000, Stefan Hajnoczi wrote:
> On Mon, Jan 22, 2018 at 11:33:46AM +0800, Jason Wang wrote:
> > On 2018-01-19 21:06, Stefan Hajnoczi wrote:
> > > These patches implement the virtio-vhost-user device design that I
> > > have described here:
> > > https://stefanha.github.io/virtio/vhost-user-slave.html#x1-2830007
> >
> > Thanks for the patches, this looks rather interesting and similar to
> > the split device model used by Xen.
> >
> > > The goal is to let the guest act as the vhost device backend for
> > > other guests.  This allows virtual networking and storage appliances
> > > to run inside guests.
> >
> > So the question remains: what kind of protocol do you want to run on
> > top?  If it is Ethernet-based, virtio-net works pretty well and it can
> > even do migration.
> >
> > > This device is particularly interesting for poll mode drivers where
> > > exitless VM-to-VM communication is possible, completely bypassing
> > > the hypervisor in the data path.
> >
> > It's better to clarify the reason for bypassing the hypervisor
> > (performance, security or scalability).
>
> Performance - yes, definitely.  Exitless VM-to-VM is the fastest
> possible way to communicate between VMs.  Today it can only be done
> using ivshmem.  This patch series allows virtio devices to take
> advantage of it and will encourage people to use virtio instead of
> non-standard ivshmem devices.
>
> Security - I don't think this feature is a security improvement.  It
> reduces isolation because VM1 has full shared memory access to VM2.  In
> fact, this is a reason for users to consider carefully whether they
> even want to use this feature.
True without an IOMMU, however using a vIOMMU within VM2 can protect
VM2, can't it?

> Scalability - much for the same reasons as the Performance section
> above.  Bypassing the hypervisor eliminates scalability bottlenecks
> (e.g. host network stack and bridge).
>
> > Probably not for the following cases:
> >
> > 1) kick/call
>
> I disagree here because kick/call is actually very efficient!
>
> VM1's irqfd is the ioeventfd for VM2.  When VM2 writes to the ioeventfd
> there is a single lightweight vmexit which injects an interrupt into
> VM1.  QEMU is not involved and the host kernel scheduler is not
> involved, so this is a low-latency operation.
>
> I haven't tested this yet but the ioeventfd code looks like this will
> work.
>
> > 2) device IOTLB / IOMMU transaction (or any other case where the
> > backend needs metadata from QEMU).
>
> Yes, this is the big weakness of vhost-user in general.  The IOMMU
> feature doesn't offer good isolation

I think that's an implementation issue, not a protocol issue.

> and even when it does, performance will be an issue.  If the IOMMU
> mappings are dynamic -

but they are mostly static with e.g. DPDK, right?

> > > * Implement "Additional Device Resources over PCI" for shared
> > >   memory, doorbells, and notifications instead of hardcoding a BAR
> > >   with magic offsets into virtio-vhost-user:
> > >   https://stefanha.github.io/virtio/vhost-user-slave.html#x1-2920007
> >
> > Does this mean we need to standardize the vhost-user protocol first?
>
> Currently the draft spec says:
>
>   This section relies on definitions from the Vhost-user Protocol [1].
>
>   [1] https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/vhost-user.txt;hb=HEAD
>
> Michael: Is it okay to simply include this link?

It is OK to include normative and non-normative references; they go in
the introduction and then you can refer to them anywhere in the
document.

I'm still reviewing the draft.  At some level this is a general tunnel
feature: it can tunnel any protocol.  That would be one way to isolate
it.

> > > * Implement the VRING_KICK eventfd - currently vhost-user slaves
> > >   must be poll mode drivers.
> > > * Optimize the VRING_CALL doorbell with ioeventfd to avoid a QEMU
> > >   exit.
> >
> > The performance implications need to be measured.  It looks to me
> > that both kick and call will introduce more latency from the guest's
> > point of view.
>
> I described the irqfd + ioeventfd approach above.  It should be faster
> than virtio-net + bridge today.
>
> > > * vhost-user log feature
> > > * UUID config field for stable device identification regardless of
> > >   PCI bus addresses.
> > > * vhost-user IOMMU and SLAVE_REQ_FD features
> >
> > So an assumption is that the VM implementing the vhost backend should
> > be at least as secure as a vhost-user backend process on the host.
> > Can we draw this conclusion?
>
> Yes.
>
> Sadly the vhost-user IOMMU protocol feature does not provide isolation.
> At the moment the IOMMU is basically a layer of indirection (mapping),
> but the vhost-user backend process still has full access to guest
> RAM :(.

An important feature would be to do the isolation in QEMU: trust the
QEMU running VM2 but not VM2 itself.  (See the note on the IOTLB
message layout below.)

> > Btw, it's better to have some early numbers, e.g. what testpmd
> > reports during forwarding.
>
> I need to rely on others to do this (and many other things!) because
> virtio-vhost-user isn't the focus of my work.
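Going back to the IOMMU point above, it may help to spell out what the
device IOTLB interface actually carries.  This is the message layout
from <linux/vhost.h> that VHOST_USER_IOTLB_MSG reuses; the added
comments are mine.  An UPDATE only says that an IOVA range translates
to a user-space address inside guest memory the backend has already
mapped in full, so by itself it is indirection rather than isolation:

/* struct vhost_iotlb_msg, as defined in <linux/vhost.h> and carried by
 * VHOST_USER_IOTLB_MSG.  An IOTLB_UPDATE tells the backend that the
 * IOVA range [iova, iova + size) translates to uaddr in its own
 * address space -- an address inside regions it has already mmapped,
 * which is why this is a translation layer, not access control.
 */
struct vhost_iotlb_msg {
	__u64 iova;	/* I/O virtual address used by the guest driver */
	__u64 size;	/* length of the mapping */
	__u64 uaddr;	/* backend user-space address it translates to */
#define VHOST_ACCESS_RO	0x1
#define VHOST_ACCESS_WO	0x2
#define VHOST_ACCESS_RW	0x3
	__u8 perm;
#define VHOST_IOTLB_MISS	1
#define VHOST_IOTLB_UPDATE	2
#define VHOST_IOTLB_INVALIDATE	3
#define VHOST_IOTLB_ACCESS_FAIL	4
	__u8 type;
};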
> These patches were written to demonstrate my suggestions for
> vhost-pci.  They were written at work but also on weekends, early
> mornings, and late nights to avoid delaying Wei and Zhiyong's
> vhost-pci work too much.
>
> If this approach has merit then I hope others will take over and I'll
> play a smaller role addressing some of the TODO items and cleanups.
>
> Stefan
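One more note on the kick/call path discussed earlier.  Stripped of
QEMU's event notifier wrappers, the wiring Stefan describes comes down
to roughly the following at the KVM ioctl level (untested sketch; the
VM fds, doorbell address and GSI are illustrative, and in practice the
eventfd is passed between the two QEMU processes over the vhost-user
socket rather than created in one place like this):

#include <linux/kvm.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>

/* Wire a doorbell write in VM2 directly to an interrupt in VM1.
 * vm1_fd and vm2_fd are KVM VM file descriptors, vm2_doorbell_gpa is
 * the guest-physical address of the doorbell register in VM2, and
 * vm1_gsi is the interrupt to raise in VM1 (all illustrative).
 */
static int wire_doorbell(int vm1_fd, int vm2_fd,
                         __u64 vm2_doorbell_gpa, __u32 vm1_gsi)
{
    int efd = eventfd(0, EFD_CLOEXEC);
    if (efd < 0)
        return -1;

    /* VM2 side: a write to the doorbell GPA signals efd in the kernel.
     * The guest takes one lightweight vmexit; KVM never returns to
     * QEMU.
     */
    struct kvm_ioeventfd io = {
        .addr = vm2_doorbell_gpa,
        .len  = 4,
        .fd   = efd,
        /* no datamatch flag: any value written triggers the eventfd */
    };
    if (ioctl(vm2_fd, KVM_IOEVENTFD, &io) < 0)
        return -1;

    /* VM1 side: the same eventfd acts as an irqfd, so the signal is
     * injected into VM1 as an interrupt without waking either QEMU.
     */
    struct kvm_irqfd irq = {
        .fd  = efd,
        .gsi = vm1_gsi,
    };
    return ioctl(vm1_fd, KVM_IRQFD, &irq);
}

Something along these lines is what makes the "QEMU is not involved"
claim hold for the call path; the kick path in the other direction is
symmetric.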