Hi

On Mon, Jan 16, 2017 at 12:37 PM Jan Kiszka <jan.kis...@siemens.com> wrote:
> Hi,
>
> some of you may know that we are using a shared memory device similar
> to ivshmem in the partitioning hypervisor Jailhouse [1].
>
> We started out compatible with the original ivshmem that QEMU
> implements, but we quickly deviated in some details, and in recent
> months even more. Some of the deviations are related to making the
> implementation simpler. The new ivshmem takes <500 LoC - Jailhouse is
> aiming at safety-critical systems and, therefore, a small code base.
> Other changes address deficits in the original design, like missing
> life-cycle management.
>
> Now the question is if there is interest in defining a common new
> revision of this device and maybe also of some protocols used on top,
> such as virtual network links. Ideally, this would enable us to share
> Linux drivers. We will definitely go for upstreaming at least a
> network driver such as [2], a UIO driver and maybe also a serial
> port/console.

This sounds like duplicating efforts done with virtio and vhost-pci.
Have you looked at Wei Wang's proposal?

> I've attached a first draft of the specification of our new ivshmem
> device. A working implementation can be found in the wip/ivshmem2
> branch of Jailhouse [3], the corresponding ivshmem-net driver in [4].

You don't have a QEMU branch, right?

> Deviations from the original design:
>
> - Only two peers per link

Sounds sane; that's also what vhost-pci aims for, afaik.

>   This simplifies the implementation and also the interfaces (think
>   of life-cycle management in a multi-peer environment). Moreover, we
>   do not have an urgent use case for multiple peers, thus also no
>   reference for a protocol that could be used in such setups. If
>   someone else happens to share such a protocol, it would be possible
>   to discuss potential extensions and their implications.
>
> - Side-band registers to discover and configure shared memory regions
>
>   This was one of the first changes: We removed the memory regions
>   from the PCI BARs and gave them special configuration space
>   registers. By now, these registers are embedded in a PCI
>   capability. The reasons are that Jailhouse does not allow
>   relocating the regions in guest address space (but other
>   hypervisors may, if they like) and that we now have up to three of
>   them.

Sorry, I can't comment on that.

> - Changed PCI base class code to 0xff (unspecified class)
>
>   This allows us to define our own subclasses and interfaces. That is
>   now exploited for specifying the shared memory protocol the two
>   connected peers should use. It also allows the Linux drivers to
>   match on that.

Why not, but it worries me that you are going to invent protocols
similar to virtio devices, aren't you?
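Matching on the class code from a Linux driver would then look
something like this, I guess (the subclass value for the network
protocol is made up here; I assume your draft spec defines the actual
subclass/interface assignments):

  #include <linux/module.h>
  #include <linux/pci.h>

  /* base class 0xff (unspecified), subclass = protocol; the 0x01 for
   * a network protocol is made up for illustration */
  #define IVSHMEM_CLASS_NET	((0xff << 16) | (0x01 << 8))

  static const struct pci_device_id ivshmem_net_ids[] = {
  	/* any vendor/device; match base class + subclass, any interface */
  	{ PCI_DEVICE_CLASS(IVSHMEM_CLASS_NET, 0xffff00) },
  	{ 0 }
  };
  MODULE_DEVICE_TABLE(pci, ivshmem_net_ids);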
> - INTx interrupts support is back
>
>   This is needed on target platforms without MSI controllers, i.e.
>   without the required guest support. Namely, some PCI-less ARM SoCs
>   required the reintroduction. While doing this, we also took care of
>   keeping the MMIO registers free of privileged controls so that a
>   guest OS can map them safely into a guest userspace application.

Right, it's not completely removed from ivshmem qemu upstream, although
it should probably be allowed to set up a doorbell-ivshmem with msi=off
(this may be quite trivial to add back).

> And then there are some extensions of the original ivshmem:
>
> - Multiple shared memory regions, including unidirectional ones
>
>   It is now possible to expose up to three different shared memory
>   regions: The first one is read/writable for both sides. The second
>   region is read/writable for the local peer and read-only for the
>   remote peer (useful for output queues). And the third is read-only
>   locally but read/writable remotely (i.e. for input queues).
>   Unidirectional regions prevent the receiver of some data from
>   interfering with the sender while it is still building the message,
>   a property that is not only useful for safety-critical
>   communication, we are sure.

Sounds like a good idea, and something we may want in virtio too.

> - Life-cycle management via local and remote state
>
>   Each device can now signal its own state in the form of a value to
>   the remote side, which triggers an event there. Moreover, state
>   changes done by the hypervisor to one peer are signalled to the
>   other side. And we introduced a write-to-shared-memory mechanism
>   for the respective remote state so that guests do not have to issue
>   an MMIO access in order to check the state.

There is also ongoing work to better support disconnect/reconnect in
virtio.
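Just to check that I read the write-to-shared-memory part right: the
guest can poll the peer's state with a plain memory read, roughly like
this (the state encoding and its location in shared memory are made up
here, I haven't checked them against your draft)?

  #include <linux/compiler.h>
  #include <linux/types.h>

  /* hypothetical encoding; the peer's state word is assumed to be
   * mirrored by the hypervisor at a known offset in shared memory */
  enum ivshmem_peer_state {
  	IVSHMEM_STATE_RESET = 0,
  	IVSHMEM_STATE_READY = 1,
  };

  /* a plain memory read - no MMIO access, hence no VM exit */
  static bool ivshmem_peer_ready(const u32 *state_word)
  {
  	return READ_ONCE(*state_word) == IVSHMEM_STATE_READY;
  }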
> So, this is our proposal. Would be great to hear some opinions on
> whether you see value in adding support for such an "ivshmem 2.0"
> device to QEMU as well and expanding its ecosystem towards Linux
> upstream, maybe also DPDK again. If you see problems in the new
> design w.r.t. what QEMU provides so far with its ivshmem device,
> let's discuss how to resolve them. Looking forward to any feedback!

My feeling is that ivshmem is not being actively developed in qemu;
development is rather going into virtio-based solutions (vhost-pci for
vm2vm).

> Jan
>
> [1] https://github.com/siemens/jailhouse
> [2] http://git.kiszka.org/?p=linux.git;a=blob;f=drivers/net/ivshmem-net.c;h=0e770ca293a4aca14a55ac0e66871b09c82647af;hb=refs/heads/queues/jailhouse
> [3] https://github.com/siemens/jailhouse/commits/wip/ivshmem2
> [4] http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/jailhouse-ivshmem2
>
> --
> Siemens AG, Corporate Technology, CT RDA ITP SES-DE
> Corporate Competence Center Embedded Linux

--
Marc-André Lureau