Hi all,

This RFC patch series introduces the changes needed to model the
architecture presented in a previous RFC letter:
"[Qemu-devel][RFC] Towards an Heterogeneous QEMU".

To recap the goal of that RFC:

The idea is to enhance the current architecture of QEMU to enable the
modeling of a state-of-the-art SoC with an AMP processing style, where
different processing units share the same system memory and communicate
through shared memory and inter-processor interrupts. An example is a
multi-core ARM CPU working alongside two Cortex-M microcontrollers.

From the user point of view there is usually an operating system booting
on the Master processor (e.g. Linux) at platform startup, while the
other processors are used to offload some computation from the Master or
to deal with real-time interfaces. It is the Master OS that triggers the
boot of the Slave processors, and also provides them with the binary
code to execute (e.g. RTOS, binary firmware) by placing it into a
pre-defined memory area that is accessible to the Slaves. Usually the
memory for the Slaves is carved out from the Master OS during boot. Once
a Slave is booted, the two processors can communicate through queues in
shared memory and inter-processor interrupts (IPIs). In Linux, it is the
remoteproc/rpmsg framework that enables the control (boot/shutdown) of
Slave processors and establishes a communication channel based on virtio
queues.

Currently, QEMU is not able to model such an architecture, mainly
because only a single processor can be emulated at one time, and the OS
binary image needs to be placed in memory at model startup.

This patch series adds a set of modules and introduces minimal changes
to the current QEMU code-base to implement what is described above, with
master and slave implemented as two different instances of QEMU. The aim
of this work is to enable application and runtime programmers to test
their AMP applications, or their new inter-SoC communication protocol.

The main changes are depicted in the following diagram and involve:

- A new multi-client socket implementation that allows multiple
  instances of QEMU to attach to the same socket, with only one acting
  as a master.

- A new memory backend, the shared memory backend, based on the file
  memory backend. On the master side, this backend allocates the whole
  memory as shareable (e.g. /dev/shm, or hugetlbfs). On the slave side
  it enables the startup of QEMU without any main memory allocated. The
  slave then enters a waiting state, the same one used in the case of an
  incoming migration, and a callback is registered on a multi-client
  socket shared with the master. The waiting state ends when the master
  sends the slave the file descriptor and offset to mmap and use as
  memory.

- A new inter-processor interrupt hardware distribution module (IDM),
  which is also used to trigger the boot of slave processors. This
  module uses a pair of eventfds for each master-slave couple to trigger
  interrupts between the instances. No slave-to-slave interrupts are
  envisioned by the current implementation. The multi-client socket is
  used by the master to trigger the boot of a slave, and also by each
  master-slave couple to exchange the eventfd file descriptors. The IDM
  device can be instantiated either as a PCI or sysbus device.

                        Memory (e.g. hugetlbfs)

+------------------+       +--------------+            +------------------+
|                  |       |              |            |                  |
|   QEMU MASTER    |       |   Master     |            |   QEMU SLAVE     |
|                  |       |   Memory     |            |                  |
| +------+  +------+-+     |              |          +-+------+  +------+ |
| |      |  |SHMEM   |     |              |          |SHMEM   |  |      | |
| | VCPU |  |Backend +----->              |    +----->Backend |  | VCPU | |
| |      |  |        |     |              |    | +--->        |  |      | |
| +--^---+  +------+-+     |              |    | |   +-+------+  +--^---+ |
|    |             |       |              |    | |     |            |     |
|    +--+          |       |              |    | |     |        +---+     |
|       | IRQ      |       | +----------+ |    | |     |    IRQ |         |
|       |          |       | |          | |    | |     |        |         |
|  +----+----+     |       | | Slave    <------+ |     |   +----+---+     |
+--+  IDM    +-----+       | | Memory   | |      |     +---+ IDM    +-----+
   +-^----^--+             | |          | |      |         +-^---^--+
     |    |                | +----------+ |      |           |   |
     |    |                +--------------+      |           |   |
     |    |                                      |           |   |
     |    +--------------------------------------+-----------+   |
     |   UNIX Domain Socket(send mem fd + offset, trigger boot)  |
     |                                                           |
     +-----------------------------------------------------------+
                                eventfd

The whole code can be checked out from:
https://git.virtualopensystems.com/dev/qemu-het.git
branch: qemu-het-rfc-v1

Patches apply to the current QEMU master branch

=========
Demo
=========

This patch series comes in the form of a demo to better understand how
the changes introduced can be exploited. In its current state the demo
can be executed using an ARM target for both master and slave.

The demo shows how a master QEMU instance carves out the memory for a
slave, copies a Linux kernel image and device tree blob into it, and
finally triggers the boot.

How to reproduce the demo:

In order to reproduce the demo a few extra elements need to be
downloaded and compiled.

Binary loader
Loads the slave firmware (kernel) binary into memory and triggers the boot
https://git.virtualopensystems.com/dev/qemu-het-tools.git
branch: load-bin-boot
To compile: just type "make"

Slave kernel
Compile a Linux kernel image (zImage) for the virt machine model.

IDM test kernel module
Needed to trigger the boot of a slave
https://git.virtualopensystems.com/dev/qemu-het-tools.git
branch: IDM-kernel-module
To compile: KDIR=kernel_path ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make

Slave DTB
https://git.virtualopensystems.com/dev/qemu-het-tools.git
branch: slave-dtb

Copy the binary loader, IDM kernel module, zImage and dtb into the disk
image or ramdisk of the master instance.

Run the demo:

run the master instance

./arm-softmmu/qemu-system-arm \
    -kernel zImage \
    -M virt -cpu cortex-a15 \
    -drive if=none,file=disk.img,cache=writeback,id=foo1 \
    -device virtio-blk-device,drive=foo1 \
    -object multi-socket-backend,id=foo,listen,path=ms_socket \
    -object memory-backend-shared,id=mem,size=1G,mem-path=/mnt/hugetlbfs,chardev=foo,master=on,prealloc=on \
    -device idm_ipi,master=true,memdev=mem,socket=foo \
    -numa node,memdev=mem -m 1G \
    -append "root=/dev/vda rw console=ttyAMA0 mem=512M memmap=512M\$0x60000000" \
    -nographic

run the slave instance

./arm-softmmu/qemu-system-arm \
    -M virt -cpu cortex-a15 -machine slave=on \
    -drive if=none,file=disk.img,cache=writeback,id=foo1 \
    -device virtio-blk-device,drive=foo1 \
    -object multi-socket-backend,id=foo,path=ms_socket \
    -object memory-backend-shared,id=mem,size=512M,mem-path=/mnt/hugetlbfs,chardev=foo,master=off \
    -device idm_ipi,master=false,memdev=mem,socket=foo \
    -incoming "shared:mem" -numa node,memdev=mem -m 512M \
    -nographic

For simplicity, use a disk image for the slave instead of a ramdisk.

As visible from the kernel boot arguments, the master is booted with
mem=512M, so that one half of the whole allocated memory is not used by
the master and is reserved for the slave. On the virt platform this
reserved memory starts at address 0x60000000.

Once the master is booted, the kernel image and DTB can be copied into
the memory carved out for the slave.

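The carve-out and the "fd + offset" hand-off can be sketched outside
QEMU as follows. This is a minimal Python illustration, not the code of
this series: it assumes that RAM on the virt machine begins at
0x40000000 (which would make 0x60000000 the 512M mark), uses a temporary
file in place of hugetlbfs, and invents the message layout (two 64-bit
integers) and every function name. It needs a Unix host and Python 3.9
or later.

```python
import mmap
import os
import socket
import struct
import tempfile

VIRT_RAM_BASE = 0x40000000  # assumption: guest-physical start of RAM on "virt"


def carveout_offset(guest_addr, ram_base=VIRT_RAM_BASE):
    """Offset of a guest-physical address inside the RAM backing file."""
    return guest_addr - ram_base


def master_share(sock, fd, offset, size):
    """Send the backing fd together with (offset, size) in one message."""
    socket.send_fds(sock, [struct.pack("!QQ", offset, size)], [fd])


def slave_attach(sock):
    """Receive fd, offset and size, then map only the carved-out window."""
    data, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    offset, size = struct.unpack("!QQ", data)
    window = mmap.mmap(fds[0], size, offset=offset)
    os.close(fds[0])  # the mapping holds its own reference to the file
    return window


def demo():
    # Scaled-down stand-in for "1G in total, upper half kept for the slave".
    half = mmap.ALLOCATIONGRANULARITY  # mmap offsets must be a multiple of this
    with tempfile.TemporaryFile() as backing:
        backing.truncate(2 * half)
        master_view = mmap.mmap(backing.fileno(), 2 * half)
        master_sock, slave_sock = socket.socketpair(socket.AF_UNIX)
        with master_sock, slave_sock:
            master_share(master_sock, backing.fileno(), half, half)
            slave_view = slave_attach(slave_sock)
        # The master copies a "firmware" into the carve-out ...
        master_view[half:half + 8] = b"FIRMWARE"
        # ... and the slave finds it at the start of its own memory.
        seen = bytes(slave_view[:8])
        slave_view.close()
        master_view.close()
    return seen
```

Under the stated RAM-base assumption, carveout_offset(0x60000000) is
0x20000000, i.e. the 512M boundary used in the boot arguments. demo()
returns what the "slave" mapping sees after the "master" has written
into the carve-out through its own mapping of the same file.
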
In the master console

probe the IDM kernel module:

$ insmod idm_test_mod.ko

run the application that copies the binaries into memory and triggers
the boot:

$ ./load_bin_app 1 ./zImage ./slave.dtb

On the slave console the Linux kernel boot should be visible.

The present demo is intended only as a demonstration to see the
patch set at work. In the near future, boot triggering, memory carveout
and binary copy might be implemented in a remoteproc driver coupled with
an RPMSG driver for communication between the master and slave
instances.

This work has been sponsored by Huawei Technologies Duesseldorf GmbH.

Baptiste Reynal (3):
  backend: multi-socket
  backend: shared memory backend
  migration: add shared migration type

Christian Pinto (5):
  hw/misc: IDM Device
  hw/arm: sysbus-fdt
  qemu: slave machine flag
  hw/arm: boot
  qemu: numa

 backends/Makefile.objs             |   4 +-
 backends/hostmem-shared.c          | 203 ++++++++++++++++++
 backends/multi-socket.c            | 353 +++++++++++++++++++++++++++++++
 default-configs/arm-softmmu.mak    |   1 +
 default-configs/i386-softmmu.mak   |   1 +
 default-configs/x86_64-softmmu.mak |   1 +
 hw/arm/boot.c                      |  13 ++
 hw/arm/sysbus-fdt.c                |  60 ++++++
 hw/core/machine.c                  |  27 +++
 hw/misc/Makefile.objs              |   2 +
 hw/misc/idm.c                      | 416 +++++++++++++++++++++++++++++++++++++
 include/hw/boards.h                |   2 +
 include/hw/misc/idm.h              | 119 +++++++++++
 include/migration/migration.h      |   2 +
 include/qemu/multi-socket.h        | 124 +++++++++++
 include/sysemu/hostmem-shared.h    |  61 ++++++
 migration/Makefile.objs            |   2 +-
 migration/migration.c              |   2 +
 migration/shared.c                 |  32 +++
 numa.c                             |  17 +-
 qemu-options.hx                    |   5 +-
 util/qemu-config.c                 |   5 +
 22 files changed, 1448 insertions(+), 4 deletions(-)
 create mode 100644 backends/hostmem-shared.c
 create mode 100644 backends/multi-socket.c
 create mode 100644 hw/misc/idm.c
 create mode 100644 include/hw/misc/idm.h
 create mode 100644 include/qemu/multi-socket.h
 create mode 100644 include/sysemu/hostmem-shared.h
 create mode 100644 migration/shared.c

-- 
1.9.1