On 1/22/2016 6:38 PM, Tetsuya Mukawa wrote:
> On 2016/01/22 17:14, Xie, Huawei wrote:
>> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>>> virtio: Extend virtio-net PMD to support container environment
>>>
>>> The patch adds a new virtio-net PMD configuration that allows the PMD
>>> to work on the host as if the PMD were in a VM.
>>> Here is the new configuration for the virtio-net PMD.
>>>  - CONFIG_RTE_LIBRTE_VIRTIO_HOST_MODE
>>> To use this mode, EAL needs physically contiguous memory. To allocate
>>> such memory, add the "--shm" option to the application command line.
>>>
>>> To prepare a virtio-net device on the host, the user needs to invoke a
>>> QEMU process in the special qtest mode. This mode is mainly used for
>>> testing QEMU devices from an outer process. In this mode, no guest runs.
>>> Here is the QEMU command line.
>>>
>>>  $ qemu-system-x86_64 \
>>>      -machine pc-i440fx-1.4,accel=qtest \
>>>      -display none -qtest-log /dev/null \
>>>      -qtest unix:/tmp/socket,server \
>>>      -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
>>>      -device virtio-net-pci,netdev=net0,mq=on \
>>>      -chardev socket,id=chr1,path=/tmp/ivshmem,server \
>>>      -device ivshmem,size=1G,chardev=chr1,vectors=1
>>>
>>> * One QEMU process is needed per port.
>> Does qtest support hot plugging a virtio-net PCI device, so that we
>> could run one QEMU process on the host which provisions the virtio-net
>> virtual devices for the containers?
> Theoretically, we can use hot plug in some cases.
> But I guess we have 3 concerns here.
>
> 1. Security.
> If we share a QEMU process between multiple DPDK applications, this QEMU
> process will have all the fds of the applications in different containers.
> In some cases, this will be a security concern.
> So I guess we need to support the current 1:1 configuration at least.
>
> 2. Shared memory.
> Currently, QEMU and the DPDK application map the shared memory at the
> same virtual address.
> So if multiple DPDK applications connect to one QEMU process, each DPDK
> application would need a different address for its shared memory. I guess
> this will be a big limitation.
>
> 3. PCI bridge.
> So far, QEMU has one PCI bridge, so we can connect almost 10 PCI devices
> to QEMU.
> (I forget the exact number, but it's about 10, because some slots are
> reserved by QEMU.)
> A DPDK application needs both a virtio-net and an ivshmem device, so I
> guess about 5 DPDK applications can connect to one QEMU process, so far.
> Adding more PCI bridges would solve this, but we would need a lot of
> additional implementation to support cascaded PCI bridges and PCI
> devices.
> (Also, we would need to solve the 2nd concern above.)
>
> Anyway, if we use the virtio-net PMD and the vhost-user PMD, the QEMU
> process will not do anything after initialization.
> (QEMU will try to read the qtest socket, then just block, because there
> are no more messages after initialization.)
> So I guess we can ignore the overhead of these QEMU processes.
> If someone cannot ignore it, I guess this is one of the cases where it's
> nice to use your lightweight container implementation.
Thanks for the explanation. Also, in your opinion, where is the best place
to run the QEMU instance? If we run the QEMU instances on the host, we
could get rid of the root privilege issue for vhost-kernel support.

Another question: do you plan to support multiple virtio devices per
container? Currently I find the code assumes only one virtio-net device
per QEMU process, right?

Btw, I have read most of your qtest code. No obvious issues found so far,
just quite a few nits. You must have spent a lot of time on this. It is
great work!

> Thanks,
> Tetsuya
>
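
One more thing, mostly to check my own understanding of the qtest control
path. In this mode the PMD drives the emulated device purely through the
text protocol on the qtest socket, correct? I.e. a PCI config read ends up
as an exchange roughly like the one below (hand-written from my reading of
the qtest protocol, not copied from your patch; the slot number and the
exact response formatting are just examples):

    C: outl 0xcf8 0x80001800   <- select a PCI function (dev 3 here, made up)
    S: OK
    C: inl 0xcfc               <- read its vendor/device ID
    S: OK 0x10001af4           <- 0x1af4/0x1000 if this slot is legacy virtio-net

If so, every register access from the PMD costs a round trip on the unix
socket, but that should be fine since it only happens in the slow path
(device discovery and setup), not per packet.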
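
Also, on your 2nd concern, let me restate the same-virtual-address
constraint in code to make sure I read it right. Below is a minimal sketch
of what both sides effectively have to do; the shm name and the base
address are made up for illustration, this is not code from the patch:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define BASE_VADDR ((void *)0x100000000000UL) /* hypothetical agreed address */
    #define SHM_SIZE   (1UL << 30)                /* 1G, matching ivshmem size=1G */

    int main(void)
    {
        /* "/dpdk_qtest_shm" is a made-up name; in the patch the region is
         * the one handed to QEMU via the ivshmem chardev. */
        int fd = shm_open("/dpdk_qtest_shm", O_RDWR | O_CREAT, 0600);
        if (fd < 0 || ftruncate(fd, SHM_SIZE) < 0)
            return 1;

        /* MAP_FIXED pins the region at BASE_VADDR. QEMU and the DPDK
         * application must both end up with the same BASE_VADDR, so a
         * second DPDK application sharing one QEMU process would need a
         * different address -- the limitation you describe. */
        void *p = mmap(BASE_VADDR, SHM_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, fd, 0);
        if (p == MAP_FAILED)
            return 1;

        printf("shared region mapped at %p\n", p);
        return 0;
    }

If that is the right reading, I agree the 1:1 process model is the sane
default for now.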