> -Original Message-
> From: Jianfeng Tan [mailto:jianfeng.tan at intel.com]
> Sent: Friday, November 06, 2015 2:31 AM
> To: dev at dpdk.org
> Cc: mst at redhat.com; mukawa at igel.co.jp; nakajima.yoshihiro at
> lab.ntt.co.jp;
> michael.qiu at intel.com; Guohongzhen; Zhoujingbin; Zhuangyanying; Zhangbo
> (Oscar); gaoxiaoqiu; Zhbzg; huawei.xie at intel.com; Jianfeng Tan
> Subject: [RFC 0/5] virtio support for container
>
> This patchset is only a PoC, sent to request comments from the community.
>
> It provides a high-performance networking interface (virtio) for
> container-based DPDK applications. How DPDK applications running in
> containers obtain exclusive ownership of NIC devices is beyond the scope
> of this patchset. The basic idea is to present a new virtual device
> (named eth_cvio) that can be discovered and initialized during
> rte_eal_init() of a container-based DPDK application. To minimize the
> change, we reuse the existing virtio frontend driver code
> (drivers/net/virtio/).
>
> Compared with the QEMU/VM case, the virtio device framework (which
> translates I/O port r/w operations into the unix socket/cuse protocol,
> and is originally provided by QEMU) is integrated into the virtio
> frontend driver. In other words, this new converged driver plays both
> the role of the original frontend driver and the role of QEMU's device
> framework.
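> As a rough illustration of that idea (not the patchset's actual API;
> cvio_ioport_write, struct cvio_hw and notify_backend are hypothetical
> names), an I/O port write that QEMU would normally trap is handled
> in-process and, where the backend needs to know about it, forwarded over
> the unix socket:
>
>     #include <stdint.h>
>     #include <stdio.h>
>
>     /* Legacy virtio ioport register offsets (subset). */
>     #define VIRTIO_PCI_QUEUE_PFN  0x08
>     #define VIRTIO_PCI_STATUS     0x12
>
>     /* Hypothetical per-device state kept by the converged driver. */
>     struct cvio_hw {
>         int sock_fd;       /* unix socket to the vhost backend */
>         uint8_t status;
>     };
>
>     /* Placeholder: a real implementation would send a vhost-user style
>      * message on hw->sock_fd instead of printing. */
>     static void
>     notify_backend(struct cvio_hw *hw, uint64_t offset, uint32_t val)
>     {
>         printf("notify backend: reg 0x%lx = %u (fd %d)\n",
>                (unsigned long)offset, val, hw->sock_fd);
>     }
>
>     static void
>     cvio_ioport_write(struct cvio_hw *hw, uint64_t offset, uint32_t val)
>     {
>         switch (offset) {
>         case VIRTIO_PCI_QUEUE_PFN:
>             /* A write QEMU would normally intercept is handled here and
>              * forwarded to the backend over the unix socket. */
>             notify_backend(hw, offset, val);
>             break;
>         case VIRTIO_PCI_STATUS:
>             hw->status = (uint8_t)val;
>             break;
>         default:
>             break;
>         }
>     }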
>
> The biggest difference lies in how addresses are calculated for the
> backend. The principle of virtio is that, based on one or more shared
> memory segments, vhost maintains a lookup table with the base address
> and length of each segment, so that when an address arrives from the VM
> (usually a GPA, Guest Physical Address), vhost can translate it into an
> address it can dereference itself (a VVA, Vhost Virtual Address). To
> reduce the translation overhead, we should keep the number of segments
> as small as possible. In the virtual machine case, the GPA space is
> locally contiguous, so it is a good choice. In the container case, the
> CVA (Container Virtual Address) is used instead. This means that:
> a. when set_base_addr is called, the CVA is used;
> b. when preparing RX descriptors, the CVA is used;
> c. when transmitting packets, the CVA is filled into TX descriptors;
> d. in the TX and CQ headers, the CVA is used.
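> As a minimal sketch of the lookup the backend performs (the struct and
> function names here are illustrative, not the actual vhost code), the
> guest/container address is translated by finding the shared segment that
> contains it:
>
>     #include <stdint.h>
>
>     struct mem_region {
>         uint64_t guest_addr;  /* GPA in the VM case, CVA in the container case */
>         uint64_t host_addr;   /* VVA: where vhost mapped the segment */
>         uint64_t size;
>     };
>
>     /* Return the vhost virtual address for 'addr', or 0 if it does not
>      * fall inside any shared segment. */
>     static uint64_t
>     to_vhost_va(const struct mem_region *regions, unsigned int nregions,
>                 uint64_t addr)
>     {
>         for (unsigned int i = 0; i < nregions; i++) {
>             const struct mem_region *r = &regions[i];
>             if (addr >= r->guest_addr && addr - r->guest_addr < r->size)
>                 return r->host_addr + (addr - r->guest_addr);
>         }
>         return 0;
>     }
>
> The fewer segments there are, the shorter this search, which is why
> keeping the number of segments small matters.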
>
> How is memory shared? In the VM case, QEMU always shares the whole
> physical memory layout with the backend. It is not feasible, however,
> for a container, which is just a process, to share all of its virtual
> memory regions with the backend. So only specified virtual memory
> regions (of shared type) are sent to the backend. This leads to the
> limitation that only addresses within these areas can be used to
> transmit or receive packets. For now, the shared memory is created in
> /dev/shm using shm_open() during memory initialization.
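> A minimal sketch of how such a sharable segment could be created (the
> helper name and flags are illustrative; the patchset's actual memory-init
> code may differ):
>
>     #include <stddef.h>
>     #include <fcntl.h>
>     #include <sys/mman.h>
>     #include <unistd.h>
>
>     /* Create a named segment (visible under /dev/shm) and map it
>      * MAP_SHARED, so a backend that mmap()s the same file sees the
>      * same pages. Link with -lrt on older glibc. */
>     static void *
>     alloc_shared_mem(const char *name, size_t len)
>     {
>         int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
>         if (fd < 0)
>             return NULL;
>         if (ftruncate(fd, len) < 0) {
>             close(fd);
>             return NULL;
>         }
>         void *va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>         close(fd);
>         return va == MAP_FAILED ? NULL : va;
>     }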
>
> How to use?
>
> a. Apply the virtio-for-container patches. Two copies of the patched code
> are needed (referred to below as dpdk-app/ and dpdk-vhost/).
>
> b. To compile container apps:
> $: cd dpdk-app
> $: vim config/common_linuxapp (uncomment "CONFIG_RTE_VIRTIO_VDEV=y")
> $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
>
> c. To build a docker image, use the Dockerfile below:
> $: cat ./Dockerfile
> FROM ubuntu:latest
> WORKDIR /usr/src/dpdk
> COPY . /usr/src/dpdk
> CMD ["/usr/src/dpdk/examples/l2fwd/build/l2fwd", "-c", "0xc", "-n", "4",
> "--no-huge", "--no-pci",
> "--vdev=eth_cvio0,queue_num=256,rx=1,tx=1,cq=0,path=/var/run/usvhost",
> "--", "-p", "0x1"]
> $: docker build -t dpdk-app-l2fwd .
>
> d. To compile vhost:
> $: cd dpdk-vhost
> $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
>
> e. Start vhost-switch
> $: ./examples/vhost/build/vhost-switch -c 3 -n 4 --socket-mem 1024,1024 -- -p
> 0x1 --stats 1
>
> f. Start the docker container
> $: docker run -i -t -v :/var/run/usvhost
> dpdk-app-l2fwd
>
> Signed-off-by: Huawei Xie
> Signed-off-by: Jianfeng Tan
>
> Jianfeng Tan (5):
> virtio/container: add handler for ioport rd/wr
> virtio/container: add a new virtual device named eth_cvio
> virtio/container: unify desc->addr assignment
> virtio/container: adjust memory initialization process
> vhost/container: change mode of vhost listening socket
>
> config/common_linuxapp | 5 +
> drivers/net/virtio/Makefile