On 07/27/2015 11:54 AM, Yang Hongyang wrote:
>
>
> On 07/27/2015 11:24 AM, Jason Wang wrote:
>>
>>
>> On 07/24/2015 04:04 PM, Yang Hongyang wrote:
>>> Hi Jason,
>>>
>>> On 07/24/2015 10:12 AM, Jason Wang wrote:
>>>>
>>>>
>>>> On 07/24/2015 10:04 AM, Dong, Eddie wrote:
>>>>> Hi Stefan:
>>>>> Thanks for your comments!
>>>>>
>>>>>> On Mon, Jul 20, 2015 at 02:42:33PM +0800, Li Zhijian wrote:
>>>>>>> We are planning to implement colo-proxy in qemu to cache and
>>>>>>> compare packets.
>>>>>>
>>>>>> I thought there is a kernel module to do that?
>>>>> Yes, that is the previous solution the COLO sub-community choose
>>>>> to go, but we realized it might be not the best choices, and thus we
>>>>> want to bring discussion back here :) More comments are welcome.
>>>>>
>>>>
>>>> Hi:
>>>>
>>>> Could you pls describe more details on this decision? What's the
>>>> reason that you realize it was not the best choice?
>>>
>>> Below is my opinion:
>>>
>>> We realized that there're disadvantages do it in kernel spaces:
>>> 1. We need to recompile kernel: the colo-proxy kernel module is
>>>    implemented as a nf conntrack extension. Adding a extension need to
>>>    modify the extension struct in-kernel, so recompile kernel is
>>>    needed.
>>
>> There's no need to do all in kernel, you can use a separate process to
>> do the comparing and trigger the state sync through monitor.
>
> I don't get it, colo-proxy kernel module using a kthread do the
> comparing and trigger the state sync. We implemented it as a nf
> conntrack extension module, so we need to extend the extension struct
> in-kernel, although it just needs few lines changes to kernel, but a
> recompile of kernel is needed. Are you talking about not implement it
> as a nf conntrack extension?

Yes, I mean implement the comparing in userspace but not in qemu.

>
>>
>>> 2. We need to recompile iptables/nftables to use together with the
>>>    colo-proxy kernel module.
>>> 3. Need to configure primary host to forward input packets to
>>>    secondary as well as configure secondary to forward output packets
>>>    to primary host, the network topology and configuration is too
>>>    complex for a regular user.
>>>
>>
>> You can use current kernel primitives to mirror the traffic of both PVM
>> and SVM to another process without any modification of kernel. And qemu
>> can offload all network configuration to management in this case. And
>> what's more important, this works for vhost. Filtering in qemu won't
>> work for vhost.
>
> We are using tc to mirror/forward packets now. Implement in QEMU do
> have some limits, but there're also limits in kernel, if the packet do
> not pass the host kernel TCP/IP stack, such as vhost-user.

But the limits are much less than userspace, no? For vhost-user, maybe
we could extend the backend to mirror the traffic also.

>
>>
>>
>>> You can refer to http://wiki.qemu.org/Features/COLO
>>> to see the network topology and the steps to setup an env.
>>
>> The figure "COLO Framework" shows there's a proxy kernel module in
>> primary node but in secondary node this is done through a process? This
>> will complicate the environment a bit more.
>
> proxy kernel module also works for secondary node.
>
>>
>>>
>>> Setup a test env is too complex. The usability is so important to a
>>> feature like COLO which provide VM FT solution, if fewer people
>>> can/willing to setup the env, the feature is useless. So we decide to
>>> develop user space colo-proxy.
>>
>> If the setup is too complex, need to consider to simplify or reuse codes
>> and designs. Otherwise you probably introduce something new that needs
>> fault tolerance.
>>
>>>
>>> The advantage is obvious,
>>> 1. we do not need to recompile kernel.
>>> 2. No need to recompile iptables/nftables.
>>
>> As I described above, looks like there's no need to modify kernel.
>>
>>> 3. we do not need to deal with the network configuration, we just
>>>    using a socket connection between 2 QEMUs to forward packets.
>>
>> All network configurations should be offloaded to management. And you
>> still need a dedicated topology according to the wiki.
>>
>>> 4. A complete VM FT solution in one go, we have already developed the
>>>    block replication in QEMU, so with the network replication in QEMU,
>>>    all components we needed are within QEMU, this is very important,
>>>    it greatly improves the usability of COLO feature! We hope it will
>>>    gain more testers, users and developers.
>>
>> Is your block solution works for vhost?
>
> No, it can't works for vhost and dataplane, migration also won't work
> for dataplane IIRC.
>
>>
>>> 5. QEMU will gain a complete VM FT solution and the most advantage FT
>>>    solution so far!
>>>
>>> Overall, usability is the most important factor that impact our choice.
>>>
>>>
>>
>> Usability will be improved if you can use exist primitives and decouple
>> unnecessary codes from qemu.
>>
>> Thanks
>>
>>>>
>>>> Thanks
>>>> .
>>>>
>>>
>>
>>
>> .
>>
>