On 11/10/2015 05:41 PM, Dr. David Alan Gilbert wrote:
> * Jason Wang (jasow...@redhat.com) wrote:
>>
>> On 11/10/2015 01:26 PM, Tkid wrote:
>>> Hi all,
>>>
>>> We are planning to reimplement the COLO proxy in userspace (here, in
>>> QEMU) to cache and compare network packets. This module is one of the
>>> important components of the COLO project and it is still at an early
>>> stage, so any comments and feedback are warmly welcomed. Thanks in
>>> advance.
>>>
>>> ## Background
>>> The COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for
>>> Non-stop Service) project is a high availability solution. The
>>> Primary VM (PVM) and the Secondary VM (SVM) run in parallel. They
>>> receive the same requests from the client and generate responses in
>>> parallel too. If the response packets from the PVM and the SVM are
>>> identical, they are released immediately. Otherwise, a VM checkpoint
>>> (on demand) is conducted.
>>> Paper:
>>> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
>>> COLO on Xen:
>>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>> COLO on Qemu/KVM:
>>> http://wiki.qemu.org/Features/COLO
>>>
>>> Because we need to capture the response packets from the PVM and the
>>> SVM and find out whether they are identical, we introduce a new
>>> module to QEMU networking called colo-proxy.
>>>
>>> This document describes the design of the colo-proxy module.
>>>
>>> ## Glossary
>>> PVM - Primary VM, which provides services to clients.
>>> SVM - Secondary VM, a hot standby and replica of the PVM.
>>> PN - Primary Node, the host on which the PVM runs.
>>> SN - Secondary Node, the host on which the SVM runs.
>>>
>>> ## Our Idea ##
>>>
>>> COLO-Proxy
>>> COLO-Proxy is a part of COLO. It is based on the QEMU net filter and
>>> is implemented as a plugin for it. Its job is to keep the SVM's
>>> connections working normally with respect to the PVM and to compare
>>> the PVM's packets with the SVM's packets. If they differ, it notifies
>>> COLO to do a checkpoint.
>>>
>>> == Workflow ==
>>>
>>>
>>>            +--+                                      +--+
>>>            |PN|                                      |SN|
>>> +-----------------------+                 +-----------------------+
>>> | +-------------------+ |                 | +-------------------+ |
>>> | |                   | |                 | |                   | |
>>> | |        PVM        | |                 | |        SVM        | |
>>> | |                   | |                 | |                   | |
>>> | +--+-^--------------+ |                 | +-------------^----++ |
>>> |    | |                |                 |               |    |  |
>>> |    | | +------------+ |                 | +-----------+ |    |  |
>>> |    | | |    COLO    | |    (socket)     | |    COLO   | |    |  |
>>> |    | | | CheckPoint +---------------------> CheckPoint| |    |  |
>>> |    | | |            | |       (6)       | |           | |    |  |
>>> |    | | +-----^------+ |                 | +-----------+ |    |  |
>>> |    | |   (5) |        |                 |               |    |  |
>>> |    | |       |        |                 |               |    |  |
>>> | +--v-+--------------+ | Forward(socket) | +-------------+----v+ |
>>> | |COLO Proxy  |      +-------+(1)+--------->seq&ack adjust(2)| | |
>>> | |      +-----+------+ |                 | +-----------------+ | |
>>> | |      | Compare(4) <-------+(3)+---------+    COLO Proxy     | |
>>> | +-------------------+ | Forward(socket) | +-------------------+ |
>>> ++Qemu+-----------------+                 ++Qemu+-----------------+
>>>            | ^
>>>            | |
>>>            | |
>>>   +--------v-+--------+
>>>   |                   |
>>>   |      Client       |
>>>   |                   |
>>>   +-------------------+
>>>
>>>
>>> (1) When the PN receives client packets, the PN COLO-Proxy copies them
>>> and forwards the copies to the SN COLO-Proxy.
>>> (2) The SN COLO-Proxy records the initial seq of the PVM's packets,
>>> adjusts the client's ack, and sends the adjusted packets to the SVM.
>>> (3) The SN QEMU COLO-Proxy receives the SVM's packets and forwards
>>> them to the PN QEMU COLO-Proxy.
>>> (4) The PN QEMU COLO-Proxy enqueues the SVM's packets and the PVM's
>>> packets, then compares the PVM's packet data with the SVM's packet
>>> data. If the packets differ, the compare module notifies the COLO
>>> CheckPoint module to do a checkpoint, then sends the PVM's packets to
>>> the client and drops the SVM's packets. Otherwise, it just sends the
>>> PVM's packets to the client and drops the SVM's packets.
>>> (5) The compare module notifies the COLO-Checkpoint module that a
>>> checkpoint is needed.
>>> (6) The COLO checkpoint is performed.
>>>
>>> ### QEMU space TCP/IP stack(Based on SLIRP) ###
>>> We need a QEMU-space TCP/IP stack to help us analyze packets.
>>> After looking into QEMU, we found that SLIRP
>>>
>>> http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
>>>
>>> is a good choice for us. SLIRP provides a full TCP/IP stack within
>>> QEMU; it can help us handle the packets written to/read from the
>>> backend (tap) device, which are just like link layer (L2) packets.
>>>
>>> ### Packet enqueue and compare ###
>>> Together with the QEMU-space TCP/IP stack, we enqueue all packets
>>> sent by the PVM and the SVM on the Primary QEMU, and then compare the
>>> packet payload for each connection.
>>>
>> Hi:
>>
>> Just have the following questions in my mind (some have been raised in
>> the previous rounds of discussion without a conclusion):
>>
>> - What's the plan for the management layer? The setup seems
>> complicated, so we could not simply depend on the user to do each
>> step. (And for security reasons, qemu is usually run as an
>> unprivileged user.)
> It's certainly easier than the current COLO code that relies on a very
> complex set of bridges, extra network interfaces and kernel modules.
> UMU (cc'd) have been working on a libvirt set that starts COLO up,
> although one bit that's very messy is the current kernel based network
> comparison code.
Ok.

>> - What's the plan for vhost? Userspace networking in qemu is rather
>> slow; most users will choose vhost.
>> - What if an application generates packets based on a hwrng device?
>> This will always produce different packets.
> Yes, there are cases where this happens - COLO's worst case is similar
> to simple checkpointing (because it has a limit to the smallest
> checkpoint), but its best case is much better; on a compute heavy
> load, it ends up taking a checkpoint very rarely.
> Actually the big problem is where randomness occurs in unexpected
> places, e.g. where things like Perl's hash randomisation mean that the
> two hosts produce the same data in different orders.

Not familiar with this, but unlike the hwrng, if the random data was
computed by software, then after a synchronization it can still
produce the same result for a while.

>
>> - Not sure SLIRP is a perfect match for this task. As has been raised,
>> another method is to decouple the packet comparing from qemu. In this
>> way, lots of open source userspace stacks could be used.
>> - Haven't read the code of packet comparing, but if it needs to keep
>> track of the state of each connection, it could easily be DoSed from
>> the guest.
> The guest can only break its own networking; so shooting itself in the
> foot is no big deal.
>
> Dave

The question is about the packet comparing: if the number of connections
in the guest exceeds the maximum number of connections it can track,
what will it do?

>
>> Thanks
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>
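
To make the connection-limit question concrete, here is a minimal
Python sketch of the per-connection enqueue-and-compare step (step 4 of
the workflow). It is not the colo-proxy code: the names PacketComparer,
enqueue and max_connections are hypothetical, payloads are compared
packet by packet rather than as reassembled TCP streams, and the
behaviour when the table is full (release the PVM packet and ask for a
checkpoint) is only one assumed policy, which the proposal does not
specify.

```python
from collections import deque


class PacketComparer:
    """Hypothetical per-connection comparer; not the actual colo-proxy code."""

    def __init__(self, max_connections=1024):
        # Assumed cap on tracked connections (the proposal does not give one).
        self.max_connections = max_connections
        self.queues = {}  # connection key -> (PVM queue, SVM queue)

    def enqueue(self, conn, side, payload):
        """Queue a payload from side 'pvm' or 'svm'.

        Returns (packets to release to the client, checkpoint needed?).
        """
        if conn not in self.queues:
            if len(self.queues) >= self.max_connections:
                # Assumed policy for an untracked connection: it cannot be
                # compared, so release the PVM packet and request a
                # checkpoint; SVM packets are simply dropped.
                if side == "pvm":
                    return [payload], True
                return [], False
            self.queues[conn] = (deque(), deque())

        pvm_q, svm_q = self.queues[conn]
        (pvm_q if side == "pvm" else svm_q).append(payload)

        released, need_checkpoint = [], False
        # Compare packets pairwise while both sides have something queued.
        while pvm_q and svm_q:
            pvm_pkt, svm_pkt = pvm_q.popleft(), svm_q.popleft()
            if pvm_pkt != svm_pkt:
                need_checkpoint = True
            # Only the PVM's packet ever goes to the client; the SVM's
            # packet is dropped whether or not the two matched.
            released.append(pvm_pkt)
        return released, need_checkpoint
```

With this policy a guest that opens more connections than the table
holds only degrades itself towards plain checkpointing on the extra
connections, which matches the "can only break its own networking"
argument, but whether that is the intended behaviour is exactly what I
am asking.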