On 11/10/2015 04:30 PM, zhanghailiang wrote:
> On 2015/11/10 15:35, Jason Wang wrote:
>>
>>
>> On 11/10/2015 01:26 PM, Tkid wrote:
>>> Hi, all
>>>
>>> We are planning to reimplement the COLO proxy in userspace (here, in
>>> QEMU) to cache and compare network packets. This module is one of the
>>> important components of the COLO project and it is still at an early
>>> stage, so any comments and feedback are warmly welcomed. Thanks in
>>> advance.
>>>
>>> ## Background
>>> COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
>>> Service) project is a high availability solution. Both the Primary VM
>>> (PVM) and the Secondary VM (SVM) run in parallel. They receive the
>>> same requests from the client and generate responses in parallel too.
>>> If the response packets from the PVM and SVM are identical, they are
>>> released immediately. Otherwise, a VM checkpoint (on demand) is
>>> conducted.
>>> Paper:
>>> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
>>> COLO on Xen:
>>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>> COLO on Qemu/KVM:
>>> http://wiki.qemu.org/Features/COLO
>>>
>>> Because we need to capture the response packets from the PVM and SVM
>>> and find out whether they are identical, we introduce a new module to
>>> QEMU networking called colo-proxy.
>>>
>>> This document describes the design of the colo-proxy module.
>>>
>>> ## Glossary
>>> PVM - Primary VM, which provides services to clients.
>>> SVM - Secondary VM, a hot standby and replica of the PVM.
>>> PN - Primary Node, the host which the PVM runs on.
>>> SN - Secondary Node, the host which the SVM runs on.
>>>
>>> ## Our Idea ##
>>>
>>> COLO-Proxy
>>> COLO-Proxy is a part of COLO. It is based on the QEMU net filter and
>>> is a plugin for it. Its function is to keep the SVM's connections
>>> working normally with respect to the PVM and to compare the PVM's
>>> packets with the SVM's packets. If they differ, it notifies COLO to
>>> do a checkpoint.
>>>
>>> == Workflow ==
>>>
>>>
>>> +--+                                      +--+
>>> |PN|                                      |SN|
>>> +-----------------------+                 +-----------------------+
>>> | +-------------------+ |                 | +-------------------+ |
>>> | |                   | |                 | |                   | |
>>> | |        PVM        | |                 | |        SVM        | |
>>> | |                   | |                 | |                   | |
>>> | +--+-^--------------+ |                 | +-------------^----++ |
>>> |    | |                |                 |               |    |  |
>>> |    | | +------------+ |                 | +-----------+ |    |  |
>>> |    | | |    COLO    | |    (socket)     | |    COLO   | |    |  |
>>> |    | | | CheckPoint +---------------------> CheckPoint| |    |  |
>>> |    | | |            | |      (6)        | |           | |    |  |
>>> |    | | +-----^------+ |                 | +-----------+ |    |  |
>>> |    | |   (5) |        |                 |               |    |  |
>>> |    | |       |        |                 |               |    |  |
>>> | +--v-+--------------+ | Forward(socket) | +-------------+----v+ |
>>> | |COLO Proxy  |      +-------+(1)+--------->seq&ack adjust(2)| | |
>>> | |      +-----+------+ |                 | +-----------------+ | |
>>> | |      | Compare(4) <-------+(3)+---------+    COLO Proxy     | |
>>> | +-------------------+ | Forward(socket) | +-------------------+ |
>>> ++Qemu+-----------------+                 ++Qemu+-----------------+
>>>            | ^
>>>            | |
>>>            | |
>>>   +--------v-+--------+
>>>   |                   |
>>>   |      Client       |
>>>   |                   |
>>>   +-------------------+
>>>
>>>
>>> (1) When the PN receives client packets, the PN COLO-Proxy copies
>>> them and forwards the copies to the SN COLO-Proxy.
>>> (2) The SN COLO-Proxy records the initial seq of the PVM's packets,
>>> adjusts the client's ack, and sends the adjusted packets to the SVM.
>>> (3) The SN QEMU COLO-Proxy receives the SVM's packets and forwards
>>> them to the PN QEMU COLO-Proxy.
>>> (4) The PN QEMU COLO-Proxy enqueues the SVM's packets and the PVM's
>>> packets, then compares the PVM's packet data with the SVM's packet
>>> data. If the packets differ, the compare module notifies the COLO
>>> CheckPoint module to do a checkpoint, then sends the PVM's packets to
>>> the client and drops the SVM's packets. Otherwise, it just sends the
>>> PVM's packets to the client and drops the SVM's packets.
>>> (5) Notify the COLO-Checkpoint module that a checkpoint is needed.
>>> (6) Do the COLO checkpoint.
>>>
>>> ### QEMU space TCP/IP stack (Based on SLIRP) ###
>>> We need a QEMU-space TCP/IP stack to help us analyze packets.
>>> After looking into QEMU, we found that SLIRP
>>>
>>> http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
>>>
>>> is a good choice for us. SLIRP provides a full TCP/IP stack within
>>> QEMU; it can help us handle the packets written to/read from the
>>> backend (tap) device, which are just like link layer (L2) packets.
>>>
>>> ### Packet enqueue and compare ###
>>> Together with the QEMU-space TCP/IP stack, we enqueue all packets
>>> sent by the PVM and SVM on the primary QEMU, and then compare the
>>> packet payload for each connection.
>>>
>>
>> Hi:
>>
>> Just have the following questions in my mind (some have been raised in
>> the previous rounds of discussion without a conclusion):
>>
>> - What's the plan for the management layer? The setup seems
>> complicated, so we could not simply depend on the user to do each
>> step. (And for security reasons, qemu is usually run as an
>> unprivileged user.)
>
> We will do as much of the setup work automatically in qemu as we can.
> Compared with the kernel proxy scheme, it is not a big deal. :)

Yes, but what I mean is for the host, e.g. setting up the network, such
as a bridge or others.

>
>> - What's the plan for vhost? Userspace networking in qemu is rather
>> slow; most users will choose vhost.
>> - What if an application generates packets based on a hwrng device?
>> This will always produce different packets.
>
> Yes, that is really a big problem. Actually, we have discussed it many
> times, and it seems that there is no perfect way to solve it. :(
> We have a compromise approach: when we find there are too many
> consecutive checkpoint requests, we switch COLO from normal mode to
> periodic mode, in which the SVM will stop running.
> (Dave has implemented this before; it is called Hybrid mode. The
> patches are "[RFC/COLO: 0/3] Hybrid mode and parameterisation")

Aha, I see. Thanks for the pointer.

>
>> - Not sure SLIRP is a perfect match for this task. As has been raised,
>> another method is to decouple the packet comparing from qemu. In this
>> way, lots of open source userspace stacks could be used.
>
> Hmm, it seems to be a good idea. Maybe we can add a checkpoint request
> command in COLO to support more packet comparing schemes ...
>

Yes, then when to synchronize could be determined by an external
program. (Just an idea FYI.)

> Thanks,
> zhanghailiang
>
>> - Haven't read the code of packet comparing, but if it needs to keep
>> track of the state of each connection, it could easily be DoSed from
>> the guest.
>>
>> Thanks
>>
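
To make the enqueue-and-compare step (4) and the per-connection state
concern more concrete, here is a minimal sketch in Python. It is an
illustration under assumptions, not COLO or QEMU code: the class
CompareProxy, the request_checkpoint callback, the connection key and
the two limits are all hypothetical names. It queues payloads per
connection, releases the PVM packet when both sides match, requests a
checkpoint on a mismatch, and also forces a checkpoint when a queue or
the connection table is full, so that a guest cannot grow the state
without bound. In-flight packets around a checkpoint are ignored here.

```python
from collections import deque

MAX_CONNECTIONS = 1024     # hypothetical cap on tracked connections
MAX_QUEUED_PER_CONN = 64   # hypothetical cap on unmatched packets per side


class CompareProxy:
    """Toy model of the primary-side compare step (hypothetical API)."""

    def __init__(self, request_checkpoint):
        self.request_checkpoint = request_checkpoint  # called for step (5)
        self.conns = {}      # connection key -> (pvm_queue, svm_queue)
        self.released = []   # PVM payloads sent on to the client

    def _queues(self, key):
        """Return the queue pair for key, or None if the table is full."""
        if key not in self.conns:
            if len(self.conns) >= MAX_CONNECTIONS:
                return None
            self.conns[key] = (deque(), deque())
        return self.conns[key]

    def enqueue(self, side, key, payload):
        """Queue one payload from 'pvm' or 'svm', then compare."""
        idx = 0 if side == "pvm" else 1
        queues = self._queues(key)
        if queues is None or len(queues[idx]) >= MAX_QUEUED_PER_CONN:
            # Out of budget: resynchronise instead of buffering more.
            self._checkpoint()
            queues = self._queues(key)
        queues[idx].append(payload)
        self._compare(key)

    def _compare(self, key):
        pvm_q, svm_q = self.conns[key]
        while pvm_q and svm_q:
            if pvm_q[0] != svm_q[0]:
                self._checkpoint()
                return
            # Identical: release the PVM packet, drop the SVM packet.
            self.released.append(pvm_q.popleft())
            svm_q.popleft()

    def _checkpoint(self):
        """Request a checkpoint, release queued PVM packets, drop the rest."""
        self.request_checkpoint()
        for pvm_q, _svm_q in self.conns.values():
            self.released.extend(pvm_q)
        self.conns.clear()
```

With this shape, the checkpoint callback is the only coupling to COLO,
which would also fit the idea of letting an external program decide
when to synchronize.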