Hi all,

We are planning to implement colo-proxy in QEMU to cache and compare packets. This module is one of the important components of the COLO project and it is still at an early stage, so any comments and feedback are warmly welcome. Thanks in advance.

## Background
The COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service) project is a high availability solution. The Primary VM (PVM) and the Secondary VM (SVM) run in parallel. They receive the same requests from clients and generate responses in parallel too. If the response packets from the PVM and the SVM are identical, they are released immediately. Otherwise, a VM checkpoint (on demand) is conducted.

Paper: http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
COLO on Xen: http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
COLO on Qemu/KVM: http://wiki.qemu.org/Features/COLO

Because we need to capture the response packets from the PVM and the SVM and find out whether they are identical, we introduce a new module to QEMU networking called colo-proxy.

This document describes the design of the colo-proxy module.

## Glossary
  PVM - Primary VM, which provides services to clients.
  SVM - Secondary VM, a hot standby and replica of the PVM.
  PN - Primary Node, the host the PVM runs on.
  SN - Secondary Node, the host the SVM runs on.

## Workflow ##
The following diagram shows the QEMU networking packet datapath between the guest's NIC and QEMU's backend in colo-proxy.

[ASCII diagram; its layout is not recoverable from this copy. Labelled elements: on the PN, the PVM with its NIC, a chkpoint module, a proxy containing a TCP/IP stack and a compare block, and the QEMU backend (tap); on the SN, the SVM with its NIC, a chkpoint module, a proxy containing a TCP/IP stack and a seq&ack adjust block, and the QEMU backend (tap). The two chkpoint modules are connected by a socket. A "copy&forward" socket path runs from the PN proxy to the SN proxy, and a "forward" socket path runs from the SN proxy back to the PN proxy.]

## Our Idea ##

### Net filter
In current QEMU, a packet is transported directly between the networking backend (tap) and the QEMU network adapter (NIC). The backend and the adapter are linked through NetClientState->peer, as follows:

            +----------------------------------------+
            v                                        |
    +NetClientState+   +------->+NetClientState+     |
    |info->type=TAP|   |        |info->type=NIC|     |
    +--------------+   |        +--------------+     |
    |    *peer     +---+        |    *peer     +-----+
    +--------------+            +--------------+
    |name="tap0"   |            |name="e1000"  |
    +--------------+            +--------------+
    |     ...      |            |     ...      |
    +--------------+            +--------------+

In COLO QEMU, we insert a net filter named colo-proxy between the backend and the adapter, as below:

    typedef struct COLOState {
        NetClientState nc;
        NetClientState *peer;
    } COLOState;

[ASCII diagram; its layout is not recoverable from this copy. It shows two COLOState filters, "colo1" and "colo2" (each embedding a NetClientState with info->type=COLO plus its own *peer pointer), linked through the *peer pointers to the TAP NetClientState "tap0" and the NIC NetClientState "e1000".]

Once the colo-proxy filter is inserted, all packets pass through it and, more importantly, we can analyze the packets ourselves.

### QEMU space TCP/IP stack(re-use SLIRP) ###
We need a TCP/IP stack in QEMU space to help us analyze packets. After looking into QEMU, we found that SLIRP
http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
is a good choice for us. SLIRP provides a full TCP/IP stack within QEMU. It can help us handle the packets written to or read from the backend (tap) device, which look just like link layer (L2) packets.

### packet enqueue and compare ###
Together with the QEMU space TCP/IP stack, we enqueue all packets sent by the PVM and the SVM on the Primary QEMU, and then compare the packet payloads for each connection.

### Net filter Usage ###
On both the Primary and the Secondary host, invoke QEMU with the following parameters to insert a net filter (colo-proxy):

    "-netdev tap,id=hn0 -device e1000,netdev=hn0 \
    -netdev colo,id=colo,backend=hn0"
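
To make the enqueue-and-compare step described above more concrete, here is a minimal sketch in C. It is not taken from the colo-proxy code: every name in it (Packet, PacketQueue, Connection, queue_push, connection_compare, CompareResult) is hypothetical, and it assumes that each tracked connection keeps one FIFO for PVM packets and one for SVM packets and that payloads are compared packet by packet. A real implementation would also have to cope with the two VMs segmenting the same TCP stream differently, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; none of them are QEMU APIs. */
typedef struct Packet {
    struct Packet *next;
    size_t len;
    unsigned char *payload;     /* private copy of the payload bytes */
} Packet;

typedef struct PacketQueue {
    Packet *head;
    Packet *tail;
} PacketQueue;

/* One tracked connection: a FIFO of packets per VM. */
typedef struct Connection {
    PacketQueue primary;        /* packets sent by the PVM */
    PacketQueue secondary;      /* packets forwarded from the SVM */
} Connection;

typedef enum {
    CMP_WAIT,                   /* one side has nothing queued yet */
    CMP_RELEASE,                /* payloads identical: release the packet */
    CMP_CHECKPOINT              /* payloads differ: request a checkpoint */
} CompareResult;

/* Append a copy of data[0..len) to the queue; returns false on OOM. */
bool queue_push(PacketQueue *q, const void *data, size_t len)
{
    assert(q != NULL && data != NULL);
    Packet *p = malloc(sizeof(*p));
    if (!p) {
        return false;
    }
    p->payload = malloc(len ? len : 1);
    if (!p->payload) {
        free(p);
        return false;
    }
    memcpy(p->payload, data, len);
    p->len = len;
    p->next = NULL;
    if (q->tail) {
        q->tail->next = p;
    } else {
        q->head = p;
    }
    q->tail = p;
    return true;
}

/* Detach and return the oldest packet, or NULL if the queue is empty. */
Packet *queue_pop(PacketQueue *q)
{
    Packet *p = q->head;
    if (p) {
        q->head = p->next;
        if (!q->head) {
            q->tail = NULL;
        }
    }
    return p;
}

void packet_free(Packet *p)
{
    if (p) {
        free(p->payload);
        free(p);
    }
}

/* Compare the oldest queued packet from each VM for this connection. */
CompareResult connection_compare(Connection *c)
{
    if (!c->primary.head || !c->secondary.head) {
        return CMP_WAIT;
    }
    Packet *p = queue_pop(&c->primary);
    Packet *s = queue_pop(&c->secondary);
    bool same = p->len == s->len &&
                memcmp(p->payload, s->payload, p->len) == 0;
    packet_free(p);
    packet_free(s);
    return same ? CMP_RELEASE : CMP_CHECKPOINT;
}
```

In this sketch the caller would push each outgoing PVM packet onto `primary`, push each packet forwarded from the SN onto `secondary`, and call `connection_compare()` after every push: `CMP_RELEASE` means the PVM packet may go out to the client, while `CMP_CHECKPOINT` means the two VMs have diverged and a checkpoint should be requested.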