On 01/20/2016 06:19 PM, Jason Wang wrote:
>
>
> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>> On 01/20/2016 02:54 PM, Jason Wang wrote:
>>>
>>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>>>>> Sure.
>>>>>
>>>>> Two main comments/suggestions:
>>>>>
>>>>> - TCP analysis is missing in the current version; maybe you can point
>>>>> me to a git tree (or another version of the RFC) for a better
>>>>> understanding of the design. (Just a skeleton for TCP should be
>>>>> sufficient to discuss).
>>>>> - I prefer to make the code as reusable as possible. So it's better to
>>>>> split/decouple the reusable parts from the code. So a vague idea is:
>>>>>
>>>>> 1) Decouple the packet comparing from the netfilter. You've achieved
>>>>> this 99% since the work has been done in a thread. Just let the thread
>>>>> poll sockets directly, then the comparing can be reused by other
>>>>> kinds of dataplane.
>>>>> 2) Implement traffic mirror/redirector as a filter.
>>>>> 3) Implement TCP seq rewriting as a filter.
>>>>>
>>>>> Then, in the primary node, you need just a traffic mirror, which does:
>>>>> - mirror ingress traffic to the secondary node
>>>>> - mirror egress traffic to the packet comparing thread
>>>>>
>>>>> And in the secondary node, you need two filters:
>>>>> - A TCP seq rewriter which adjusts the TCP sequence number.
>>>>> - A traffic redirector which redirects packets from a socket as ingress
>>>>> traffic, and redirects egress traffic to the socket which could be
>>>>> polled by the remote packet comparing thread.
>>>>> Thoughts?
>>>>>
>>>>> Thanks
>>>>>
>>>>>> Thanks
>>>>>> zhangchen
>>>>
>>>> Hi, Jason.
>>>> We considered your suggestion to split/decouple
>>>> the reusable parts from the code.
>>>> Because filter plugins are traversed one by one in order,
>>>> we will split colo-proxy into three filters on each side.
>>>>
>>>> But in this plan, primary and secondary both have a socket
>>>> server, so startup is a problem.
>>> I believe this issue could be solved by reusing socket chardev.
>>>
>>>>
>>>> [ASCII diagram garbled by line wrapping; recoverable content only:
>>>>
>>>> Primary qemu: guest on top, netfilter chain below it, in filter
>>>> execute order:
>>>>   mirror client (tx) --> redirect server (rx) --> compare (rx)
>>>>
>>>> Secondary qemu: guest on top, netfilter chain below it, in filter
>>>> execute order:
>>>>   mirror server (tx) --> TCP rewriter: adjust ack / adjust seq (all)
>>>>   --> redirect client (rx)
>>>>
>>>> Below the primary box: tap, with "guest receive" and "guest send"
>>>> paths.]
>>>>
>>>> NOTE: filter direction is rx/tx/all
>>>> rx: receive packets sent to the netdev
>>>> tx: receive packets sent by the netdev
>>>>
>>>>
>>> I would still like to decouple the comparer from netfilter. It has two
>>> obvious advantages:
>>>
>>> - it can be reused by other dataplanes (e.g. vhost)
>>> - the secondary redirector could redirect rx to the comparer on the
>>> primary node directly, which simplifies the design.
>>>
>>>>
>>>> guest recv packet route
>>>>
>>>> primary
>>>> tap --> mirror client filter
>>>> mirror client will send the packet to the guest and, at the
>>>> same time, copy and forward the packet to the secondary
>>>> mirror server.
>>>>
>>>> secondary
>>>> mirror server filter --> TCP rewriter
>>>> if the received packet is a TCP packet, we will adjust ack
>>>> and update the TCP checksum, then send it to the secondary
>>>> guest. Else send it directly to the guest.
>>>>
>>>>
>>>> guest send packet route
>>>>
>>>> primary
>>>> guest --> redirect server filter
>>>> redirect server filter receives the primary guest packet
>>>> but does nothing, just passes it to the next filter.
>>>>
>>>> redirect server filter --> compare filter
>>>> compare filter receives the primary guest packet, then
>>>> waits for the secondary redirect packet to compare it with.
>>>> If the packets are the same, send the primary packet and clear
>>>> the secondary packet; else send the primary packet and do a
>>>> checkpoint.
>>>>
>>>> secondary
>>>> guest --> TCP rewriter filter
>>>> if the packet is a TCP packet, we will adjust seq
>>>> and update the TCP checksum, then send it to the
>>>> redirect client filter. Else send it directly to the
>>>> redirect client filter.
>>>>
>>>> redirect client filter --> redirect server filter
>>>> forward packet to primary
>>>>
>>>>
>>>> In the failover scenario (primary is down), the TCP rewriter will keep
>>>> servicing the TCP connections which were established after the last
>>>> checkpoint.
>>>>
>>>>
>>>>
>>>> How about this plan?
>>> Sounds good.
>>>
>>> And there's indeed no need to distinguish client/server when reusing the
>>> socket chardev. E.g.:
>>>
>>> In primary node:
>>>
>>> ...
>>> -chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
>>> -chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
>>> -chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
>>> -netdev tap,id=hn0
>>> -traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
>> Why does the mirrorer have an indev?
>
>
> As I said in the previous mails, I would like to decouple packet
> comparing from netfilter. You've already done most of this since the
> comparing is done in an independent thread. So the indev here is to
> mirror the packet sent by guest to the packet comparing thread.
>
>> I think we can use traffic-redirector to do it.
>> The command line is:
>> -netdev tap,id=hn0
>> -object traffic-mirrorer,id=f0,netdev=hn0,queue=tx,outdev=mirrorer0
>> -object traffic-redirector,id=f1,netdev=hn0,queue=rx,outdev=comparer0
>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1,netdev=hn0
>> In the comparer thread, we can use qemu_net_queue_send_iov() to send
>> out the packet.
>>
>> Also, we can merge the socketdev comparer1 and mirrorer0.
>
> It depends on whether or not packet comparing was done in a net filter
> (which I prefer not).

I mean that packet comparing is done in a thread, not in a net filter.

The flow of a packet sent from the guest:
1. traffic-redirector: we redirect the packet to comparer0, so the next
   filter will never see it.
2. comparing thread: read it from the socket chardev comparer0.
3. call qemu_net_queue_send_iov() to send it back to the netdev.

Thanks
Wen Congyang

>
>>
>> Thanks
>> Wen Congyang
>>
>>> ...
>>>
>>> packet comparer compares the packets from two chardevs: comparer0 and
>>> comparer1.
>>> traffic-mirrorer mirrors tx to the secondary node through chardev
>>> mirrorer0, and mirrors rx to the packet comparer through chardev
>>> comparer0.
>>>
>>> In secondary node:
>>>
>>> ...
>>> -chardev socket,id=redirector0,host=ip_primary,port=Y
>>> -chardev socket,id=redirector1,host=ip_primary,port=Z
>>> -netdev tap,id=hn0
>>> -traffic-redirector netdev=hn0,id=r0,indev=redirector0,outdev=redirector1
>>> -colo-rewriter netdev=hn0,id=c0
>>> ...
>>>
>>> traffic-redirector redirects the rx traffic from the primary node through
>>> redirector0 and redirects the tx traffic to the primary node through
>>> redirector1.
>>> colo-rewriter rewrites the seq number as a normal netfilter.
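
For illustration only, the three-step flow for guest-sent packets described
in this reply could be sketched in Python roughly as below. Nothing here is
QEMU code. The 4-byte length-prefix framing, the Netdev class, and the
function names redirect() and compare_once() are assumptions made up for the
sketch; Netdev.send() merely stands in for qemu_net_queue_send_iov(), and
socket pairs stand in for the socket chardevs comparer0 and comparer1. The
compare rule (always release the primary packet, checkpoint on mismatch) is
taken from the quoted plan.

```python
import socket
import struct


def send_packet(sock, payload):
    # Assumed framing: 4-byte big-endian length, then the packet bytes.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    # Read exactly n bytes from a stream socket.
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-packet")
        buf.extend(chunk)
    return bytes(buf)


def recv_packet(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


class Netdev:
    """Hypothetical stand-in for the netdev's send queue."""

    def __init__(self):
        self.sent = []

    def send(self, packet):
        # Stand-in for qemu_net_queue_send_iov().
        self.sent.append(packet)


def redirect(packet, comparer0_writer):
    # Step 1: the redirector hands the guest packet to comparer0 and
    # passes nothing on, so the next filter never sees it.
    send_packet(comparer0_writer, packet)


def compare_once(comparer0, comparer1, netdev):
    # Step 2: the comparing thread reads the primary packet from
    # comparer0 and the secondary packet from comparer1.
    primary = recv_packet(comparer0)
    secondary = recv_packet(comparer1)
    # Step 3: send the primary packet back to the netdev in both cases.
    netdev.send(primary)
    # A mismatch means a checkpoint is needed.
    return primary != secondary


if __name__ == "__main__":
    filter_end, comparer0 = socket.socketpair()
    secondary_end, comparer1 = socket.socketpair()
    netdev = Netdev()
    redirect(b"guest packet", filter_end)
    send_packet(secondary_end, b"guest packet")
    print("checkpoint needed:", compare_once(comparer0, comparer1, netdev))
```

Because the comparing side only reads from two sockets and writes to a send
queue, it does not depend on the netfilter chain, which is the decoupling
being discussed in this thread.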