On 8 February 2016 at 22:53, Han Zhou <zhou...@gmail.com> wrote:
> On Fri, Feb 5, 2016 at 1:30 PM, Russell Bryant <russ...@ovn.org> wrote:
>>
>> On 02/05/2016 02:22 AM, Justin Pettit wrote:
>> > Joe and I spent some time today discussing our options. This is
>> > pretty tricky to get right and most of the options that come
>> > immediately to mind have subtle corner cases. We're planning to
>> > whiteboard more options tomorrow, but I wanted to get down what's my
>> > personal favorite and see what people think of its shortcoming.
>> > We're planning to document the other options that we've considered
>> > and the problems that they have, which we'll share with the group.
>> >
>> > The idea is to essentially implement a mark and sweep algorithm.
>> > Assuming that we have a lowest priority "drop" flow, we'll add an
>> > action that sets a "drop_flow" bit (e.g., 0x1) in the conntrack
>> > label. In the next table, we'll have a flow that matches on this
>> > label bit and drops traffic. Here's a pseudo set of flows to
>> > implement allowing stateful traffic to port 22 and 80:
>> >
>> > 1) table=0, ip, actions=ct(table=1)
>> > 2) table=1, priority=10, ct_state=-rpl, tcp, tp_dst=22, actions=ct(commit,table=2)
>> > 3) table=1, priority=10, ct_state=-rpl, tcp, tp_dst=80, actions=ct(commit,table=2)
>> > 4) table=1, priority=0, ct_state=-rpl, actions=ct(set_ct_label=0x1),drop
>> > 5) table=1, priority=10, ct_state=+rpl, ct_label=0x1, actions=drop
>> > 6) table=1, priority=0, ct_state=+rpl+est, actions=goto_table:2
>> > 7) table=2, priority=0, actions= /* Continue logical forwarding pipeline. */
>> >
>> > Here's an explanation of the flows:
>> >
>> > 1) Send all IP traffic to the connection tracker and then go to
>> > table 1.
>> > 2) If the destination TCP port is 22 in the request direction, commit
>> > it to the connection tracker and continue to table 2.
>> > 3) Same as flow 2, but with TCP port 80 traffic.
>> > 4) Traffic in the request direction that doesn't match flows 2 or 3
>> > gets the conntrack label set to 0x1 (the "drop_flow" bit) and the
>> > traffic gets dropped. It's important to note that there's no
>> > "commit" here, so that this will mark an existing conntrack entry
>> > with that label, but won't create a new entry for it.
>> > 5) Drop traffic in the reply direction with the "drop_flow" bit set.
>> > 6) Send any reply traffic that has an existing conntrack entry (and
>> > the "drop_flow" bit not set) to table 2.
>> > 7) Continue the logical forwarding pipeline (i.e., the ACL allowed the traffic)
>> >
>> > If traffic is initiated to port 23, it will be dropped by flow 4, but
>> > there won't be an entry in the conntrack table since no one committed
>> > it. If traffic is initiated to port 22, the connection will be
>> > allowed and committed to the conntrack table by flow 2. Similarly
>> > for traffic initiated to port 80, it will be allowed and committed by
>> > flow 3. The reply direction traffic to 22 and 80 will be allowed by
>> > flow 6.
>> >
>> > Now let's say that flow 2 is removed because we don't want to allow
>> > port 22 traffic anymore. There will still be a conntrack entry from
>> > that previous connection. Now when the initiator sends traffic to
>> > port 22, it will get dropped by flow 4, but we'll also set the
>> > existing conntrack entry's flow label to 0x1. When the reply traffic
>> > comes back, it will now match flow 5, since the ct_label value will
>> > be 0x1 and the flow will be dropped. Traffic to port 80 will be
>> > unaffected.
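
As a rough illustration of the walk-through above, here is a toy Python
model of the seven pseudo flows. It is only a sketch under stated
assumptions: the Acl class, its process() method, and the
(client, server, port) connection key are made-up names for illustration,
not OVS or OVN APIs, and conntrack is reduced to a dictionary that maps a
connection to its label.

```python
# Toy model of the seven pseudo flows above. Not OVS code: the Acl class,
# process(), and the (client, server, port) key are hypothetical names.

DROP_FLOW = 0x1  # the "drop_flow" bit in the conntrack label


class Acl:
    def __init__(self, allowed_ports):
        self.allowed_ports = set(allowed_ports)  # one flow like 2 or 3 per port
        self.conntrack = {}  # connection key -> ct_label

    def process(self, client, server, port, reply=False):
        """Return "allow" or "drop" for one packet of a client->server connection."""
        key = (client, server, port)
        if not reply:
            if port in self.allowed_ports:
                # Flows 2 and 3: commit the connection and continue to table 2.
                # Assumption: committing leaves an existing label untouched.
                self.conntrack.setdefault(key, 0)
                return "allow"
            # Flow 4: no commit, so only an already existing entry is marked.
            if key in self.conntrack:
                self.conntrack[key] |= DROP_FLOW
            return "drop"
        label = self.conntrack.get(key)
        if label is None:
            return "drop"  # no established entry, so flow 6 cannot match
        if label & DROP_FLOW:
            return "drop"  # flow 5
        return "allow"  # flow 6


acl = Acl({22, 80})
acl.process("client", "server", 22)               # committed by flow 2
acl.allowed_ports.discard(22)                     # flow 2 is removed
print(acl.process("client", "server", 22))        # request hits flow 4 and marks the entry
print(acl.process("client", "server", 22, True))  # reply now hits flow 5
print(acl.process("client", "server", 80))        # port 80 still goes through flow 3
```

The model keeps the one property the proposal depends on: flow 4 never
creates an entry, it only marks one that an earlier commit left behind.
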
>> >
>> > The nice thing about this approach is that it's not very heavy duty:
>> > it doesn't cause a lot of flow churn, it doesn't make worse
>> > megaflows, it doesn't cause race conditions between updating the OVS
>> > flow table and conntrack entries, we don't have to write (and debug)
>> > another flow classifier in ovn-controller, it's straightforward to
>> > implement, and it's instantaneous in application--mostly.
>> >
>> > That "mostly" is its drawback, though. It instantly corrects
>> > traffic in both directions once a packet is sent in the initiating
>> > direction. However, until that happens, reply traffic will continue
>> > to flow. I doubt this will be a big problem in practice, since you'd
>> > need to have traffic that is largely unidirectional without any sort
>> > of acknowledgement. ACKs would take care of this for TCP, so it
>> > wouldn't be a real problem (there could be a few packets that are let
>> > through, but policy updates aren't going to be instantaneous coming
>> > down from the CMS, anyway). There could be UDP-based protocols that
>> > don't use any sort of positive acknowledgement, but I don't know of
>> > any off the top of my head.
>> >
>> > As I mentioned, Joe and I will try to come up with a document that
>> > describes the different approaches that occur to us along with their
>> > strengths and weaknesses. I think that will be helpful to have a
>> > more fruitful discussion about alternatives.
>> >
>> > In the meantime, I'd be curious to hear what people think about the
>> > above proposal. I think this would be a reasonable approach for now,
>> > since it covers most of the use-cases nicely and it wouldn't be hard
>> > to implement.
>>
>> Thank you for the write-up! This approach sounds great to me. Some
>> small questions...
>>
>> 1) If we're only using 1 bit for now, is there any reason to use
>> ct_label over ct_mark?
>> The docs in ovs-ofctl(8) seem to suggest they're
>> identical other than being 32-bit vs 128-bit. Would using the 32-bit
>> ct_mark be beneficial in any way instead?
>>
>> 2) One thing not explicitly addressed in this write-up is traffic marked
>> as related. I think the proposal means just adding a match on
>> ct_label=0x1 where we match ct_state=+rel today and we just rely on a
>> packet in the request direction of the main connection to set ct_label.
>> That seems fine, but I wanted to clarify that point.
>>
>> I'm happy to work on the OVN implementation of this approach assuming no
>> alternative supplants it. It sounds fun. :-)
>>
>> --
>> Russell Bryant
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
>
> This looks nice! I have one more question on top of Russell's.
> In this proposal, every packet in the request direction will trigger a
> "commit" in conntrack. I just want to confirm: is there a performance impact?
> Would it be better to split the flow into two flows, e.g.:
>
>> > 2) table=1, priority=10, ct_state=-rpl, tcp, tp_dst=22, actions=ct(commit,table=2)
>
> change to:
>
> 2.1) table=1, priority=10, ct_state=+new, tcp, tp_dst=22, actions=ct(commit,table=2)
> 2.2) table=1, priority=10, ct_state=-rpl-new, tcp, tp_dst=22, actions=goto_table:2

Note that "ct()" without the commit still implies a lookup, statistics
attribution in conntrack, and possibly some state machine transitions and
timer refreshes, so it's not quite that simple: you'd need at least
ct(table=2) rather than goto_table:2. The difference between
ct(commit,table=2) and ct(table=2) is negligible, so it's not something
that controller writers should worry about.
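
To make the comparison concrete, the sketch below renders the
request-direction flow for one allowed port in both shapes discussed in
this thread: the single committing flow from the original proposal, and
the two-flow split with the non-committing half written as ct(table=2),
as suggested above, instead of goto_table:2. The helper name
request_flows() and its split parameter are invented for illustration;
the output is the same pseudo flow syntax used in this thread and has not
been checked against ovs-ofctl.

```python
# Hypothetical helper: renders the pseudo flows from this thread as strings.
# It does not talk to OVS and the syntax is the thread's pseudo syntax.

def request_flows(port, split=False):
    """Return the table=1 request-direction flow(s) for one allowed TCP port."""
    template = (
        "table=1, priority=10, ct_state={state}, tcp, "
        "tp_dst={port}, actions={actions}"
    )
    if not split:
        # Original proposal: every request-direction packet goes through commit.
        return [template.format(state="-rpl", port=port,
                                actions="ct(commit,table=2)")]
    return [
        # 2.1: only new connections are committed.
        template.format(state="+new", port=port, actions="ct(commit,table=2)"),
        # 2.2: later request packets still pass through ct(), without commit.
        template.format(state="-rpl-new", port=port, actions="ct(table=2)"),
    ]


for flow in request_flows(22) + request_flows(22, split=True):
    print(flow)
```

Since the cost difference between the two ct() forms is described as
negligible, the unsplit form needs only one flow per allowed port instead
of two.
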