On Thu, Sep 17, 2015 at 04:04:23PM -0700, Joe Stringer wrote: > This patch adds a new action and fields to OVS that allow connection > tracking to be performed. This support works in conjunction with the > Linux kernel support merged into the Linux-4.3 development cycle. > > The new 'ct' action sends this flow through the connection tracker, > which accepts the following parameters initally: > > - "commit": Connections that execute ct(commit) should be remembered, > allowing future packets in the connection to be recognized as part of > an "established" (est) connection, packets in the reply (rpl) > direction, or related to an existing connection (rel). > - "zone=[u16|NXM]": Perform connection tracking in the zone specified. > Each zone is an independent connection tracking context. If the > "commit" parameter is used, the connection will only be remembered in > the specified zone, and not in other zones. This is 0 by default. > - "table=NUMBER": This makes the conntrack action act like the "output" > action, with a copy of the packet sent to the connection tracker. > When the connection tracker has finished processing, it may return > and resume pipeline processing in the specified table, with connection > tracking fields populated (initially ct_state and ct_zone). This table > is strongly recommended to be greater than the current table, to > prevent loops.
Wow, this is big! Thanks for doing all this work! The description of "table" above confuses me a little. The important description, though, is in ovs-ofctl(8), so I'll review that description in more detail. > Below is a simple example flow table to allow outbound TCP traffic from > port 1 and drop traffic from port 2 that was not initiated by port 1: > > ovs-ofctl del-flows br0 > ovs-ofctl add-flow br0 "arp,action=normal" > ovs-ofctl add-flow br0 \ > "in_port=1,conn_state=-trk,tcp,action=ct(commit,zone=9),2" > ovs-ofctl add-flow br0 \ > "in_port=2,conn_state=-trk,tcp,action=ct(table=1,zone=9)" > ovs-ofctl add-flow br0 \ > "table=1,in_port=2,conn_state=+trk+est,tcp,action=1" > ovs-ofctl add-flow br0 \ > "table=1,in_port=2,conn_state=+trk+new,tcp,action=drop" I think that the above example needs s/conn_state/ct_state/. Here in NEWS, let's make a mental note to revise this when DPDK-based connection tracking gets in: > + - Add support for connection tracking through the new "ct" action > + and "ct_state"/"ct_zone" match fields. Only available on Linux kernels > + with the connection tracking module loaded. In openvswitch.h, are the OVS_CS_F_* flags the same as other flags already exported by Linux somewhere and, if so, would the Linux kernel people like it better if we #defined them directly to those values (or used the existing names directly)? Obviously we'd have to work around that for non-Linux machines. The design of OVS_CT_ATTR_FLAGS as having a single flag bit is one approach, but another that is sometimes taken for Netlink is to have the presence of an attribute act as a flag. For example, one could have an OVS_CT_ATTR_COMMIT attribute whose presence signals that the flow should be committed. It's worrisome that there's no way to slow-path a flow that includes a conntrack action. I guess the consequences aren't too bad though since the common case of needing userspace help just redirects the packet back to the kernel. The third statement here seems to follow from the first two. Did you mean something stronger like 'If the "Invalid" bit is set then only the "Tracked" bit is also set.'? + * - If "Tracked" is unset, no other bits may be set. + * - If "Tracked" is set, one or more other bits may be set. + * - The "Invalid" bit is only ever set with the "Tracked" bit. I don't think the checks for nonzero masks are needed in nx_put_raw(), since nxm_put_16m() will eventually check for itself. For ovs_conntrack_policy[] in odp-util.c, there's no need to give a min_len in either case since the defaults are correct. In parse_conntrack_action(), I think that the n < 0 test here is always true: + while (s != end) { + int n = -1; + + s += strspn(s, delimiters); + if (ovs_scan(s, "commit%n", &n)) { + flags |= OVS_CT_F_COMMIT; + s += n; + continue; + } + if (ovs_scan(s, "zone=%"SCNu16"%n", &zone, &n)) { + s += n; + continue; + } + + if (n < 0) { + return -EINVAL; + } + } In format_odp_key_attr() under OVS_KEY_ATTR_CT_STATE, I think that format_flags_masked() covers the whole case. (Are you sure you want ',' for the delimiter to format_flags()? That makes it look like multiple fields, probably should use | as format_flags_masked() passes to format_flags(). Also, parse_flags() looks for |.) In format_odp_key_attr() under OVS_KEY_ATTR_CT_ZONE, please output an 0x prefix for hexadecimal output, to make it obvious that it's hex. In ofp-actions.c, I don't think anything enforces prerequisites for 'zone_src'. In format_CT() in ofp-actions.c, you can simplify a little by always appending a comma and then snipping the last one, e.g.: static void format_alg(int port, struct ds *s) { if (port == IPPORT_FTP) { ds_put_format(s, "alg=ftp,"); } else if (port) { ds_put_format(s, "alg=%d,", port); } } static void format_CT(const struct ofpact_conntrack *a, struct ds *s) { ds_put_cstr(s, "ct("); if (a->flags & NX_CT_F_COMMIT) { ds_put_cstr(s, "commit,"); } if (a->recirc_table != NX_CT_RECIRC_NONE) { ds_put_format(s, "table=%"PRIu8",", a->recirc_table); } format_alg(a->alg, s); if (a->zone_src.field) { ds_put_format(s, "zone="); mf_format_subfield(&a->zone_src, s); ds_put_char(s, ','); } else if (a->zone_imm) { ds_put_format(s, "zone=%"PRIu16, a->zone_imm); ds_put_char(s, ','); } format_alg(a->alg, s); if (ofpact_ct_get_action_len(a)) { ds_put_cstr(s, "exec("); ofpacts_format(a->actions, ofpact_ct_get_action_len(a), s); ds_put_cstr(s, "),"); } ds_chomp(s, ','); ds_put_char(s, ')'); } I think that struct ofpact_conntrack doesn't need to be defined with a pad member here. I guess it's done that way because it becomes important in the next patch, but it's confusing here. In ofproto-dpif-xlate.c, why does compose_conntrack_action() throw away half of the zone (via "& 0xff")? Acked-by: Ben Pfaff <b...@nicira.com> _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev