Coming back to this after thinking more about QoS (e.g. DSCP) marking and
SFC, and about the extent to which they are related to multiple ACL stages.
There are multiple features, either already implemented or being discussed,
that lend themselves to arbitrary match criteria (e.g. inport, outport,
eth.type, eth.src, eth.dst, ip4.src, ip4.dst, ip6.src, ip6.dst, ip.dscp,
tcp.src, tcp.dst, udp.src, udp.dst) with feature-specific actions,
including:

- ACLs for security purposes, often security groups but not restricted to
  that. http://openvswitch.org/pipermail/dev/2016-July/076674.html proposed
  expanding this to two or more stages, in order to support multiple
  security features such as security groups and OpenStack FWaaS, or
  security groups and network ACLs.

- QoS marking. http://openvswitch.org/pipermail/dev/2016-August/078702.html
  proposed port-based DSCP marking. It was suggested that this could be
  expanded to arbitrary match criteria. Physical switches typically support
  DSCP (and CoS) marking with arbitrary match criteria, similar to but
  often with different syntax from ACLs.

- Service Function Chaining (SFC) insertion, where traffic is directed into
  the start of a service chain.
  http://openvswitch.org/pipermail/dev/2016-July/075035.html proposed that
  the rules for SFC insertion (called flow classification in that thread)
  be set as ACL rules.

For similar reasons to those mentioned below with regard to two different
security features, IMO QoS marking and SFC insertion should each have their
own pipeline stage. By separating each feature into its own pipeline stage,
the amount of code and complexity related to interactions between different
features is minimized. For the most part, each feature can be implemented
independently.

The big question in my mind is whether:

- All of these should be bundled under the existing ACL table, with the
  addition of a "stage" column as proposed in
  http://openvswitch.org/pipermail/dev/2016-July/076674.html and
  extensions to the "action" column,
or

- Each of these features should be represented separately in the
  Northbound DB, along the lines of:

      "Logical_Switch": {
          "columns": {
              "name": {"type": "string"},
              "ports": {"type": {"key": {"type": "uuid",
                                         "refTable": "Logical_Switch_Port",
                                         "refType": "strong"},
                                 "min": 0,
                                 "max": "unlimited"}},
              "<feature1>s": {"type": {"key": {"type": "uuid",
                                               "refTable": "<feature1>",
                                               "refType": "strong"},
                                       "min": 0,
                                       "max": "unlimited"}},
    +         "<feature2>s": {"type": {"key": {"type": "uuid",
    +                                          "refTable": "<feature2>",
    +                                          "refType": "strong"},
    +                                  "min": 0,
    +                                  "max": "unlimited"}},
              "load_balancer": {"type": {"key": {"type": "uuid",
      ...

      "<feature1>": {
          "columns": {
              "priority": {"type": {"key": {"type": "integer",
                                            "minInteger": 0,
                                            "maxInteger": 32767}}},
              "direction": {"type": {"key": {"type": "string",
                                "enum": ["set", ["from-lport",
                                                 "to-lport"]]}}},
              "match": {"type": "string"},
              "action": {"type": {"key": {"type": "string",
                                          "enum": ["set",
                                                   ["allow",
                                                    "allow-related",
                                                    "drop",
                                                    "reject"]]}}},
              "external_ids": {
                  "type": {"key": "string", "value": "string",
                           "min": 0, "max": "unlimited"}}},
          "isRoot": false},
    + "<feature2>": {
    +     "columns": {
    +         "priority": {"type": {"key": {"type": "integer",
    +                                       "minInteger": 0,
    +                                       "maxInteger": 32767}}},
    +         "direction": {"type": {"key": {"type": "string",
    +                           "enum": ["set", ["from-lport",
    +                                            "to-lport"]]}}},
    +         "match": {"type": "string"},
    +         "action": {"type": {"key": {"type": "string",
    +                                     "enum": ["set", ["dscp", "cos"]]},
    +                             "value": {"type": "integer",
    +                                       "minInteger": 0,
    +                                       "maxInteger": 63}}},
    +         "external_ids": {
    +             "type": {"key": "string", "value": "string",
    +                      "min": 0, "max": "unlimited"}}},
    +     "isRoot": false},

  where <feature1>, <feature2>, etc. would be "ACL", "QoS marking",
  "SFC insertion", etc.

The advantages of the second approach include:

- The actions can be tailored to each feature, limiting the corresponding
  stage(s) to only the actions relevant to that feature, with proper
  action syntax.

- Feature-specific processing can easily be added to ovn-northd.c.
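To make the second approach concrete, here is a small illustrative sketch
(Python, purely hypothetical) of a QoS-marking rule under the per-feature
table above, together with the constraints that the sketched <feature2>
schema would enforce. All names here are placeholders taken from the schema
sketch, not anything that exists in OVN today.

```python
# Hypothetical: constraints from the sketched "<feature2>" (QoS marking)
# table above. This is not an existing OVN schema or API.
QOS_ACTIONS = {"dscp", "cos"}   # enum from the sketched "action" key
VALUE_MIN, VALUE_MAX = 0, 63    # minInteger/maxInteger from the sketch

def validate_qos_rule(rule):
    """Return True if the rule fits the sketched <feature2> schema."""
    if not 0 <= rule["priority"] <= 32767:
        return False
    if rule["direction"] not in ("from-lport", "to-lport"):
        return False
    action, value = rule["action"]
    if action not in QOS_ACTIONS:
        return False
    return VALUE_MIN <= value <= VALUE_MAX

# Example row: mark HTTPS traffic from one subnet with DSCP 46 (EF).
rule = {"priority": 100,
        "direction": "from-lport",
        "match": "ip4.src == 10.0.0.0/24 && tcp.dst == 443",
        "action": ("dscp", 46)}
```

The point of the sketch is that a security-style action such as "allow"
simply cannot be expressed in this stage, because the per-feature schema
never offers it.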
The pros and cons of the first approach include:

- More compact, with less churn of the Northbound DB as features are added.
  However, it will not be as immediately apparent how to provision the
  different features by adding ACL rules to the appropriate ACL stage.

- The ACL "action" syntax becomes a superset of all possible actions across
  all stages. This could be abused, with users mixing actions from
  different features in the same ACL stage. While that is not the goal of
  this proposal, it is not immediately apparent what we should do in the
  presence of such misconfiguration.

- How do you add feature-specific processing to the different ACL stages?

  - For security ACLs, currently there are:
    - Priority-34000 flows to allow DHCP replies from ovn-controller.
    - If (has_stateful), some priority-65535 flows and some priority-1
      flows, regardless of the ACL rules specified by users in the
      northbound DB.
    - If (has_stateful), 3 different flows for each ACL rule, adding
      different combinations of ct.new, ct.est, ct.rpl, and ct_label[0]
      to the rule-specific match criteria.
    - Security-specific mapping from the northbound "allow",
      "allow-related", "drop", and "reject" actions to "next;" and
      "drop;" in lflows.

  - For QoS marking:
    - Initially it makes sense to just have stateless rules, mapping each
      rule to one lflow.
    - If deep packet inspection (DPI) capability is added in the future,
      looking beyond the L4 header in order to determine the application,
      there might be a reason to add some stateful capability to QoS
      marking. However, my guess is that this would be done indirectly:
      DPI would determine the application ID in earlier pipeline stages,
      and the QoS marking stage could just match on the application ID,
      without having to worry about stateful behavior directly.

  - For SFC insertion:
    - Initially it makes sense to just have stateless rules, mapping each
      rule to one lflow.
    - It would be possible to add stateful behavior to SFC insertion in
      the future.
      If symmetric service chains are defined, then based on stateful
      behavior, ct.rpl traffic could be directed into the reverse chain
      based on a ct_label value set in the initiating direction. For
      example, this could direct traffic back to a load balancer, even if
      the load balancer left the original source IP address unchanged in
      the initiating direction.

- It seems like much of the feature-specific processing could be handled
  by changing the existing "has_stateful" to "has_stateful_security", then
  adding something like "has_security" to trigger the priority-34000 flows
  to allow DHCP replies. However, I don't have any good answer at the
  moment for stateful SFC insertion. That really does seem to require
  stage-specific processing.

Does this make sense to people? Any strong preference between the first
(ACLs with multiple stages) and second (separate feature definitions in
NB DB) approaches?

Mickey

On Sun, Aug 14, 2016 at 3:21 PM, Mickey Spiegel <mickeys....@gmail.com>
wrote:

> On Sat, Aug 13, 2016 at 10:02 PM, Ben Pfaff <b...@ovn.org> wrote:
>
>> On Fri, Jul 29, 2016 at 05:28:26PM +0000, Mickey Spiegel wrote:
>> > Could you expand on why priorities in a single stage aren't enough to
>> > satisfy the use case?
>> >
>> > <Mickey>
>> > If two features are configured independently with a mix of
>> > prioritized allow and drop rules, then with a single stage, a
>> > new set of ACL rules must be produced that achieves the same
>> > behavior. This is sometimes referred to as an "ACL merge"
>> > algorithm, for example:
>> > http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a00800c9470.shtml#wp39514
>> >
>> > In the worst case, for example when the features act on different
>> > packet fields (e.g. one on IP address and another on L4 port),
>> > the number of rules required can approach
>> > (# of ACL1 rules) * (# of ACL2 rules).
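[The worst-case blow-up described in the quote can be sketched as follows.
This is a purely illustrative Python model, not OVN code or syntax, and it
assumes a simple "pass only if both ACLs allow" merge semantics:]

```python
# Sketch of the worst-case "ACL merge": two independently configured ACLs,
# one matching on IP address and one on L4 port, collapsed into a single
# stage by cross-producting their match criteria. Rule strings are
# illustrative only.
from itertools import product

acl1 = [("ip4.src == 10.0.%d.0/24" % i, "allow") for i in range(10)]
acl2 = [("tcp.dst == %d" % p, "allow") for p in range(8000, 8010)]

def merge(a, b):
    """Merge two ACLs into one stage; allow only if both ACLs allow."""
    merged = []
    for (m1, act1), (m2, act2) in product(a, b):
        act = "allow" if act1 == act2 == "allow" else "drop"
        merged.append(("%s && %s" % (m1, m2), act))
    return merged

merged = merge(acl1, acl2)
# 10 rules x 10 rules -> 100 merged rules in one stage, versus
# 10 + 10 rules spread over two independent pipeline stages.
```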
>> >
>> > While it is possible to code up such an algorithm, it adds
>> > significant complexity and complicates whichever layer
>> > implements the merge algorithm, either OVN or the CMS above.
>> >
>> > By using multiple independent pipeline stages, all of this
>> > software complexity is avoided, achieving the proper result
>> > in a simple and straightforward manner.
>> >
>> > Recent network hardware ASICs tend to have around 8 or 10 ACL
>> > stages, though they tend to evaluate these in parallel given
>> > all the emphasis on low latency these days.
>>
>> I guess that, in software, if there's a need for 2 of something, there's
>> usually a need for N of it, so I'd tend to prefer that instead of
>> hard-coding 2 stages of ACLs, we make N of them available (for perhaps N
>> == 8), especially given that you say hardware tends to work that way.
>> It's not really more expensive for OVS, and definitely not if only a few
>> of them are used. We might need to expand the number of logical tables,
>> since currently there are only 16 ingress tables and 16 egress tables,
>> but doubling them to 32 each wouldn't be a big deal.
>>
>
> I did try to code the core part of the changes so that more ACL stages
> could be easily added in the future, but the code having to do with
> definition of the pipeline stages, associated functions, and nbctl is
> only coded for 2 stages at the moment. Let me think about the best way
> to generalize this.
>
> As far as need and usage, I guess the key question is whether features
> such as service function chaining and QoS marking will use generic ACL
> stages, or pipeline stages specifically coded for those features?
> In hardware switches, those types of features use many of the multiple
> ACL stages.
>
> The way I coded the patch, the fixed rules allowing and dropping
> certain flows regardless of user-defined ACL rules are duplicated in
> each ACL stage.
> However, I am not sure if those rules are necessary or make sense if the
> actions for that pipeline stage are redirect (for SFC) or QoS marking,
> rather than allow and drop. I need to think about it.
>
> I have moved on to other things temporarily, will come back to this
> patch if/when I have time to work on ACL tests, or if someone else adds
> ACL tests.
>
> Mickey

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev