Introduce a new logical port type called "localnet". A logical port with this type also has an option called "network_name". A "localnet" logical port represents a connection to a network that is locally accessible from each chassis running ovn-controller. ovn-controller will use the ovn-bridge-mappings configuration to figure out which patch port on br-int should be used for this port.
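For reference, ovn-bridge-mappings is a comma-separated list of
network-name:bridge pairs stored in the local Open vSwitch database.  A
minimal example of setting it (the bridge names here are only
illustrations; use whichever local bridges reach your physical networks):

$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1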
OpenStack Neutron has an API extension called "provider networks", which
allows an administrator to specify that ports should be attached directly to
some pre-existing network in their environment.  There was a previous thread
where we got into the details of this here:

    http://openvswitch.org/pipermail/dev/2015-June/056765.html

The case where this would be used is an environment that isn't actually
interested in virtual networks and just wants all of its compute resources
connected up to externally managed networks.  Even in this environment, OVN
still has a lot of value to add.  OVN implements port security and ACLs for
all ports connected to these networks.  OVN also provides the configuration
interface and control plane to manage this across many hypervisors.

As a specific example, consider an environment with two hypervisors (A and B)
with two VMs on each hypervisor (A1, A2, B1, B2).  Now imagine that the
desired setup from an OpenStack perspective is to have all of these VMs
attached to the same provider network, which is a physical network we'll
refer to as "physnet1".

The first step here is to configure each hypervisor with bridge mappings that
tell ovn-controller that a local bridge called "br-eth1" is used to reach the
network called "physnet1".  We can simulate the initial setup of this
environment in ovs-sandbox with the following commands:

# Setup the local hypervisor (A)
ovs-vsctl add-br br-eth1
ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth1

# Create a fake remote hypervisor (B)
ovn-sbctl chassis-add fakechassis geneve 127.0.0.1

To get the behavior we want, we model every Neutron port connected to a
Neutron provider network as an OVN logical switch with 2 ports.  The first
port is a normal logical port to be used by the VM.  The second logical port
is a special port with its type set to "localnet".

You could imagine an alternative configuration where many OVN logical ports
share a single OVN "localnet" logical port on the same OVN logical switch.
That setup provides something different: the logical ports would communicate
with each other in logical space via tunnels between hypervisors.  For
Neutron's use case, we want all ports communicating via an existing network
without the use of an overlay.
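For contrast, here is a rough sketch of that alternative single-switch
layout.  The switch and port names are made up, and this is not the
configuration used in the rest of this walkthrough; it is only meant to
show the difference:

ovn-nbctl lswitch-add provnet1
for n in 1 2 3 4; do
    ovn-nbctl lport-add provnet1 provnet1-port$n
    ovn-nbctl lport-set-macs provnet1-port$n 00:00:00:00:00:0$n
done
ovn-nbctl lport-add provnet1 provnet1-physnet1
ovn-nbctl lport-set-macs provnet1-physnet1 unknown
ovn-nbctl lport-set-type provnet1-physnet1 localnet
ovn-nbctl lport-set-options provnet1-physnet1 network_name=physnet1

Here, the VM ports would communicate with each other in logical space via
tunnels between hypervisors, rather than via physnet1.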
To simulate the creation of the OVN logical switches and OVN logical ports
for A1, A2, B1, and B2, you can run the following commands:

# Create 4 OVN logical switches.  Each logical switch has 2 ports,
# port1 for a VM and physnet1 for the existing network we are
# connecting to.
for n in 1 2 3 4; do
    ovn-nbctl lswitch-add provnet1-$n

    ovn-nbctl lport-add provnet1-$n provnet1-$n-port1
    ovn-nbctl lport-set-macs provnet1-$n-port1 00:00:00:00:00:0$n
    ovn-nbctl lport-set-port-security provnet1-$n-port1 00:00:00:00:00:0$n

    ovn-nbctl lport-add provnet1-$n provnet1-$n-physnet1
    ovn-nbctl lport-set-macs provnet1-$n-physnet1 unknown
    ovn-nbctl lport-set-type provnet1-$n-physnet1 localnet
    ovn-nbctl lport-set-options provnet1-$n-physnet1 network_name=physnet1
done

# Bind lport1 (A1) and lport2 (A2) to the local hypervisor.
ovs-vsctl add-port br-int lport1 -- set Interface lport1 external_ids:iface-id=provnet1-1-port1
ovs-vsctl add-port br-int lport2 -- set Interface lport2 external_ids:iface-id=provnet1-2-port1

# Bind the other 2 ports to the fake remote hypervisor.
ovn-sbctl lport-bind provnet1-3-port1 fakechassis
ovn-sbctl lport-bind provnet1-4-port1 fakechassis

After running these commands, we have the following logical configuration:

$ ovn-nbctl show
    lswitch 035645fc-b2ff-4e26-b953-69addba80a9a (provnet1-4)
        lport provnet1-4-physnet1
            macs: unknown
        lport provnet1-4-port1
            macs: 00:00:00:00:00:04
    lswitch 66212a85-b3b6-4688-bcf6-8062941a2d96 (provnet1-2)
        lport provnet1-2-physnet1
            macs: unknown
        lport provnet1-2-port1
            macs: 00:00:00:00:00:02
    lswitch fc5b1141-0216-4fa7-86f3-461811c1fc9b (provnet1-3)
        lport provnet1-3-physnet1
            macs: unknown
        lport provnet1-3-port1
            macs: 00:00:00:00:00:03
    lswitch 9b1d2636-e654-4d43-84e8-a921af611b33 (provnet1-1)
        lport provnet1-1-physnet1
            macs: unknown
        lport provnet1-1-port1
            macs: 00:00:00:00:00:01

We can also look at OVN_Southbound to see that 2 logical ports are bound to
each hypervisor:

$ ovn-sbctl show
Chassis "56b18105-5706-46ef-80c4-ff20979ab068"
    Encap geneve
        ip: "127.0.0.1"
    Port_Binding "provnet1-1-port1"
    Port_Binding "provnet1-2-port1"
Chassis fakechassis
    Encap geneve
        ip: "127.0.0.1"
    Port_Binding "provnet1-3-port1"
    Port_Binding "provnet1-4-port1"

Now we can generate several packets to test how a packet would be processed
on hypervisor A.  The OpenFlow port numbers in this demo are:

    1 - patch port to br-eth1 (physnet1)
    2 - tunnel to fakechassis
    3 - lport1 (A1)
    4 - lport2 (A2)

Packet test #1: A1 to A2 - This will be output to ofport 1.  Despite both VMs
being local to this hypervisor, all packets between the VMs go through
physnet1.  In practice, this will get optimized at br-eth1.

ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #2: physnet1 to A2 - Consider this a continuation of test #1: the
packet now arrives back from physnet1, and only the logical switch that the
destination (A2) is attached to will be considered.  The end result should be
that the only output is to ofport 4 (A2).

ovs-appctl ofproto/trace br-int \
    in_port=1,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02 -generate

Packet test #3: A1 to B1 - This will be output to ofport 1, as physnet1 is to
be used to reach any other port.  When it arrives at hypervisor B, processing
would look just like test #2.

ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:03 -generate

Packet test #4: A1 broadcast - Again, the packet will only be sent to
physnet1.

ovs-appctl ofproto/trace br-int \
    in_port=3,dl_src=00:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff -generate

Packet test #5: B1 broadcast arriving at hypervisor A - This is somewhat of a
continuation of test #4.  When a broadcast packet arrives from physnet1 on
hypervisor A, we should see it output to both A1 and A2 (ofports 3 and 4).

ovs-appctl ofproto/trace br-int \
    in_port=1,dl_src=00:00:00:00:00:03,dl_dst=ff:ff:ff:ff:ff:ff -generate

Signed-off-by: Russell Bryant <rbry...@redhat.com>
---
 ovn/controller/lflow.h     |  10 +++-
 ovn/controller/physical.c  | 124 +++++++++++++++++++++++++++++++++++++++------
 ovn/ovn-architecture.7.xml |  27 ++++++++++
 ovn/ovn-nb.xml             |  16 ++++--
 ovn/ovn-sb.xml             |  31 ++++++++++--
 5 files changed, 184 insertions(+), 24 deletions(-)

diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
index 5cac76c..02e36ff 100644
--- a/ovn/controller/lflow.h
+++ b/ovn/controller/lflow.h
@@ -58,6 +58,7 @@ struct uuid;
  * These values are documented in ovn-architecture(7), please update the
  * documentation if you change any of them. */
 #define MFF_LOG_DATAPATH MFF_METADATA /* Logical datapath (64 bits). */
+#define MFF_OVN_FLAGS    MFF_REG5    /* Bit flags used internally by OVN */
 #define MFF_LOG_INPORT   MFF_REG6    /* Logical input port (32 bits). */
 #define MFF_LOG_OUTPORT  MFF_REG7    /* Logical output port (32 bits). */
 
@@ -69,8 +70,13 @@ struct uuid;
     MFF_LOG_REG(MFF_REG1) \
     MFF_LOG_REG(MFF_REG2) \
     MFF_LOG_REG(MFF_REG3) \
-    MFF_LOG_REG(MFF_REG4) \
-    MFF_LOG_REG(MFF_REG5)
+    MFF_LOG_REG(MFF_REG4)
+
+/* Bits used in MFF_OVN_FLAGS. */
+enum {
+    /* Indicates that the packet came in on a localnet port */
+    OVN_FLAG_LOCALNET = (1 << 0),
+};
 
 void lflow_init(void);
 void lflow_run(struct controller_ctx *, struct hmap *flow_table);
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 2ec0ba9..e43b989 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -23,7 +23,9 @@
 #include "ovn-controller.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "openvswitch/vlog.h"
+#include "shash.h"
 #include "simap.h"
+#include "smap.h"
 #include "sset.h"
 #include "vswitch-idl.h"
 
@@ -138,6 +140,8 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 {
     struct simap lport_to_ofport = SIMAP_INITIALIZER(&lport_to_ofport);
     struct hmap tunnels = HMAP_INITIALIZER(&tunnels);
+    struct simap localnet_to_ofport = SIMAP_INITIALIZER(&localnet_to_ofport);
+
     for (int i = 0; i < br_int->n_ports; i++) {
         const struct ovsrec_port *port_rec = br_int->ports[i];
         if (!strcmp(port_rec->name, br_int->name)) {
@@ -150,6 +154,9 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             continue;
         }
 
+        const char *localnet = smap_get(&port_rec->external_ids,
+                                        "ovn-patch-port");
+
         for (int j = 0; j < port_rec->n_interfaces; j++) {
             const struct ovsrec_interface *iface_rec = port_rec->interfaces[j];
 
@@ -162,8 +169,11 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 continue;
             }
 
-            /* Record as chassis or local logical port. */
-            if (chassis_id) {
+            /* Record as patch to local net, chassis, or local logical port. */
+            if (!strcmp(iface_rec->type, "patch") && localnet) {
+                simap_put(&localnet_to_ofport, localnet, ofport);
+                break;
+            } else if (chassis_id) {
                 enum chassis_tunnel_type tunnel_type;
                 if (!strcmp(iface_rec->type, "geneve")) {
                     tunnel_type = GENEVE;
@@ -196,6 +206,13 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     struct ofpbuf ofpacts;
     ofpbuf_init(&ofpacts, 0);
 
+    struct localnet_flow {
+        struct shash_node node;
+        struct match match;
+        struct ofpbuf ofpacts;
+    };
+    struct shash localnet_inputs = SHASH_INITIALIZER(&localnet_inputs);
+
     /* Set up flows in table 0 for physical-to-logical translation and in table
      * 64 for logical-to-physical translation. */
     const struct sbrec_port_binding *binding;
@@ -210,7 +227,13 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         int tag = 0;
         ofp_port_t ofport;
 
-        if (binding->parent_port) {
+        if (!strcmp(binding->type, "localnet")) {
+            const char *network = smap_get(&binding->options, "network_name");
+            if (!network) {
+                continue;
+            }
+            ofport = u16_to_ofp(simap_get(&localnet_to_ofport, network));
+        } else if (binding->parent_port) {
             ofport = u16_to_ofp(simap_get(&lport_to_ofport,
                                           binding->parent_port));
             if (ofport && binding->tag) {
@@ -235,6 +258,9 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         struct match match;
 
         if (!tun) {
+            struct ofpbuf *local_ofpacts = &ofpacts;
+            bool add_input_flow = true;
+
             /* Packets that arrive from a vif can belong to a VM or
              * to a container located inside that VM. Packets that
              * arrive from containers have a tag (vlan) associated with them.
@@ -245,33 +271,65 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
              *
             * Priority 150 is for traffic belonging to containers. For such
             * traffic, match on the tags and then strip the tag.
-            * Priority 100 is for traffic belonging to VMs.
+            * Priority 100 is for traffic belonging to VMs or locally connected
+            * networks.
             *
             * For both types of traffic: set MFF_LOG_INPORT to the logical
             * input port, MFF_LOG_DATAPATH to the logical datapath, and
             * resubmit into the logical ingress pipeline starting at table
             * 16. */
-            match_init_catchall(&match);
-            ofpbuf_clear(&ofpacts);
-            match_set_in_port(&match, ofport);
-            if (tag) {
-                match_set_dl_vlan(&match, htons(tag));
+            if (!strcmp(binding->type, "localnet")) {
+                /* The same OpenFlow port may correspond to localnet ports
+                 * attached to more than one logical datapath, so keep track of
+                 * all actions to be taken and add it as a single flow at the
+                 * end. */
+
+                const char *network = smap_get(&binding->options, "network_name");
+                struct shash_node *node;
+                struct localnet_flow *ln_flow;
+
+                node = shash_find(&localnet_inputs, network);
+                if (!node) {
+                    ln_flow = xmalloc(sizeof *ln_flow);
+                    match_init_catchall(&ln_flow->match);
+                    match_set_in_port(&ln_flow->match, ofport);
+                    ofpbuf_init(&ln_flow->ofpacts, 0);
+                    /* Set OVN_FLAG_LOCALNET to indicate that the packet came in from a
+                     * localnet port. */
+                    put_load(OVN_FLAG_LOCALNET, MFF_OVN_FLAGS, 0, 32,
+                             &ln_flow->ofpacts);
+
+                    node = shash_add(&localnet_inputs, network, ln_flow);
+                }
+                ln_flow = node->data;
+                local_ofpacts = &ln_flow->ofpacts;
+                add_input_flow = false;
+            } else {
+                ofpbuf_clear(local_ofpacts);
+                match_init_catchall(&match);
+                match_set_in_port(&match, ofport);
+                if (tag) {
+                    match_set_dl_vlan(&match, htons(tag));
+                }
             }
 
             /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
             put_load(binding->datapath->tunnel_key, MFF_LOG_DATAPATH, 0, 64,
-                     &ofpacts);
-            put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 32, &ofpacts);
+                     local_ofpacts);
+            put_load(binding->tunnel_key, MFF_LOG_INPORT, 0, 32,
+                     local_ofpacts);
 
             /* Strip vlans. */
             if (tag) {
-                ofpact_put_STRIP_VLAN(&ofpacts);
+                ofpact_put_STRIP_VLAN(local_ofpacts);
             }
 
             /* Resubmit to first logical ingress pipeline table. */
-            put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, &ofpacts);
-            ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, tag ? 150 : 100,
-                            &match, &ofpacts);
+            put_resubmit(OFTABLE_LOG_INGRESS_PIPELINE, local_ofpacts);
+            if (add_input_flow) {
+                ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG,
+                                tag ? 150 : 100, &match, &ofpacts);
+            }
 
             /* Table 33, priority 100.
              * =======================
@@ -341,10 +399,13 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         match_init_catchall(&match);
         ofpbuf_clear(&ofpacts);
 
-        /* Match MFF_LOG_DATAPATH, MFF_LOG_OUTPORT. */
+        /* Match MFF_LOG_DATAPATH, MFF_LOG_OUTPORT and not
+         * OVN_FLAG_LOCALNET */
         match_set_metadata(&match, htonll(binding->datapath->tunnel_key));
         match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0,
                       binding->tunnel_key);
+        match_set_reg_masked(&match, MFF_OVN_FLAGS - MFF_REG0,
+                             0, OVN_FLAG_LOCALNET);
 
         put_encapsulation(mff_ovn_geneve, tun, binding->datapath,
                           binding->tunnel_key, &ofpacts);
@@ -401,6 +462,16 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
             } else if (port->chassis) {
                 sset_add(&remote_chassis, port->chassis->name);
+            } else if (!strcmp(port->type, "localnet")) {
+                const char *network = smap_get(&port->options, "network_name");
+                if (!network) {
+                    continue;
+                }
+                if (!simap_contains(&localnet_to_ofport, network)) {
+                    continue;
+                }
+                put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, &ofpacts);
+                put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
             }
         }
 
@@ -423,6 +494,9 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         if (!sset_is_empty(&remote_chassis)) {
             ofpbuf_clear(&ofpacts);
 
+            match_set_reg_masked(&match, MFF_OVN_FLAGS - MFF_REG0,
+                                 0, OVN_FLAG_LOCALNET);
+
             const char *chassis;
             const struct chassis_tunnel *prev = NULL;
             SSET_FOR_EACH (chassis, &remote_chassis) {
@@ -516,4 +590,22 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
         free(tun);
     }
     hmap_destroy(&tunnels);
+
+    /* Table 0, Priority 100
+     * =====================
+     *
+     * We have now determined the full set of actions needed on input
+     * from a locally accessible network, so we can write the flows for them.
+     */
+    struct shash_node *ln_flow_node, *ln_flow_node_next;
+    struct localnet_flow *ln_flow;
+    SHASH_FOR_EACH_SAFE (ln_flow_node, ln_flow_node_next, &localnet_inputs) {
+        ln_flow = ln_flow_node->data;
+        shash_delete(&localnet_inputs, ln_flow_node);
+        ofctrl_add_flow(flow_table, 0, 100, &ln_flow->match, &ln_flow->ofpacts);
+        ofpbuf_uninit(&ln_flow->ofpacts);
+        free(ln_flow);
+    }
+    shash_destroy(&localnet_inputs);
+    simap_destroy(&localnet_to_ofport);
 }
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 1d812cf..a09182b 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -653,6 +653,16 @@
         tunnels as part of the tunnel key.)
       </dd>
 
+      <dt>OVN flags</dt>
+      <dd>
+        <!-- Keep the following in sync with MFF_OVN_FLAGS in
+             ovn/controller/lflow.h. -->
+        Flows may set bits in Nicira extension register number 5 to aid in
+        processing. Currently, the only flag is a bit that indicates that a
+        packet arrived via a logical port with a type of <code>localnet</code>.
+        This field is not passed across tunnels.
+      </dd>
+
       <dt>VLAN ID</dt>
       <dd>
         The VLAN ID is used as an interface between OVN and containers nested
@@ -678,6 +688,15 @@
     </p>
 
     <p>
+      It's possible that a single ingress physical port maps to multiple
+      logical ports with a type of <code>localnet</code>. In that case, an
+      OVN flag is set to indicate that this packet arrived on a <code>localnet</code>
+      port. This flag is used later to help determine what type of output is
+      appropriate. The logical datapath and logical input port fields will be
+      reset and the packet will be resubmitted to table 16 multiple times.
+    </p>
+
+    <p>
       Packets that originate from a container nested within a VM are treated
       in a slightly different way.  The originating container can be
       distinguished based on the VIF-specific VLAN ID, so the
@@ -764,6 +783,14 @@
     </p>
 
     <p>
+      Note that there is special handling in place to ensure that a packet
+      that arrived on a <code>localnet</code> logical port is never sent over
+      a tunnel to a remote hypervisor. This is to prevent loops and
+      duplicating packets. The only output will be logical ports on the local
+      hypervisor.
+    </p>
+
+    <p>
       Flows in table 33 resemble those in table 32 but for logical ports that
       reside locally rather than remotely.  For unicast logical output ports
       on the local hypervisor, the actions just resubmit to table 34.  For
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index ade8164..6e20593 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -116,13 +116,23 @@
       </p>
 
       <p>
-        There are no other logical port types implemented yet.
+        When this column is set to <em>localnet</em>, this logical port represents a
+        connection to a locally accessible network from each ovn-controller instance.
+        A logical switch can only have a single <em>localnet</em> port attached.
       </p>
     </column>
 
     <column name="options">
-      This column provides key/value settings specific to the logical port
-      <ref column="type"/>.
+      <p>
+        This column provides key/value settings specific to the logical port
+        <ref column="type"/>.
+      </p>
+
+      <p>
+        When <ref column="type"/> is set to <em>localnet</em>, you must set the option
+        <em>network_name</em>. ovn-controller uses local configuration to determine
+        exactly how to connect to this locally accessible network.
+      </p>
     </column>
 
     <column name="parent_name">
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 57e9689..a536833 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -899,13 +899,38 @@
       </p>
 
       <p>
-        There are no other logical port types implemented yet.
+        When this column is set to <em>localnet</em>, this logical port represents a
+        connection to a locally accessible network from each ovn-controller instance.
+        A logical switch can only have a single <em>localnet</em> port attached.
       </p>
     </column>
 
     <column name="options">
-      This column provides key/value settings specific to the logical port
-      <ref column="type"/>.
+      <p>
+        This column provides key/value settings specific to the logical port
+        <ref column="type"/>.
+      </p>
+
+      <p>
+        When <ref column="type"/> is set to <em>localnet</em>, you must set the option
+        <em>network_name</em>. ovn-controller uses the configuration entry
+        <em>ovn-bridge-mappings</em> to determine how to connect to this network.
+        <em>ovn-bridge-mappings</em> is a list of network names mapped to a local
+        OVS bridge that provides access to that network. An example of configuring
+        <em>ovn-bridge-mappings</em> would be:
+      </p>
+
+      <p>
+        <em>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</em>
+      </p>
+
+      <p>
+        Also note that when a logical switch has a <em>localnet</em> port attached,
+        every chassis that may have a local vif attached to that logical switch
+        must have a bridge mapping configured to reach that <em>localnet</em>.
+        Traffic that arrives on a <em>localnet</em> port is never forwarded over a tunnel
+        to another chassis.
+      </p>
     </column>
 
     <column name="tunnel_key">
-- 
2.4.3

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev