It would really help if you could give a packet walkthrough for the case
where the following 2 connections happen at the same time:
1. east-west without NAT between the private IP addresses
2. east-west with floating IP.

You also mentioned in the meeting (if I remember correctly) that you have
to keep a particular interface pinned to a hypervisor. A short description
of that would help.

Also, does this work with overlay networking?



On 22 March 2016 at 14:19, Chandra S Vejendla <csvej...@us.ibm.com> wrote:

> This patch adds distributed floating ip support for ovn. The assumption
> made here is that the external network is a single L2 broadcast domain and
> all the chassis have connectivity to the external network.
>
> Two new tables are added to the LROUTER pipeline: IN_IP_DNAT and IN_IP_SNAT.
> IN_IP_DNAT will modify the dst ip of the packet from floating ip to vm ip.
> IN_IP_SNAT will modify the src ip of the packet from vm ip to floating ip.
>
> Rules in IN_IP_DNAT:
> - Priority 100 rule to set the reg2 to 0x1 if dst & src networks are
>   connected via a router and both the networks are private.
> - Priority 90 rule to modify the dst ip from floating ip to vm ip.
> - Priority 0 rule to go to next table.
>
> Rules in IN_IP_SNAT:
> - Priority 100 rule to skip modifying the src ip when reg2 is set to 0x1
> - Priority 90 rule to modify the src ip from vm ip to floating ip and dst
>   mac to floating ip port mac if the packet is egressing via the gateway
>   port
> - Priority 50 rule to modify the src ip from vm ip to floating ip
> - Priority 0 rule to go to next table.
>
> Priority 100 rules in IN_IP_DNAT and IN_IP_SNAT serve 2 purposes:
> - Avoid NAT when vms in different LSWITCHES connected via a LROUTER talk to
>   each other using private ips.
> - When 2 VMs connected to the same LSWITCH or different LSWITCHES connected
>   via a router try to talk to each other, the dst ip of the packet should
>   first be DNATed and then the src ip should be SNATed.
>
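
To check my understanding of the two tables, here is a minimal sketch of
how I read the rules above. It is only a model, not OVN code: the struct
names, the IP4 macro, and the "private means 10.0.0.0/8" test are all my
own assumptions, and the dst mac rewrite of the priority 90 SNAT rule is
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a host-order IPv4 address. */
#define IP4(a, b, c, d)                                   \
    (((uint32_t) (a) << 24) | ((uint32_t) (b) << 16) |    \
     ((uint32_t) (c) << 8) | (uint32_t) (d))

/* Hypothetical floating-ip association; not the OVN data model. */
struct fip_map {
    uint32_t floating_ip;
    uint32_t vm_ip;
};

struct pkt {
    uint32_t src_ip;
    uint32_t dst_ip;
    bool reg2;              /* Stands in for "reg2 == 0x1". */
};

/* Assumption: every private network sits inside 10.0.0.0/8. */
static bool
is_private(uint32_t ip)
{
    return (ip & IP4(255, 0, 0, 0)) == IP4(10, 0, 0, 0);
}

/* Model of IN_IP_DNAT. */
static void
dnat_stage(struct pkt *p, const struct fip_map *maps, size_t n)
{
    assert(p && (maps || !n));
    if (is_private(p->src_ip) && is_private(p->dst_ip)) {
        p->reg2 = true;                     /* Priority 100. */
        return;
    }
    for (size_t i = 0; i < n; i++) {
        if (p->dst_ip == maps[i].floating_ip) {
            p->dst_ip = maps[i].vm_ip;      /* Priority 90. */
            return;
        }
    }
    /* Priority 0: go to the next table unchanged. */
}

/* Model of IN_IP_SNAT. */
static void
snat_stage(struct pkt *p, const struct fip_map *maps, size_t n)
{
    assert(p && (maps || !n));
    if (p->reg2) {
        return;                             /* Priority 100: skip SNAT. */
    }
    for (size_t i = 0; i < n; i++) {
        if (p->src_ip == maps[i].vm_ip) {
            p->src_ip = maps[i].floating_ip; /* Priority 90 / 50. */
            return;
        }
    }
    /* Priority 0: go to the next table unchanged. */
}
```

With this model, a packet between two private addresses passes through both
stages untouched, while a packet from a vm ip to another VM's floating ip
has its dst rewritten first and its src rewritten second. If that does not
match what the real flows do, that is exactly the part I would like the
walkthrough to cover.
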
> The initial design was to stage DNAT in the ingress pipeline and the SNAT
> in the egress pipeline, but now both the stages are in the ingress
> pipeline. This was done to solve the cases highlighted above [Priority 100
> rules]. There is a need to use information from the DNAT stage when SNAT is
> being processed. This would require an explicit register to be burnt to
> store the information.
> Flows modified in the LSWITCH pipeline
>
> Rules in IN_PORT_SEC:
> - Priority 50 rule to allow packets ingressing the LSWITCH router port
>   with a src mac of floating ip port
>
> Rules in ARP_RSP:
> - Priority 150 rule to respond to arp requests for floating ip. To prevent
>   arp responses for floating ips from all the chassis, the "lport" option
>   is set in the external_ids column of the lflow table. lport will point to
>   the vif-id of the vm that is associated with the floating ip. When
>   ovn-controller is processing the flows, if it sees an lport option set in
>   the external_ids column, it will install this lflow only if the lport is
>   a local port on the chassis.
>
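
As I understand the lport filter, the per-chassis decision reduces to
something like the sketch below. This is my own illustration, assuming a
plain array of local port names instead of the sset the patch passes
around; should_install_flow is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if a logical flow should be installed on this chassis.
 * 'flow_lport' models the "lport" key in the flow's external_ids and is
 * NULL when the key is absent. */
static bool
should_install_flow(const char *flow_lport,
                    const char *const local_ports[], size_t n_local)
{
    assert(local_ports || !n_local);
    if (!flow_lport) {
        return true;        /* Untagged flows are installed everywhere. */
    }
    for (size_t i = 0; i < n_local; i++) {
        if (!strcmp(flow_lport, local_ports[i])) {
            return true;    /* Tagged flow whose lport is local. */
        }
    }
    return false;           /* Tagged flow for a port on another chassis. */
}
```

So only the chassis hosting the associated VM would answer arp for the
floating ip, and every other flow is unaffected.
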
> Rules in L2_LKUP:
> - Priority 50 rule to set the outport to the lrouter port when the dst mac
>   matches the floating ip mac
>
> Rules in OUT_PORT_SEC:
> - Priority 50 rule to allow packets egressing the lrouter port with a mac
>   of a floating ip port.
>
> Had to increase MAX_RESUBMIT_RECURSION from 64 to 96. When 2 VMs connected
> via vm1->LS->LR->LS->LR->LS->vm2 are trying to talk to each other, the
> resubmits are exceeding the existing 64 limit.
>
> When a floating ip is associated with a VM ip, NB will set the options of
> the floating ip lport to "fixed-ip-port=<lport of vif>, router-port=<lport
> of the logical router port>".
>
> If you want to try out this patch with openstack, add the following patch
> [1] to networking-ovn.
>
> [1] https://review.openstack.org/#/c/295547/
> ---
>  ofproto/ofproto-dpif-xlate.c    |   2 +-
>  ovn/controller/binding.c        |  24 ++-
>  ovn/controller/binding.h        |   4 +-
>  ovn/controller/lflow.c          |  21 ++-
>  ovn/controller/lflow.h          |   3 +-
>  ovn/controller/ovn-controller.c |   7 +-
>  ovn/northd/ovn-northd.c         | 360 +++++++++++++++++++++++++++++++++++++---
>  7 files changed, 378 insertions(+), 43 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 67504e8..4a5aae2 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -68,7 +68,7 @@ VLOG_DEFINE_THIS_MODULE(ofproto_dpif_xlate);
>
>  /* Maximum depth of flow table recursion (due to resubmit actions) in a
>   * flow translation. */
> -#define MAX_RESUBMIT_RECURSION 64
> +#define MAX_RESUBMIT_RECURSION 96
>  #define MAX_INTERNAL_RESUBMITS 1   /* Max resbmits allowed using rules in
>                                        internal table. */
>
> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> index d3ca9c9..f4e0f4a 100644
> --- a/ovn/controller/binding.c
> +++ b/ovn/controller/binding.c
> @@ -49,7 +49,7 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
>                           &ovsrec_interface_col_ingress_policing_burst);
>  }
>
> -static void
> +void
>  get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports)
>  {
>      int i;
> @@ -149,7 +149,8 @@ update_qos(const struct ovsrec_interface *iface_rec,
>  void
>  binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
>              const char *chassis_id, struct simap *ct_zones,
> -            unsigned long *ct_zone_bitmap, struct hmap *local_datapaths)
> +            unsigned long *ct_zone_bitmap, struct hmap *local_datapaths,
> +            struct sset *all_lports)
>  {
>      const struct sbrec_chassis *chassis_rec;
>      const struct sbrec_port_binding *binding_rec;
> @@ -167,10 +168,9 @@ binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
>           * We'll remove our chassis from all port binding records below. */
>      }
>
> -    struct sset all_lports = SSET_INITIALIZER(&all_lports);
>      struct shash_node *node;
>      SHASH_FOR_EACH (node, &lports) {
> -        sset_add(&all_lports, node->name);
> +        sset_add(all_lports, node->name);
>      }
>
>      /* Run through each binding record to see if it is resident on this
> @@ -181,10 +181,10 @@ binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
>              = shash_find_and_delete(&lports, binding_rec->logical_port);
>          if (iface_rec
>              || (binding_rec->parent_port && binding_rec->parent_port[0] &&
> -                sset_contains(&all_lports, binding_rec->parent_port))) {
> +                sset_contains(all_lports, binding_rec->parent_port))) {
>              if (binding_rec->parent_port && binding_rec->parent_port[0]) {
>                  /* Add child logical port to the set of all local ports. */
> -                sset_add(&all_lports, binding_rec->logical_port);
> +                sset_add(all_lports, binding_rec->logical_port);
>              }
>              add_local_datapath(local_datapaths, binding_rec);
>              if (iface_rec && ctx->ovs_idl_txn) {
> @@ -217,7 +217,14 @@ binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
>               * to list them in all_lports because we want to allocate
>               * a conntrack zone ID for each one, as we'll be creating
>               * a patch port for each one. */
> -            sset_add(&all_lports, binding_rec->logical_port);
> +            sset_add(all_lports, binding_rec->logical_port);
> +        }
> +        else if (!binding_rec->chassis
> +                           && !strcmp(binding_rec->type, "floating-ip")) {
> +            const char *peer = smap_get(&binding_rec->options, "peer");
> +            if (peer && sset_contains(all_lports, peer)) {
> +                    add_local_datapath(local_datapaths, binding_rec);
> +            }
>          }
>      }
>
> @@ -225,10 +232,9 @@ binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
>          VLOG_DBG("No port binding record for lport %s", node->name);
>      }
>
> -    update_ct_zones(&all_lports, ct_zones, ct_zone_bitmap);
> +    update_ct_zones(all_lports, ct_zones, ct_zone_bitmap);
>
>      shash_destroy(&lports);
> -    sset_destroy(&all_lports);
>  }
>
>  /* Returns true if the database is all cleaned up, false if more work is
> diff --git a/ovn/controller/binding.h b/ovn/controller/binding.h
> index 6e19c10..73e6b0c 100644
> --- a/ovn/controller/binding.h
> +++ b/ovn/controller/binding.h
> @@ -24,11 +24,13 @@ struct hmap;
>  struct ovsdb_idl;
>  struct ovsrec_bridge;
>  struct simap;
> +struct sset;
>
>  void binding_register_ovs_idl(struct ovsdb_idl *);
>  void binding_run(struct controller_ctx *, const struct ovsrec_bridge *br_int,
>                   const char *chassis_id, struct simap *ct_zones,
> -                 unsigned long *ct_zone_bitmap, struct hmap *local_datapaths);
> +                 unsigned long *ct_zone_bitmap, struct hmap *local_datapaths,
> +                 struct sset *all_lports);
>  bool binding_cleanup(struct controller_ctx *, const char *chassis_id);
>
>  #endif /* ovn/binding.h */
> diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> index 0614a54..a59d26f 100644
> --- a/ovn/controller/lflow.c
> +++ b/ovn/controller/lflow.c
> @@ -16,6 +16,7 @@
>  #include <config.h>
>  #include "lflow.h"
>  #include "lport.h"
> +#include "lib/sset.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "ofctrl.h"
>  #include "ofp-actions.h"
> @@ -198,7 +199,8 @@ static void
>  add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
>                    const struct mcgroup_index *mcgroups,
>                    const struct hmap *local_datapaths,
> -                  const struct simap *ct_zones, struct hmap *flow_table)
> +                  const struct simap *ct_zones, struct hmap *flow_table,
> +                  struct sset *local_ports)
>  {
>      uint32_t conj_id_ofs = 1;
>
> @@ -240,6 +242,18 @@ add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
>              }
>          }
>
> +        /* The following check is specifically for floating-ip ports.
> +         * It prevents installing the rule that answers arp requests
> +         * for a floating ip unless the lport in the flow points to a
> +         * port that is resident on this chassis. */
> +        const char *lport = smap_get(&lflow->external_ids, "lport");
> +        if (lport) {
> +            if (!sset_contains(local_ports, lport)) {
> +                continue;
> +            }
> +        }
> +
> +
>          /* Determine translation of logical table IDs to physical table IDs. */
>          uint8_t first_ptable = (ingress
>                                  ? OFTABLE_LOG_INGRESS_PIPELINE
> @@ -416,10 +430,11 @@ void
>  lflow_run(struct controller_ctx *ctx, const struct lport_index *lports,
>            const struct mcgroup_index *mcgroups,
>            const struct hmap *local_datapaths,
> -          const struct simap *ct_zones, struct hmap *flow_table)
> +          const struct simap *ct_zones, struct hmap *flow_table,
> +          struct sset *local_ports)
>  {
>      add_logical_flows(ctx, lports, mcgroups, local_datapaths,
> -                      ct_zones, flow_table);
> +                      ct_zones, flow_table, local_ports);
>      add_neighbor_flows(ctx, lports, flow_table);
>  }
>
> diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
> index ff823d4..3147e5c 100644
> --- a/ovn/controller/lflow.h
> +++ b/ovn/controller/lflow.h
> @@ -41,6 +41,7 @@ struct lport_index;
>  struct mcgroup_index;
>  struct simap;
>  struct uuid;
> +struct sset;
>
>  /* OpenFlow table numbers.
>   *
> @@ -63,7 +64,7 @@ void lflow_run(struct controller_ctx *, const struct lport_index *,
>                 const struct mcgroup_index *,
>                 const struct hmap *local_datapaths,
>                 const struct simap *ct_zones,
> -               struct hmap *flow_table);
> +               struct hmap *flow_table, struct sset *local_ports);
>  void lflow_destroy(void);
>
>  #endif /* ovn/lflow.h */
> diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
> index e52b731..3e0b8e3 100644
> --- a/ovn/controller/ovn-controller.c
> +++ b/ovn/controller/ovn-controller.c
> @@ -33,6 +33,7 @@
>  #include "encaps.h"
>  #include "fatal-signal.h"
>  #include "hmap.h"
> +#include "sset.h"
>  #include "lflow.h"
>  #include "lib/vswitch-idl.h"
>  #include "lport.h"
> @@ -284,12 +285,13 @@ main(int argc, char *argv[])
>
>          const struct ovsrec_bridge *br_int = get_br_int(&ctx);
>          const char *chassis_id = get_chassis_id(ctx.ovs_idl);
> +        struct sset local_ports = SSET_INITIALIZER(&local_ports);
>
>          if (chassis_id) {
>              chassis_run(&ctx, chassis_id);
>              encaps_run(&ctx, br_int, chassis_id);
>              binding_run(&ctx, br_int, chassis_id, &ct_zones, ct_zone_bitmap,
> -                    &local_datapaths);
> +                    &local_datapaths, &local_ports);
>          }
>
>          if (br_int) {
> @@ -306,7 +308,8 @@ main(int argc, char *argv[])
>
>              struct hmap flow_table = HMAP_INITIALIZER(&flow_table);
>              lflow_run(&ctx, &lports, &mcgroups, &local_datapaths,
> -                      &ct_zones, &flow_table);
> +                      &ct_zones, &flow_table, &local_ports);
> +            sset_destroy(&local_ports);
>              if (chassis_id) {
>                  physical_run(&ctx, mff_ovn_geneve,
>                               br_int, chassis_id, &ct_zones, &flow_table,
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 598bbe3..12e7ebd 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -102,9 +102,11 @@ enum ovn_stage {
>      /* Logical router ingress stages. */                              \
>      PIPELINE_STAGE(ROUTER, IN,  ADMISSION,   0, "lr_in_admission")    \
>      PIPELINE_STAGE(ROUTER, IN,  IP_INPUT,    1, "lr_in_ip_input")     \
> -    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  2, "lr_in_ip_routing")   \
> -    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 3, "lr_in_arp_resolve")  \
> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 4, "lr_in_arp_request")  \
> +    PIPELINE_STAGE(ROUTER, IN,  IP_DNAT,     2, "lr_in_ip_dnat")      \
> +    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  3, "lr_in_ip_routing")   \
> +    PIPELINE_STAGE(ROUTER, IN,  IP_SNAT,     4, "lr_in_ip_snat")      \
> +    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 5, "lr_in_arp_resolve")  \
> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 6, "lr_in_arp_request")  \
>                                                                        \
>      /* Logical router egress stages. */                               \
>      PIPELINE_STAGE(ROUTER, OUT, DELIVERY,    0, "lr_out_delivery")
> @@ -479,6 +481,7 @@ struct ovn_port {
>      ovs_be32 ip, mask;          /* 192.168.10.123/24. */
>      ovs_be32 network;           /* 192.168.10.0. */
>      ovs_be32 bcast;             /* 192.168.10.255. */
> +    ovs_be32 fixed_ip;          /* fixed-ip for floating-ip */
>      struct eth_addr mac;
>      struct ovn_port *peer;
>
> @@ -541,6 +544,20 @@ ovn_port_allocate_key(struct ovn_datapath *od)
>                            (1u << 15) - 1, &od->port_key_hint);
>  }
>
> +static const char *
> +get_router_port_for_floating_ip(struct ovn_port *op, struct hmap *ports)
> +{
> +    const char *lrp_name = smap_get(&op->nbs->options, "router-port");
> +    if (lrp_name) {
> +        struct ovn_port *lrp = ovn_port_find(ports, lrp_name);
> +        if (lrp && lrp->nbs)
> +        {
> +            return lrp->json_key;
> +        }
> +    }
> +    return op->json_key;
> +}
> +
>  static void
>  join_logical_ports(struct northd_context *ctx,
>                     struct hmap *datapaths, struct hmap *ports,
> @@ -671,10 +688,35 @@ join_logical_ports(struct northd_context *ctx,
>              op->peer = ovn_port_find(ports, op->nbr->name);
>          }
>      }
> +
> +    HMAP_FOR_EACH (op, key_node, ports) {
> +        if (op->nbs && !strcmp(op->nbs->type, "floating-ip")) {
> +            const char *peer_name = smap_get(&op->nbs->options,
> +                                             "fixed-ip-port");
> +            if (!peer_name) {
> +                continue;
> +            }
> +
> +            struct ovn_port *peer = ovn_port_find(ports, peer_name);
> +            if (!peer || !peer->nbs) {
> +                continue;
> +            }
> +            struct eth_addr mac;
> +            ovs_be32 ip;
> +
> +            /* Not sure if a port with multiple IP addresses can be
> +             * mapped to a floating-ip. For now, just using first ip */
> +            if (ovs_scan(peer->nbs->addresses[0],
> +                     ETH_ADDR_SCAN_FMT" "IP_SCAN_FMT,
> +                     ETH_ADDR_SCAN_ARGS(mac), IP_SCAN_ARGS(&ip))) {
> +                op->fixed_ip = ip;
> +            }
> +        }
> +    }
>  }
>
>  static void
> -ovn_port_update_sbrec(const struct ovn_port *op)
> +ovn_port_update_sbrec(const struct ovn_port *op, struct hmap *ports)
>  {
>      sbrec_port_binding_set_datapath(op->sb, op->od->sb);
>      if (op->nbr) {
> @@ -688,7 +730,20 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>          sbrec_port_binding_set_tag(op->sb, NULL, 0);
>          sbrec_port_binding_set_mac(op->sb, NULL, 0);
>      } else {
> -        if (strcmp(op->nbs->type, "router")) {
> +        if (op->nbs && !strcmp(op->nbs->type, "floating-ip")) {
> +            const char *peer_name = smap_get(&op->nbs->options,
> +                                             "fixed-ip-port");
> +            if (peer_name) {
> +                struct ovn_port *peer = ovn_port_find(ports, peer_name);
> +                if (peer) {
> +                    const struct smap ids = SMAP_CONST1(&ids, "peer",
> +                                             peer_name);
> +                    sbrec_port_binding_set_options(op->sb, &ids);
> +                }
> +            }
> +            sbrec_port_binding_set_type(op->sb, op->nbs->type);
> +        }
> +        else if (strcmp(op->nbs->type, "router")) {
>              sbrec_port_binding_set_type(op->sb, op->nbs->type);
>              sbrec_port_binding_set_options(op->sb, &op->nbs->options);
>          } else {
> @@ -727,7 +782,7 @@ build_ports(struct northd_context *ctx, struct hmap *datapaths,
>       * record based on northbound data.  Also index the in-use tunnel_keys. */
>      struct ovn_port *op, *next;
>      LIST_FOR_EACH_SAFE (op, next, list, &both) {
> -        ovn_port_update_sbrec(op);
> +        ovn_port_update_sbrec(op, ports);
>
>          add_tnlid(&op->od->port_tnlids, op->sb->tunnel_key);
>          if (op->sb->tunnel_key > op->od->port_key_hint) {
> @@ -743,7 +798,7 @@ build_ports(struct northd_context *ctx, struct hmap *datapaths,
>          }
>
>          op->sb = sbrec_port_binding_insert(ctx->ovnsb_txn);
> -        ovn_port_update_sbrec(op);
> +        ovn_port_update_sbrec(op, ports);
>
>          sbrec_port_binding_set_logical_port(op->sb, op->key);
>          sbrec_port_binding_set_tunnel_key(op->sb, tunnel_key);
> @@ -869,6 +924,8 @@ struct ovn_lflow {
>      uint16_t priority;
>      char *match;
>      char *actions;
> +    char *lport; /* If not null, indicates that the flow should be installed
> +                    on a chassis only if the lport is local to that chassis */
>  };
>
>  static size_t
> @@ -900,6 +957,18 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct ovn_datapath *od,
>      lflow->priority = priority;
>      lflow->match = match;
>      lflow->actions = actions;
> +    lflow->lport = NULL;
> +}
> +
> +static void
> +ovn_lflow_lport_set(struct ovn_lflow *lflow, const char *lport_name)
> +{
> +    if (lport_name) {
> +        lflow->lport = xstrdup(lport_name);
> +    }
> +    else {
> +        lflow->lport = NULL;
> +    }
>  }
>
>  /* Adds a row with the specified contents to the Logical_Flow table. */
> @@ -1155,7 +1224,8 @@ build_port_security_ipv6_flow(
>   *   - Priority 80 flow to drop ARP and IPv6 ND packets.
>   */
>  static void
> -build_port_security_nd(struct ovn_port *op, struct hmap *lflows)
> +build_port_security_nd(struct ovn_port *op, struct hmap *lflows,
> +                       struct hmap *ports)
>  {
>      for (size_t i = 0; i < op->nbs->n_port_security; i++) {
>          struct lport_addresses ps;
> @@ -1168,11 +1238,19 @@ build_port_security_nd(struct ovn_port *op, struct hmap *lflows)
>
>          bool no_ip = !(ps.n_ipv4_addrs || ps.n_ipv6_addrs);
>          struct ds match = DS_EMPTY_INITIALIZER;
> +
> +        const char *inport = NULL;
> +        if (!strcmp(op->nbs->type, "floating-ip")) {
> +            inport = get_router_port_for_floating_ip(op, ports);
> +        }
> +        else {
> +            inport = op->json_key;
> +        }
>
>          if (ps.n_ipv4_addrs || no_ip) {
>              ds_put_format(
>                  &match, "inport == %s && eth.src == "ETH_ADDR_FMT" && arp.sha == "
> -                ETH_ADDR_FMT, op->json_key, ETH_ADDR_ARGS(ps.ea),
> +                ETH_ADDR_FMT, inport, ETH_ADDR_ARGS(ps.ea),
>                  ETH_ADDR_ARGS(ps.ea));
>
>              if (ps.n_ipv4_addrs) {
> @@ -1228,7 +1306,7 @@ build_port_security_nd(struct ovn_port *op, struct hmap *lflows)
>   */
>  static void
>  build_port_security_ip(enum ovn_pipeline pipeline, struct ovn_port *op,
> -                       struct hmap *lflows)
> +                       struct hmap *lflows, struct hmap *ports)
>  {
>      char *port_direction;
>      enum ovn_stage stage;
> @@ -1250,16 +1328,25 @@ build_port_security_ip(enum ovn_pipeline pipeline, struct ovn_port *op,
>              continue;
>          }
>
> +        const char *port = NULL;
> +        if (!strcmp(op->nbs->type, "floating-ip")) {
> +            port = get_router_port_for_floating_ip(op, ports);
> +        }
> +        else {
> +            port = op->json_key;
> +        }
> +
> +
>          if (ps.n_ipv4_addrs) {
>              struct ds match = DS_EMPTY_INITIALIZER;
>              if (pipeline == P_IN) {
>                  ds_put_format(&match, "inport == %s && eth.src == "ETH_ADDR_FMT
> -                              " && ip4.src == {0.0.0.0, ", op->json_key,
> +                              " && ip4.src == {0.0.0.0, ", port,
>                                ETH_ADDR_ARGS(ps.ea));
>              } else {
>                  ds_put_format(&match, "outport == %s && eth.dst == "ETH_ADDR_FMT
>                                " && ip4.dst == {255.255.255.255, 224.0.0.0/4, ",
> -                              op->json_key, ETH_ADDR_ARGS(ps.ea));
> +                              port, ETH_ADDR_ARGS(ps.ea));
>              }
>
>              for (int i = 0; i < ps.n_ipv4_addrs; i++) {
> @@ -1525,18 +1612,26 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>              continue;
>          }
>
> +        const char *inport = NULL;
> +        if (!strcmp(op->nbs->type, "floating-ip")) {
> +            inport = get_router_port_for_floating_ip(op, ports);
> +        }
> +        else {
> +            inport = op->json_key;
> +        }
> +
>          struct ds match = DS_EMPTY_INITIALIZER;
> -        ds_put_format(&match, "inport == %s", op->json_key);
> +        ds_put_format(&match, "inport == %s", inport);
>          build_port_security_l2(
> -            "eth.src", op->nbs->port_security, op->nbs->n_port_security,
> +                       "eth.src", op->nbs->port_security, op->nbs->n_port_security,
>              &match);
>          ovn_lflow_add(lflows, op->od, S_SWITCH_IN_PORT_SEC_L2, 50,
>                        ds_cstr(&match), "next;");
>          ds_destroy(&match);
>
>          if (op->nbs->n_port_security) {
> -            build_port_security_ip(P_IN, op, lflows);
> -            build_port_security_nd(op, lflows);
> +            build_port_security_ip(P_IN, op, lflows, ports);
> +            build_port_security_nd(op, lflows, ports);
>          }
>      }
>
> @@ -1578,10 +1673,26 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>           *  - port is up or
>           *  - port type is router
>           */
> -        if (!lport_is_up(op->nbs) && strcmp(op->nbs->type, "router")) {
> +        if (!lport_is_up(op->nbs) && strcmp(op->nbs->type, "router") &&
> +                                     strcmp(op->nbs->type, "floating-ip")) {
>              continue;
>          }
>
> +        uint16_t priority = 0;
> +        if (!strcmp(op->nbs->type, "floating-ip")) {
> +            const char *peer_name = smap_get(&op->nbs->options,
> +                                             "fixed-ip-port");
> +            if (peer_name) {
> +                priority = 150;
> +            }
> +            else {
> +                priority = 50;
> +            }
> +        }
> +        else {
> +            priority = 50;
> +        }
> +
>          for (size_t i = 0; i < op->nbs->n_addresses; i++) {
>              struct lport_addresses laddrs;
>              if (!extract_lport_addresses(op->nbs->addresses[i], &laddrs,
> @@ -1606,8 +1717,20 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>                      ETH_ADDR_ARGS(laddrs.ea),
>                      ETH_ADDR_ARGS(laddrs.ea),
>                      IP_ARGS(laddrs.ipv4_addrs[j].addr));
> -                ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_RSP, 50,
> +                ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_RSP, priority,
>                                match, actions);
> +                if (!strcmp(op->nbs->type, "floating-ip")) {
> +                    const char *peer_name = smap_get(&op->nbs->options,
> +                                                     "fixed-ip-port");
> +                    struct ovn_lflow *lflow = ovn_lflow_find(lflows, op->od,
> +                            S_SWITCH_IN_ARP_RSP, priority, match, actions);
> +                    /* Setting the lport option in external_ids of lflow, so
> +                     * that the controller will pick up this flow only if the
> +                     * lport is a local port on the chassis */
> +                    if (lflow) {
> +                        ovn_lflow_lport_set(lflow, peer_name);
> +                    }
> +                }
>                  free(match);
>                  free(actions);
>              }
> @@ -1662,8 +1785,15 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>                  ds_put_format(&match, "eth.dst == "ETH_ADDR_FMT,
>                                ETH_ADDR_ARGS(mac));
>
> +                const char *outport = NULL;
> +                if (!strcmp(op->nbs->type, "floating-ip")) {
> +                    outport = get_router_port_for_floating_ip(op, ports);
> +                }
> +                else {
> +                    outport = op->json_key;
> +                }
>                  ds_init(&actions);
> -                ds_put_format(&actions, "outport = %s; output;", op->json_key);
> +                ds_put_format(&actions, "outport = %s; output;", outport);
>                  ovn_lflow_add(lflows, op->od, S_SWITCH_IN_L2_LKUP, 50,
>                                ds_cstr(&match), ds_cstr(&actions));
>                  ds_destroy(&actions);
> @@ -1722,8 +1852,15 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>              continue;
>          }
>
> +        const char *outport = NULL;
> +        if (!strcmp(op->nbs->type, "floating-ip")) {
> +            outport = get_router_port_for_floating_ip(op, ports);
> +        }
> +        else {
> +            outport = op->json_key;
> +        }
>          struct ds match = DS_EMPTY_INITIALIZER;
> -        ds_put_format(&match, "outport == %s", op->json_key);
> +        ds_put_format(&match, "outport == %s", outport);
>          if (lport_is_enabled(op->nbs)) {
>              build_port_security_l2("eth.dst", op->nbs->port_security,
>                                     op->nbs->n_port_security, &match);
> @@ -1737,7 +1874,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports,
>          ds_destroy(&match);
>
>          if (op->nbs->n_port_security) {
> -            build_port_security_ip(P_OUT, op, lflows);
> +            build_port_security_ip(P_OUT, op, lflows, ports);
>          }
>      }
>  }
> @@ -1819,6 +1956,48 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          free(match);
>      }
>
> +    /* Logical router ingress table 0: match (priority 50).
> +     * The following rules allow packets with mac address
> +     * of floating ip ports ingressing on a logical router port */
> +    HMAP_FOR_EACH (od, key_node, datapaths) {
> +        if (!(od->nbr && od->gateway_port)) {
> +            continue;
> +        }
> +        struct ovn_port *lrp = od->gateway_port->peer;
> +        if (!lrp) {
> +            VLOG_ERR("No peer port for logical router port %s",
> +                        od->gateway_port->key);
> +            continue;
> +        }
> +        const struct nbrec_logical_switch *nbs = lrp->od->nbs;
> +        for (size_t i = 0 ; i < nbs->n_ports ; i++) {
> +            if (nbs->ports[i] && !strcmp(nbs->ports[i]->type, "floating-ip")) {
> +                const char *peer_name = smap_get(&nbs->ports[i]->options,
> +                                                 "fixed-ip-port");
> +                const char *lrp_name = smap_get(&nbs->ports[i]->options,
> +                                                 "router-port");
> +                if (!peer_name || !lrp_name) {
> +                    continue;
> +                }
> +                if (strcmp(lrp_name, lrp->key)) {
> +                    continue;
> +                }
> +                for (size_t j = 0; j < nbs->ports[i]->n_addresses; j++) {
> +                    struct eth_addr mac;
> +                    char *match;
> +                    if (eth_addr_from_string(nbs->ports[i]->addresses[j], &mac)) {
> +                        match = xasprintf("(eth.mcast || eth.dst == "
> +                           ETH_ADDR_FMT") && inport == %s",
> +                           ETH_ADDR_ARGS(mac), od->gateway_port->json_key);
> +                        ovn_lflow_add(lflows, od, S_ROUTER_IN_ADMISSION,
> +                                      50, match, "next;");
> +                        free(match);
> +                    }
> +                }
> +            }
> +        }
> +    }
> +
>      /* Logical router ingress table 1: IP Input. */
>      HMAP_FOR_EACH (od, key_node, datapaths) {
>          if (!od->nbr) {
> @@ -1928,7 +2107,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          free(match);
>      }
>
> -    /* Logical router ingress table 2: IP Routing.
> +    /* Logical router ingress table 3: IP Routing.
>       *
>       * A packet that arrives at this table is an IP packet that should be
>       * routed to the address in ip4.dst. This table sets outport to the correct
> @@ -1953,7 +2132,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>      }
>      /* XXX destination unreachable */
>
> -    /* Local router ingress table 3: ARP Resolution.
> +    /* Local router ingress table 5: ARP Resolution.
>       *
>       * Any packet that reaches this table is an IP packet whose next-hop IP
>       * address is in reg0. (ip4.dst is the final destination.) This table
> @@ -2021,7 +2200,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>                        "get_arp(outport, reg0); next;");
>      }
>
> -    /* Local router ingress table 4: ARP request.
> +    /* Local router ingress table 6: ARP request.
>       *
>       * In the common case where the Ethernet destination has been resolved,
>       * this table outputs the packet (priority 100).  Otherwise, it composes
> @@ -2042,6 +2221,131 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          ovn_lflow_add(lflows, od, S_ROUTER_IN_ARP_REQUEST, 0, "1", "output;");
>      }
>
> +    /* DNAT & SNAT tables.
> +     *
> +     * Priority 100 rule in IN_IP_DNAT to set reg2 to 0x1 if dst ip &
> +     * src ip networks are connected via the same router.
> +     *
> +     * Priority 100 rule in IN_IP_SNAT to skip modifying the src ip when
> +     * reg2 is set to 0x1.
> +     *
> +     * Priority 90 rule in IN_IP_DNAT to modify dst ip from floating ip
> +     * to vm ip.
> +     *
> +     * Priority 90 rule in IN_IP_SNAT to modify src ip from vm ip to
> +     * floating ip and src mac to floating ip port mac if the packet is
> +     * egressing via the gateway port.
> +     *
> +     * Priority 50 rule in IN_IP_SNAT to modify src ip from vm ip to
> +     * floating ip.
> +     *
> +     * Priority 0 rule to go to next table if none of the above rules match.
> +     */
> +
> +    HMAP_FOR_EACH (op, key_node, ports) {
> +        if (!(op->nbs && !strcmp(op->nbs->type, "floating-ip"))) {
> +            continue;
> +        }
> +        const char *peer_name = smap_get(&op->nbs->options, "fixed-ip-port");
> +        const char *lrp_name = smap_get(&op->nbs->options, "router-port");
> +        if (!peer_name || !lrp_name) {
> +            continue;
> +        }
> +        struct ovn_port *lrp = ovn_port_find(ports, lrp_name);
> +        if (!lrp) {
> +            continue;
> +        }
> +        for (size_t i = 0; i < op->nbs->n_addresses; i++) {
> +            char *match;
> +            char *actions;
> +            struct eth_addr mac;
> +            ovs_be32 ip;
> +            if (ovs_scan(op->nbs->addresses[i],
> +                     ETH_ADDR_SCAN_FMT" "IP_SCAN_FMT,
> +                     ETH_ADDR_SCAN_ARGS(mac), IP_SCAN_ARGS(&ip))) {
> +                match = xasprintf("ip4.dst == "IP_FMT"", IP_ARGS(ip));
> +                actions = xasprintf("ip4.dst = "IP_FMT"; inport = \"\"; next;",
> +                                     IP_ARGS(op->fixed_ip));
> +                ovn_lflow_add(lflows, lrp->peer->od, S_ROUTER_IN_IP_DNAT,
> +                                90, match, actions);
> +                free(match);
> +                free(actions);
> +
> +                match = xasprintf("(ip4.src == "IP_FMT") && outport == %s",
> +                        IP_ARGS(op->fixed_ip), lrp->peer->json_key);
> +                actions = xasprintf("eth.src = "ETH_ADDR_FMT";"
> +                                    " ip4.src = "IP_FMT"; next;",
> +                                    ETH_ADDR_ARGS(mac), IP_ARGS(ip));
> +                ovn_lflow_add(lflows, lrp->peer->od,
> +                              S_ROUTER_IN_IP_SNAT, 90, match, actions);
> +                free(match);
> +                free(actions);
> +
> +                match = xasprintf("ip4.src == "IP_FMT"",
> +                                   IP_ARGS(op->fixed_ip));
> +                actions = xasprintf("ip4.src = "IP_FMT"; next;", IP_ARGS(ip));
> +                ovn_lflow_add(lflows, lrp->peer->od,
> +                              S_ROUTER_IN_IP_SNAT, 50, match, actions);
> +                free(match);
> +                free(actions);
> +            }
> +        }
> +    }
> +
> +    HMAP_FOR_EACH(od, key_node, datapaths) {
> +        if (!od->nbr) {
> +            continue;
> +        }
> +
> +        /* Default rules for DNAT & SNAT tables with priority 0. */
> +        if (od->gateway_port) {
> +            ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_DNAT, 0, "1", "next;");
> +            ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_SNAT, 0, "1", "next;");
> +        }
> +
> +        /* The following rules in DNAT & SNAT tables will prevent NAT when the
> +         * src & dst ips belong to private networks that are connected via a
> +         * router. */
> +        bool add_snat_flow = false;
> +        for (size_t j = 0; j < od->nbr->n_ports; j++) {
> +            if (od->gateway_port && !strcmp(od->nbr->ports[j]->name,
> +                                            od->gateway_port->key)) {
> +                continue;
> +            }
> +            ovs_be32 ip1, ip2, mask1, mask2;
> +            char *error = ip_parse_masked(od->nbr->ports[j]->network, &ip1, &mask1);
> +            if (error || mask1 == OVS_BE32_MAX || !ip_is_cidr(mask1)) {
> +                free(error);
> +                continue;
> +            }
> +            for (size_t l = 0; l < od->nbr->n_ports; l++) {
> +                if ((l == j) || (od->gateway_port &&
> +                                    !strcmp(od->nbr->ports[l]->name,
> +                                            od->gateway_port->key))) {
> +                    continue;
> +                }
> +                char *error = ip_parse_masked(od->nbr->ports[l]->network, &ip2, &mask2);
> +                if (error || mask2 == OVS_BE32_MAX || !ip_is_cidr(mask2)) {
> +                    free(error);
> +                    continue;
> +                }
> +                char *match = xasprintf("(ip4.src == "IP_FMT"/"IP_FMT") && "
> +                                   "(ip4.dst == "IP_FMT"/"IP_FMT")",
> +                                   IP_ARGS(ip1 & mask1), IP_ARGS(mask1),
> +                                   IP_ARGS(ip2 & mask2), IP_ARGS(mask2));
> +                ovn_lflow_add(lflows, od,
> +                              S_ROUTER_IN_IP_DNAT, 100, match,
> +                              "reg2 = 1; next;");
> +                free(match);
> +                add_snat_flow = true;
> +            }
> +        }
> +        if (add_snat_flow) {
> +            ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_SNAT, 100,
> +                          "reg2 == 1", "next;");
> +        }
> +    }
> +
>      /* Logical router egress table 0: Delivery (priority 100).
>       *
>       * Priority 100 rules deliver packets to enabled logical ports. */
> @@ -2111,8 +2415,12 @@ build_lflows(struct northd_context *ctx, struct hmap *datapaths,
>          sbrec_logical_flow_set_match(sbflow, lflow->match);
>          sbrec_logical_flow_set_actions(sbflow, lflow->actions);
>
> -        const struct smap ids = SMAP_CONST1(&ids, "stage-name",
> -                                            ovn_stage_to_str(lflow->stage));
> +        struct smap ids;
> +        smap_init(&ids);
> +        if (lflow->lport) {
> +            smap_add(&ids, "lport", lflow->lport);
> +        }
> +        smap_add(&ids, "stage-name", ovn_stage_to_str(lflow->stage));
>          sbrec_logical_flow_set_external_ids(sbflow, &ids);
>
>          ovn_lflow_destroy(&lflows, lflow);
> --
> 2.6.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>