On Mon, Mar 25, 2013 at 08:18:02AM +0900, Simon Horman wrote: > Recirculation is a technique to allow a frame to re-enter > frame processing. This is intended to be used after actions > have been applied to the frame with modify the frame in > some way that makes it possible for richer processing to occur. > > An example is and indeed targeted use case is MPLS. If an MPLS frame has an > mpls_pop action applied with the IPv4 ethernet type then it becomes > possible to decode the IPv4 portion of the frame. This may be used to > construct a facet that modifies the IPv4 portion of the frame. This is not > possible prior to the mpls_pop action as the contents of the frame after > the MPLS stack is not known to be IPv4. > > Design: > * New recirculation action. > > ovs-vswitchd adds a recirculation action to the end of a list of > datapath actions for a flow when the actions are truncated because > insufficient flow match information is available to add the next > OpenFlow action. The recirculation action is preceded by an action > to set the skb_mark to an id which can be used to scope a facet lookup > of a recirculated packet. > > e.g. pop_mpls(0x0800),dec_ttl becomes > pop_mpls(0x800),set(skb_mark(id)),recirculate > > * Datapath behaviour > > Then the datapath encounters a recirculate action it: > + Recalculates the flow key based on the packet > which will typically have been modified by previous actions > + As the recirculate action is preceded by a set(skb_mark(id)) action, > the new match key will now include skb_mark=id. > + Performs a lookup using the new match key > + Processes the packet if a facet matches the key or; > + Makes an upcall if necessary > > Note that the implementation corresponding to the design above > currently requires facets to be used. It does not support packet out. > > Signed-off-by: Simon Horman <ho...@verge.net.au>
Hi Jesse, a gentle nudge for a review of this. I believe it addresses all of the issues you raised in regards to rfc1. > --- > > This patch depends on "[PATCH v2.23] datapath: Add basic MPLS support to > kernel". > There are currently no other patches in the recirculation series. > > For reference this patch is available in git at: > git://github.com/horms/openvswitch.git devel/mpls-recirculate.rfc2 > > TODO: > > * More sensible handling of recirculation IDs [ovs-vswtichd] > * More sensible lookup of facets based on recirculation IDs [ovs-vswtichd] > * Look more closely at facet revalidation [ovs-vswtichd] > * Handle recirculation for packet out > * Do not force misses all MPLS frame misses to be handled with facets. > Instead either more accurately detect if recirculation will occur > or allow recirculation to occur without facets. > > rfc2 > * As suggested by Jesse Gross > - Update for changes to ovs_dp_process_received_packet() > to no longer check if OVS_CB(skb)->flow is pre-initialised. > - Do not add spurious printk debugging to ovs_execute_actions() > - Do not add spurious debugging messages to commit_set_nw_action() > - Correct typo in comment above commit_odp_actions(). > - Do not execute recirculation in ovs-vswitchd, rather allow > the datapath to make an upcall when a recirculation action > is encountered on execute. > + This implicitly breaks support for recirculation without facets, > so for now force all misses of MPLS frames to be handled with > a facet; and treat handling of recirculation for packet_out as > a todo item. > - Use skb_mark for recirculation_id in match. This avoids > both expanding the match and including a recirculation_id parameter > with the recirculation action: set_skb_mark should be used before > the recirculation action. > - Tidy up ownership of skb in ovs_execute_actions > > rfc1 > * Initial post > > --- > datapath/actions.c | 9 +- > datapath/datapath.c | 98 +++++++++++------- > datapath/datapath.h | 2 +- > include/linux/openvswitch.h | 4 + > lib/dpif-netdev.c | 89 +++++++++++----- > lib/flow.h | 3 + > lib/odp-util.c | 15 ++- > lib/odp-util.h | 1 + > ofproto/ofproto-dpif.c | 236 > +++++++++++++++++++++++++++++++++++++------ > 9 files changed, 359 insertions(+), 98 deletions(-) > > diff --git a/datapath/actions.c b/datapath/actions.c > index be2b8a3..e69fcb1 100644 > --- a/datapath/actions.c > +++ b/datapath/actions.c > @@ -609,6 +609,9 @@ static int do_execute_actions(struct datapath *dp, struct > sk_buff *skb, > case OVS_ACTION_ATTR_SAMPLE: > err = sample(dp, skb, a); > break; > + > + case OVS_ACTION_ATTR_RECIRCULATE: > + return 1; > } > > if (unlikely(err)) { > @@ -649,7 +652,7 @@ static int loop_suppress(struct datapath *dp, struct > sw_flow_actions *actions) > } > > /* Execute a list of actions against 'skb'. */ > -int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb) > +struct sk_buff *ovs_execute_actions(struct datapath *dp, struct sk_buff *skb) > { > struct sw_flow_actions *acts = > rcu_dereference(OVS_CB(skb)->flow->sf_acts); > struct loop_counter *loop; > @@ -668,6 +671,8 @@ int ovs_execute_actions(struct datapath *dp, struct > sk_buff *skb) > OVS_CB(skb)->tun_key = NULL; > error = do_execute_actions(dp, skb, acts->actions, > acts->actions_len, false); > + if (likely(error <= 0)) > + skb = NULL; > > /* Check whether sub-actions looped too much. */ > if (unlikely(loop->looping)) > @@ -678,5 +683,5 @@ out_loop: > if (!--loop->count) > loop->looping = false; > > - return error; > + return (error < 0) ? ERR_PTR(error) : skb; > } > diff --git a/datapath/datapath.c b/datapath/datapath.c > index 9c30168..fb92f08 100644 > --- a/datapath/datapath.c > +++ b/datapath/datapath.c > @@ -195,52 +195,63 @@ void ovs_dp_detach_port(struct vport *p) > ovs_vport_del(p); > } > > +#define MAX_RECIRCULATION_DEPTH 4 /* Completely arbitrary */ > + > /* Must be called with rcu_read_lock. */ > void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb) > { > struct datapath *dp = p->dp; > - struct sw_flow *flow; > struct dp_stats_percpu *stats; > - struct sw_flow_key key; > - u64 *stats_counter; > - int error; > - int key_len; > + int limit = MAX_RECIRCULATION_DEPTH; > > stats = this_cpu_ptr(dp->stats_percpu); > > - /* Extract flow from 'skb' into 'key'. */ > - error = ovs_flow_extract(skb, p->port_no, &key, &key_len); > - if (unlikely(error)) { > - kfree_skb(skb); > - return; > - } > + while (1) { > + u64 *stats_counter; > + struct sw_flow *flow; > + struct sw_flow_key key; > + int error, key_len; > > - /* Look up flow. */ > - flow = ovs_flow_tbl_lookup(rcu_dereference(dp->table), &key, key_len); > - if (unlikely(!flow)) { > - struct dp_upcall_info upcall; > - > - upcall.cmd = OVS_PACKET_CMD_MISS; > - upcall.key = &key; > - upcall.userdata = NULL; > - upcall.portid = p->upcall_portid; > - ovs_dp_upcall(dp, skb, &upcall); > - consume_skb(skb); > - stats_counter = &stats->n_missed; > - goto out; > - } > + /* Extract flow from 'skb' into 'key'. */ > + error = ovs_flow_extract(skb, p->port_no, &key, &key_len); > + if (unlikely(error)) { > + kfree_skb(skb); > + return; > + } > > - OVS_CB(skb)->flow = flow; > + /* Look up flow. */ > + flow = ovs_flow_tbl_lookup(rcu_dereference(dp->table), > + &key, key_len); > + if (unlikely(!flow)) { > + struct dp_upcall_info upcall; > + > + upcall.cmd = OVS_PACKET_CMD_MISS; > + upcall.key = &key; > + upcall.userdata = NULL; > + upcall.portid = p->upcall_portid; > + ovs_dp_upcall(dp, skb, &upcall); > + consume_skb(skb); > + stats_counter = &stats->n_missed; > + skb = NULL; > + } else { > + OVS_CB(skb)->flow = flow; > + stats_counter = &stats->n_hit; > + ovs_flow_used(flow, skb); > + skb = ovs_execute_actions(dp, skb); > + } > > - stats_counter = &stats->n_hit; > - ovs_flow_used(OVS_CB(skb)->flow, skb); > - ovs_execute_actions(dp, skb); > + /* Update datapath statistics. */ > + u64_stats_update_begin(&stats->sync); > + (*stats_counter)++; > + u64_stats_update_end(&stats->sync); > > -out: > - /* Update datapath statistics. */ > - u64_stats_update_begin(&stats->sync); > - (*stats_counter)++; > - u64_stats_update_end(&stats->sync); > + if (likely(!skb) || IS_ERR(skb)) { > + break; > + } else if (unlikely(!limit--)) { > + kfree_skb(skb); > + return; > + } > + } > } > > static struct genl_family dp_packet_genl_family = { > @@ -730,6 +741,7 @@ static int validate_and_copy_actions(const struct nlattr > *attr, > [OVS_ACTION_ATTR_POP_MPLS] = sizeof(__be16), > [OVS_ACTION_ATTR_PUSH_VLAN] = sizeof(struct > ovs_action_push_vlan), > [OVS_ACTION_ATTR_POP_VLAN] = 0, > + [OVS_ACTION_ATTR_RECIRCULATE] = 0, > [OVS_ACTION_ATTR_SET] = (u32)-1, > [OVS_ACTION_ATTR_SAMPLE] = (u32)-1 > }; > @@ -808,6 +820,9 @@ static int validate_and_copy_actions(const struct nlattr > *attr, > skip_copy = true; > break; > > + case OVS_ACTION_ATTR_RECIRCULATE: > + break; > + > default: > return -EINVAL; > } > @@ -905,12 +920,23 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, > struct genl_info *info) > goto err_unlock; > > local_bh_disable(); > - err = ovs_execute_actions(dp, packet); > + packet = ovs_execute_actions(dp, packet); > + if (unlikely(packet) && !IS_ERR(packet)) { > + struct vport *vport; > + vport = ovs_lookup_vport(dp, flow->key.phy.in_port); > + if (!vport) { > + err = -ENODEV; > + goto err_unlock; > + } > + /* Recirculate */ > + ovs_dp_process_received_packet(vport, packet); > + packet = NULL; > + } > local_bh_enable(); > rcu_read_unlock(); > > ovs_flow_free(flow); > - return err; > + return PTR_ERR(packet); > > err_unlock: > rcu_read_unlock(); > diff --git a/datapath/datapath.h b/datapath/datapath.h > index 7665742..8da5e8a 100644 > --- a/datapath/datapath.h > +++ b/datapath/datapath.h > @@ -188,7 +188,7 @@ const char *ovs_dp_name(const struct datapath *dp); > struct sk_buff *ovs_vport_cmd_build_info(struct vport *, u32 portid, u32 seq, > u8 cmd); > > -int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb); > +struct sk_buff *ovs_execute_actions(struct datapath *dp, struct sk_buff > *skb); > > unsigned char *skb_cb_mpls_stack(const struct sk_buff *skb); > #endif /* datapath.h */ > diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h > index 0dd3ee4..abb4e00 100644 > --- a/include/linux/openvswitch.h > +++ b/include/linux/openvswitch.h > @@ -513,6 +513,9 @@ struct ovs_action_push_vlan { > * indicate the new packet contents This could potentially still be > * %ETH_P_MPLS_* if the resulting MPLS label stack is not empty. If there > * is no MPLS label stack, as determined by ethertype, no action is taken. > + * @OVS_ACTION_ATTR_RECIRCULATE: Restart processing of packet. > + * The packet must have been modified by a previous action in such a way > + * that it does not match its original flow again. > * > * Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not > all > * fields within a header are modifiable, e.g. the IPv4 protocol and fragment > @@ -529,6 +532,7 @@ enum ovs_action_attr { > OVS_ACTION_ATTR_SAMPLE, /* Nested OVS_SAMPLE_ATTR_*. */ > OVS_ACTION_ATTR_PUSH_MPLS, /* struct ovs_action_push_mpls. */ > OVS_ACTION_ATTR_POP_MPLS, /* __be16 ethertype. */ > + OVS_ACTION_ATTR_RECIRCULATE, /* No argument */ > __OVS_ACTION_ATTR_MAX > }; > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index e4a2f75..31255f6 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -152,10 +152,14 @@ static int dpif_netdev_open(const struct dpif_class *, > const char *name, > static int dp_netdev_output_userspace(struct dp_netdev *, const struct > ofpbuf *, > int queue_no, const struct flow *, > const struct nlattr *userdata); > -static void dp_netdev_execute_actions(struct dp_netdev *, > +static bool dp_netdev_execute_actions(struct dp_netdev *, > struct ofpbuf *, struct flow *, > const struct nlattr *actions, > - size_t actions_len); > + size_t actions_len, > + uint32_t *skb_mark); > +static void dp_netdev_port_input(struct dp_netdev *dp, > + struct dp_netdev_port *port, > + struct ofpbuf *packet); > > static struct dpif_netdev * > dpif_netdev_cast(const struct dpif *dpif) > @@ -940,8 +944,22 @@ dpif_netdev_execute(struct dpif *dpif, const struct > dpif_execute *execute) > error = dpif_netdev_flow_from_nlattrs(execute->key, execute->key_len, > &key); > if (!error) { > - dp_netdev_execute_actions(dp, ©, &key, > - execute->actions, execute->actions_len); > + bool recirculate; > + uint32_t skb_mark = 0; > + > + recirculate = dp_netdev_execute_actions(dp, ©, &key, > + execute->actions, > + execute->actions_len, > + &skb_mark); > + if (recirculate) { > + struct dp_netdev_port *port; > + port = (key.in_port < MAX_PORTS) ? dp->ports[key.in_port] : NULL; > + if (port) { > + dp_netdev_port_input(dp, port, ©); > + return 0; > + } > + error = ENOENT; > + } > } > > ofpbuf_uninit(©); > @@ -1028,23 +1046,32 @@ static void > dp_netdev_port_input(struct dp_netdev *dp, struct dp_netdev_port *port, > struct ofpbuf *packet) > { > - struct dp_netdev_flow *flow; > - struct flow key; > + bool recirculate; > + uint32_t skb_mark = 0; > + int limit = MAX_RECIRCULATION_DEPTH; > > - if (packet->size < ETH_HEADER_LEN) { > - return; > - } > - flow_extract(packet, 0, 0, NULL, port->port_no, &key); > - flow = dp_netdev_lookup_flow(dp, &key); > - if (flow) { > - dp_netdev_flow_used(flow, packet); > - dp_netdev_execute_actions(dp, packet, &key, > - flow->actions, flow->actions_len); > - dp->n_hit++; > - } else { > - dp->n_missed++; > - dp_netdev_output_userspace(dp, packet, DPIF_UC_MISS, &key, NULL); > - } > + do { > + struct dp_netdev_flow *flow; > + struct flow key; > + > + if (packet->size < ETH_HEADER_LEN) { > + return; > + } > + flow_extract(packet, 0, skb_mark, NULL, port->port_no, &key); > + flow = dp_netdev_lookup_flow(dp, &key); > + if (flow) { > + dp_netdev_flow_used(flow, packet); > + recirculate = dp_netdev_execute_actions(dp, packet, &key, > + flow->actions, > + flow->actions_len, > + &skb_mark); > + dp->n_hit++; > + } else { > + dp->n_missed++; > + dp_netdev_output_userspace(dp, packet, DPIF_UC_MISS, &key, NULL); > + recirculate = false; > + } > + } while (recirculate && limit--); > } > > static void > @@ -1163,6 +1190,7 @@ dp_netdev_sample(struct dp_netdev *dp, > const struct nlattr *subactions = NULL; > const struct nlattr *a; > size_t left; > + uint32_t skb_mark; > > NL_NESTED_FOR_EACH_UNSAFE (a, left, action) { > int type = nl_attr_type(a); > @@ -1186,7 +1214,7 @@ dp_netdev_sample(struct dp_netdev *dp, > } > > dp_netdev_execute_actions(dp, packet, key, nl_attr_get(subactions), > - nl_attr_get_size(subactions)); > + nl_attr_get_size(subactions), &skb_mark); > } > > static void > @@ -1201,7 +1229,8 @@ dp_netdev_action_userspace(struct dp_netdev *dp, > } > > static void > -execute_set_action(struct ofpbuf *packet, const struct nlattr *a) > +execute_set_action(struct ofpbuf *packet, const struct nlattr *a, > + uint32_t *skb_mark) > { > enum ovs_key_attr type = nl_attr_type(a); > const struct ovs_key_ipv4 *ipv4_key; > @@ -1211,11 +1240,14 @@ execute_set_action(struct ofpbuf *packet, const > struct nlattr *a) > > switch (type) { > case OVS_KEY_ATTR_PRIORITY: > - case OVS_KEY_ATTR_SKB_MARK: > case OVS_KEY_ATTR_TUNNEL: > /* not implemented */ > break; > > + case OVS_KEY_ATTR_SKB_MARK: > + *skb_mark = nl_attr_get_u32(a); > + break; > + > case OVS_KEY_ATTR_ETHERNET: > dp_netdev_set_dl(packet, > nl_attr_get_unspec(a, sizeof(struct ovs_key_ethernet))); > @@ -1263,11 +1295,11 @@ execute_set_action(struct ofpbuf *packet, const > struct nlattr *a) > } > } > > -static void > +static bool > dp_netdev_execute_actions(struct dp_netdev *dp, > struct ofpbuf *packet, struct flow *key, > const struct nlattr *actions, > - size_t actions_len) > + size_t actions_len, uint32_t *skb_mark) > { > const struct nlattr *a; > unsigned int left; > @@ -1305,18 +1337,23 @@ dp_netdev_execute_actions(struct dp_netdev *dp, > break; > > case OVS_ACTION_ATTR_SET: > - execute_set_action(packet, nl_attr_get(a)); > + execute_set_action(packet, nl_attr_get(a), skb_mark); > break; > > case OVS_ACTION_ATTR_SAMPLE: > dp_netdev_sample(dp, packet, key, a); > break; > > + case OVS_ACTION_ATTR_RECIRCULATE: > + return true; > + > case OVS_ACTION_ATTR_UNSPEC: > case __OVS_ACTION_ATTR_MAX: > NOT_REACHED(); > } > } > + > + return false; > } > > const struct dpif_class dpif_netdev_class = { > diff --git a/lib/flow.h b/lib/flow.h > index 6e169d6..66f89e3 100644 > --- a/lib/flow.h > +++ b/lib/flow.h > @@ -296,4 +296,7 @@ uint32_t minimask_hash(const struct minimask *, uint32_t > basis); > bool minimask_has_extra(const struct minimask *, const struct minimask *); > bool minimask_is_catchall(const struct minimask *); > > +#define MAX_RECIRCULATION_DEPTH 4 /* Completely arbitrary value to > + * guard against infinite loops */ > + > #endif /* flow.h */ > diff --git a/lib/odp-util.c b/lib/odp-util.c > index 1a3e386..8565269 100644 > --- a/lib/odp-util.c > +++ b/lib/odp-util.c > @@ -75,6 +75,7 @@ odp_action_len(uint16_t type) > case OVS_ACTION_ATTR_POP_VLAN: return 0; > case OVS_ACTION_ATTR_PUSH_MPLS: return sizeof(struct > ovs_action_push_mpls); > case OVS_ACTION_ATTR_POP_MPLS: return sizeof(ovs_be16); > + case OVS_ACTION_ATTR_RECIRCULATE: return 0; > case OVS_ACTION_ATTR_SET: return -2; > case OVS_ACTION_ATTR_SAMPLE: return -2; > > @@ -376,6 +377,10 @@ format_odp_action(struct ds *ds, const struct nlattr *a) > ds_put_format(ds, "pop_mpls(eth_type=0x%"PRIx16")", > ntohs(ethertype)); > break; > } > + case OVS_ACTION_ATTR_RECIRCULATE: { > + ds_put_format(ds, "recirculate"); > + break; > + } > case OVS_ACTION_ATTR_SAMPLE: > format_odp_sample_action(ds, a); > break; > @@ -2171,6 +2176,12 @@ commit_odp_tunnel_action(const struct flow *flow, > struct flow *base, > } > } > > +void > +commit_odp_recirculate_action(struct ofpbuf *odp_actions) > +{ > + nl_msg_put_flag(odp_actions, OVS_ACTION_ATTR_RECIRCULATE); > +} > + > static void > commit_set_ether_addr_action(const struct flow *flow, struct flow *base, > struct ofpbuf *odp_actions) > @@ -2384,14 +2395,14 @@ commit_set_skb_mark_action(const struct flow *flow, > struct flow *base, > return; > } > base->skb_mark = flow->skb_mark; > - > odp_put_skb_mark_action(base->skb_mark, odp_actions); > } > /* If any of the flow key data that ODP actions can modify are different in > * 'base' and 'flow', appends ODP actions to 'odp_actions' that change the > flow > * key from 'base' into 'flow', and then changes 'base' the same way. Does > not > * commit set_tunnel actions. Users should call commit_odp_tunnel_action() > - * in addition to this function if needed. */ > + * and commit_odp_recirculate_action() in addition to those functions are > + * needed. */ > void > commit_odp_actions(const struct flow *flow, struct flow *base, > struct ofpbuf *odp_actions) > diff --git a/lib/odp-util.h b/lib/odp-util.h > index ad0fb30..da62aa5 100644 > --- a/lib/odp-util.h > +++ b/lib/odp-util.h > @@ -115,6 +115,7 @@ const char *odp_key_fitness_to_string(enum > odp_key_fitness); > > void commit_odp_tunnel_action(const struct flow *, struct flow *base, > struct ofpbuf *odp_actions); > +void commit_odp_recirculate_action(struct ofpbuf *odp_actions); > void commit_odp_actions(const struct flow *, struct flow *base, > struct ofpbuf *odp_actions); > > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c > index 6e9023a..226eda6 100644 > --- a/ofproto/ofproto-dpif.c > +++ b/ofproto/ofproto-dpif.c > @@ -119,7 +119,8 @@ static struct rule_dpif *rule_dpif_miss_rule(struct > ofproto_dpif *ofproto, > > static void rule_credit_stats(struct rule_dpif *, > const struct dpif_flow_stats *); > -static void flow_push_stats(struct facet *, const struct dpif_flow_stats *); > +static void flow_push_stats(struct facet *, const struct dpif_flow_stats *, > + const struct ofpact *, size_t ofpacts_len); > static tag_type rule_calculate_tag(const struct flow *, > const struct minimask *, uint32_t basis); > static void rule_invalidate(const struct rule_dpif *); > @@ -276,6 +277,16 @@ struct action_xlate_ctx { > uint16_t nf_output_iface; /* Output interface index for NetFlow. */ > mirror_mask_t mirrors; /* Bitmap of associated mirrors. */ > > + size_t ofpacts_len; /* The number of bytes of the ofpacts > + * argument to xlate_actions() processed > + * by it. This is used to calculate an > + * offset into ofpacts for calls to > + * xlate_actions on recirculated packets */ > + > + uint32_t recirculation_id; /* skb_mark to use to identify recirculation. > + * Zero if the context does not add a > + * recirculate action. */ > + > /* xlate_actions() initializes and uses these members, but the client has no > * reason to look at them. */ > > @@ -312,7 +323,8 @@ static void action_xlate_ctx_init(struct action_xlate_ctx > *, > struct ofproto_dpif *, const struct flow *, > const struct initial_vals *initial_vals, > struct rule_dpif *, > - uint8_t tcp_flags, const struct ofpbuf *); > + uint8_t tcp_flags, const struct ofpbuf *, > + uint32_t recirculation_id); > static void xlate_actions(struct action_xlate_ctx *, > const struct ofpact *ofpacts, size_t ofpacts_len, > struct ofpbuf *odp_actions); > @@ -491,13 +503,38 @@ struct facet { > * always be valid, since it could have been removed after newer > * subfacets were pushed onto the 'subfacets' list.) */ > struct subfacet one_subfacet; > + > + const struct ofpact *ofpacts; /* ofpacts for this facet. > + * Will differ from rule->up.ofpacts > + * if facet is for a recirculated > packet. */ > + size_t ofpacts_len; /* ofpacts_len for this facet > + * Will differ from * > rule->up.ofpacts_len > + * if facet is for a recirculated > packet. */ > + > + uint32_t recirculation_id; /* Recirculation id. > + * Non-sero for a facet > + * that recirculates packets; > + * used as the value of flow.skb_mark > + * in the facet of recirculated packets. > + * Zero otherwise. */ > + const struct ofpact *recirculation_ofpacts; > + /* ofpacts for facets of packets > + * recirculated by this facet */ > + size_t recirculation_ofpacts_len; > + /* ofpacts_len for facets of packets > + * recirculated by this facet */ > + > + bool recirculated; /* Facet of a recirculated packet? */ > + > }; > > -static struct facet *facet_create(struct rule_dpif *, > - const struct flow *, uint32_t hash); > +static struct facet *facet_create(struct rule_dpif *, const struct flow *, > + const struct ofpact *, size_t ofpacts_len, > + bool recirculated, uint32_t hash); > static void facet_remove(struct facet *); > static void facet_free(struct facet *); > > +static struct facet *facet_find_by_id(struct ofproto_dpif *, ovs_be32 id); > static struct facet *facet_find(struct ofproto_dpif *, > const struct flow *, uint32_t hash); > static struct facet *facet_lookup_valid(struct ofproto_dpif *, > @@ -3434,6 +3471,15 @@ static bool > flow_miss_should_make_facet(struct ofproto_dpif *ofproto, > struct flow_miss *miss, uint32_t hash) > { > + /* A facet is currently required to handle recirculation. > + * There currently isn't a good way to detect if recirculation will > + * occur or not. So in the mean time assume that it can't occur > + * for non-MPLS packets and it may occur for MPLS packets > + */ > + if (eth_type_mpls(miss->flow.dl_type)) { > + return true; > + } > + > if (!ofproto->governor) { > size_t n_subfacets; > > @@ -3449,12 +3495,26 @@ flow_miss_should_make_facet(struct ofproto_dpif > *ofproto, > list_size(&miss->packets)); > } > > +/* XXX: This is just a stub */ > +static uint32_t get_recirculate_id(void) > +{ > + static uint32_t id = 0; > + /* Skip 0, it is reserved as a null value */ > + if (!id) > + id++; > + /* Skip IPSEC_MARK bit it is reserved */ > + if (id & IPSEC_MARK) > + id++; > + ovs_assert(!(id & IPSEC_MARK)); > + return id++; > +} > + > /* Handles 'miss', which matches 'rule', without creating a facet or subfacet > * or creating any datapath flow. May add an "execute" operation to 'ops' > and > * increment '*n_ops'. */ > static void > -handle_flow_miss_without_facet(struct flow_miss *miss, > - struct rule_dpif *rule, > +handle_flow_miss_without_facet(struct flow_miss *miss, struct rule_dpif > *rule, > + const struct ofpact *ofpacts, size_t > ofpacts_len, > struct flow_miss_op *ops, size_t *n_ops) > { > struct ofproto_dpif *ofproto = ofproto_dpif_cast(rule->up.ofproto); > @@ -3475,10 +3535,9 @@ handle_flow_miss_without_facet(struct flow_miss *miss, > rule_credit_stats(rule, &stats); > > action_xlate_ctx_init(&ctx, ofproto, &miss->flow, > - &miss->initial_vals, rule, 0, packet); > + &miss->initial_vals, rule, 0, packet, 0); > ctx.resubmit_stats = &stats; > - xlate_actions(&ctx, rule->up.ofpacts, rule->up.ofpacts_len, > - &odp_actions); > + xlate_actions(&ctx, ofpacts, ofpacts_len, &odp_actions); > > if (odp_actions.size) { > struct dpif_execute *execute = &op->dpif_op.u.execute; > @@ -3592,14 +3651,30 @@ handle_flow_miss(struct flow_miss *miss, struct > flow_miss_op *ops, > > facet = facet_lookup_valid(ofproto, &miss->flow, hash); > if (!facet) { > - struct rule_dpif *rule = rule_dpif_lookup(ofproto, &miss->flow); > + struct rule_dpif *rule; > + const struct ofpact *ofpacts; > + size_t ofpacts_len; > + struct facet *parent_facet; > + > + parent_facet = facet_find_by_id(ofproto, miss->flow.skb_mark); > + if (parent_facet) { > + rule = parent_facet->rule; > + ofpacts = parent_facet->recirculation_ofpacts; > + ofpacts_len = parent_facet->recirculation_ofpacts_len; > + } else { > + rule = rule_dpif_lookup(ofproto, &miss->flow); > + ofpacts = rule->up.ofpacts; > + ofpacts_len = rule->up.ofpacts_len; > + } > > if (!flow_miss_should_make_facet(ofproto, miss, hash)) { > - handle_flow_miss_without_facet(miss, rule, ops, n_ops); > + handle_flow_miss_without_facet(miss, rule, ofpacts, > + ofpacts_len, ops, n_ops); > return; > } > > - facet = facet_create(rule, &miss->flow, hash); > + facet = facet_create(rule, &miss->flow, ofpacts, ofpacts_len, > + parent_facet != NULL, hash); > now = facet->used; > } else { > now = time_msec(); > @@ -4345,6 +4420,28 @@ rule_expire(struct rule_dpif *rule) > > /* Facets. */ > > +/* XXX: This is just a stub */ > +static struct facet * > +facet_find_by_id(struct ofproto_dpif *ofproto, uint32_t id) > +{ > + struct facet *facet; > + > + /* id=0 is never used */ > + if (!id) > + return NULL; > + > + /* This is a ridiculous way to look things up, most likely the id > + * should be cooked somehow to allow a more efficient lookup. > + */ > + HMAP_FOR_EACH(facet, hmap_node, &ofproto->facets) { > + if (facet->recirculation_id == id) { > + return facet; > + } > + } > + > + return NULL; > +} > + > /* Creates and returns a new facet owned by 'rule', given a 'flow'. > * > * The caller must already have determined that no facet with an identical > @@ -4356,7 +4453,9 @@ rule_expire(struct rule_dpif *rule) > * The facet will initially have no subfacets. The caller should create (at > * least) one subfacet with subfacet_create(). */ > static struct facet * > -facet_create(struct rule_dpif *rule, const struct flow *flow, uint32_t hash) > +facet_create(struct rule_dpif *rule, const struct flow *flow, > + const struct ofpact *ofpacts, size_t ofpacts_len, > + bool recirculated, uint32_t hash) > { > struct ofproto_dpif *ofproto = ofproto_dpif_cast(rule->up.ofproto); > struct facet *facet; > @@ -4367,6 +4466,9 @@ facet_create(struct rule_dpif *rule, const struct flow > *flow, uint32_t hash) > list_push_back(&rule->facets, &facet->list_node); > facet->rule = rule; > facet->flow = *flow; > + facet->ofpacts = ofpacts; > + facet->ofpacts_len = ofpacts_len; > + facet->recirculated = recirculated; > list_init(&facet->subfacets); > netflow_flow_init(&facet->nf_flow); > netflow_flow_update_time(ofproto->netflow, &facet->nf_flow, facet->used); > @@ -4456,10 +4558,10 @@ facet_learn(struct facet *facet) > > action_xlate_ctx_init(&ctx, ofproto, &facet->flow, > &subfacet->initial_vals, > - facet->rule, facet->tcp_flags, NULL); > + facet->rule, facet->tcp_flags, NULL, > + facet->recirculation_id); > ctx.may_learn = true; > - xlate_actions_for_side_effects(&ctx, facet->rule->up.ofpacts, > - facet->rule->up.ofpacts_len); > + xlate_actions_for_side_effects(&ctx, facet->ofpacts, facet->ofpacts_len); > } > > static void > @@ -4682,6 +4784,12 @@ facet_check_consistency(struct facet *facet) > bool may_log = false; > bool ok; > > + if (facet->recirculated) { > + /* XXX: This is a hack and needs to be replaced with facet > + * validation for facets of recirculated packets */ > + return true; > + } > + > /* Check the rule for consistency. */ > rule = rule_dpif_lookup(ofproto, &facet->flow); > ok = rule == facet->rule; > @@ -4713,7 +4821,8 @@ facet_check_consistency(struct facet *facet) > struct ds s; > > action_xlate_ctx_init(&ctx, ofproto, &facet->flow, > - &subfacet->initial_vals, rule, 0, NULL); > + &subfacet->initial_vals, rule, 0, NULL, > + facet->recirculation_id); > xlate_actions(&ctx, rule->up.ofpacts, rule->up.ofpacts_len, > &odp_actions); > > @@ -4844,7 +4953,8 @@ facet_revalidate(struct facet *facet) > enum slow_path_reason slow; > > action_xlate_ctx_init(&ctx, ofproto, &facet->flow, > - &subfacet->initial_vals, new_rule, 0, NULL); > + &subfacet->initial_vals, new_rule, 0, NULL, > + facet->recirculation_id); > xlate_actions(&ctx, new_rule->up.ofpacts, new_rule->up.ofpacts_len, > &odp_actions); > > @@ -4942,11 +5052,13 @@ facet_push_stats(struct facet *facet) > stats.tcp_flags = 0; > > if (stats.n_packets || stats.n_bytes || facet->used > facet->prev_used) { > + > facet->prev_packet_count = facet->packet_count; > facet->prev_byte_count = facet->byte_count; > facet->prev_used = facet->used; > > - flow_push_stats(facet, &stats); > + flow_push_stats(facet, &stats, > + facet->ofpacts, facet->ofpacts_len); > > update_mirror_stats(ofproto_dpif_cast(facet->rule->up.ofproto), > facet->mirrors, stats.n_packets, stats.n_bytes); > @@ -4964,7 +5076,8 @@ rule_credit_stats(struct rule_dpif *rule, const struct > dpif_flow_stats *stats) > /* Pushes flow statistics to the rules which 'facet->flow' resubmits > * into given 'facet->rule''s actions and mirrors. */ > static void > -flow_push_stats(struct facet *facet, const struct dpif_flow_stats *stats) > +flow_push_stats(struct facet *facet, const struct dpif_flow_stats *stats, > + const struct ofpact *ofpacts, size_t ofpacts_len) > { > struct rule_dpif *rule = facet->rule; > struct ofproto_dpif *ofproto = ofproto_dpif_cast(rule->up.ofproto); > @@ -4974,10 +5087,11 @@ flow_push_stats(struct facet *facet, const struct > dpif_flow_stats *stats) > ofproto_rule_update_used(&rule->up, stats->used); > > action_xlate_ctx_init(&ctx, ofproto, &facet->flow, > - &subfacet->initial_vals, rule, 0, NULL); > + &subfacet->initial_vals, rule, 0, NULL, > + facet->recirculation_id); > ctx.resubmit_stats = stats; > - xlate_actions_for_side_effects(&ctx, rule->up.ofpacts, > - rule->up.ofpacts_len); > + > + xlate_actions_for_side_effects(&ctx, ofpacts, ofpacts_len); > } > > /* Subfacets. */ > @@ -5130,14 +5244,22 @@ subfacet_make_actions(struct subfacet *subfacet, > const struct ofpbuf *packet, > struct action_xlate_ctx ctx; > > action_xlate_ctx_init(&ctx, ofproto, &facet->flow, > - &subfacet->initial_vals, rule, 0, packet); > - xlate_actions(&ctx, rule->up.ofpacts, rule->up.ofpacts_len, odp_actions); > + &subfacet->initial_vals, rule, 0, packet, > + facet->recirculation_id); > + xlate_actions(&ctx, facet->ofpacts, facet->ofpacts_len, odp_actions); > facet->tags = ctx.tags; > facet->has_learn = ctx.has_learn; > facet->has_normal = ctx.has_normal; > facet->has_fin_timeout = ctx.has_fin_timeout; > facet->nf_flow.output_iface = ctx.nf_output_iface; > facet->mirrors = ctx.mirrors; > + if (ctx.recirculation_id) { > + facet->recirculation_id = ctx.recirculation_id; > + facet->recirculation_ofpacts = ofpact_end(rule->up.ofpacts, > + ctx.ofpacts_len); > + facet->recirculation_ofpacts_len = > + rule->up.ofpacts_len - ctx.ofpacts_len; > + } > > subfacet->slow = (subfacet->slow & SLOW_MATCH) | ctx.slow; > if (subfacet->actions_len != odp_actions->size > @@ -5461,7 +5583,7 @@ rule_dpif_execute(struct rule_dpif *rule, const struct > flow *flow, > initial_vals.tunnel_ip_tos = flow->tunnel.ip_tos; > ofpbuf_use_stub(&odp_actions, odp_actions_stub, sizeof odp_actions_stub); > action_xlate_ctx_init(&ctx, ofproto, flow, &initial_vals, > - rule, stats.tcp_flags, packet); > + rule, stats.tcp_flags, packet, 0); > ctx.resubmit_stats = &stats; > xlate_actions(&ctx, rule->up.ofpacts, rule->up.ofpacts_len, > &odp_actions); > > @@ -6143,6 +6265,15 @@ execute_dec_mpls_ttl_action(struct action_xlate_ctx > *ctx) > } > > static void > +execute_recircualte_action(struct action_xlate_ctx *ctx) > +{ > + if (!ctx->recirculation_id) { > + ctx->recirculation_id = get_recirculate_id(); > + } > + ctx->flow.skb_mark = ctx->recirculation_id; > +} > + > +static void > xlate_output_action(struct action_xlate_ctx *ctx, > uint16_t port, uint16_t max_len, bool may_packet_in) > { > @@ -6383,6 +6514,7 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t > ofpacts_len, > struct action_xlate_ctx *ctx) > { > bool was_evictable = true; > + bool may_recirculate = false; > const struct ofpact *a; > > if (ctx->rule) { > @@ -6451,18 +6583,30 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t > ofpacts_len, > > case OFPACT_SET_IPV4_SRC: > if (ctx->flow.dl_type == htons(ETH_TYPE_IP)) { > + if (may_recirculate) { > + execute_recircualte_action(ctx); > + goto out; > + } > ctx->flow.nw_src = ofpact_get_SET_IPV4_SRC(a)->ipv4; > } > break; > > case OFPACT_SET_IPV4_DST: > if (ctx->flow.dl_type == htons(ETH_TYPE_IP)) { > + if (may_recirculate) { > + execute_recircualte_action(ctx); > + goto out; > + } > ctx->flow.nw_dst = ofpact_get_SET_IPV4_DST(a)->ipv4; > } > break; > > case OFPACT_SET_IPV4_DSCP: > /* OpenFlow 1.0 only supports IPv4. */ > + if (may_recirculate) { > + execute_recircualte_action(ctx); > + goto out; > + } > if (ctx->flow.dl_type == htons(ETH_TYPE_IP)) { > ctx->flow.nw_tos &= ~IP_DSCP_MASK; > ctx->flow.nw_tos |= ofpact_get_SET_IPV4_DSCP(a)->dscp; > @@ -6471,12 +6615,20 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t > ofpacts_len, > > case OFPACT_SET_L4_SRC_PORT: > if (is_ip_any(&ctx->flow)) { > + if (may_recirculate) { > + execute_recircualte_action(ctx); > + goto out; > + } > ctx->flow.tp_src = > htons(ofpact_get_SET_L4_SRC_PORT(a)->port); > } > break; > > case OFPACT_SET_L4_DST_PORT: > if (is_ip_any(&ctx->flow)) { > + if (may_recirculate) { > + execute_recircualte_action(ctx); > + goto out; > + } > ctx->flow.tp_dst = > htons(ofpact_get_SET_L4_DST_PORT(a)->port); > } > break; > @@ -6517,10 +6669,15 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t > ofpacts_len, > > case OFPACT_PUSH_MPLS: > execute_mpls_push_action(ctx, > ofpact_get_PUSH_MPLS(a)->ethertype); > + may_recirculate = false; > break; > > case OFPACT_POP_MPLS: > execute_mpls_pop_action(ctx, ofpact_get_POP_MPLS(a)->ethertype); > + if (ctx->flow.dl_type == htons(ETH_TYPE_IP) || > + ctx->flow.dl_type == htons(ETH_TYPE_IPV6)) { > + may_recirculate = true; > + } > break; > > case OFPACT_SET_MPLS_TTL: > @@ -6536,7 +6693,10 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t > ofpacts_len, > break; > > case OFPACT_DEC_TTL: > - if (compose_dec_ttl(ctx, ofpact_get_DEC_TTL(a))) { > + if (may_recirculate) { > + execute_recircualte_action(ctx); > + goto out; > + } else if (compose_dec_ttl(ctx, ofpact_get_DEC_TTL(a))) { > goto out; > } > break; > @@ -6623,6 +6783,7 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t > ofpacts_len, > } > > out: > + ctx->ofpacts_len = (char *)(a) - (char *)ofpacts; > if (ctx->rule) { > ctx->rule->up.evictable = was_evictable; > } > @@ -6633,7 +6794,8 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx, > struct ofproto_dpif *ofproto, const struct flow *flow, > const struct initial_vals *initial_vals, > struct rule_dpif *rule, > - uint8_t tcp_flags, const struct ofpbuf *packet) > + uint8_t tcp_flags, const struct ofpbuf *packet, > + uint32_t recirculation_id) > { > ovs_be64 initial_tun_id = flow->tunnel.tun_id; > > @@ -6656,7 +6818,13 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx, > * registers. > * - Tunnel 'base_flow' is completely cleared since that is what the > * kernel does. If we wish to maintain the original values an action > - * needs to be generated. */ > + * needs to be generated. > + * - The recirculation_id element of flow and base flow are set to > + * recirculate_id, which is the id that will be used by a recirculation > + * action of one is added. It is stored in flow and base_flow for > + * convenience as the recirculation_id element of flow and base flow > + * are otherwise unused by action_xlate_ctx_init(). > + */ > > ctx->ofproto = ofproto; > ctx->flow = *flow; > @@ -6672,6 +6840,7 @@ action_xlate_ctx_init(struct action_xlate_ctx *ctx, > ctx->resubmit_hook = NULL; > ctx->report_hook = NULL; > ctx->resubmit_stats = NULL; > + ctx->recirculation_id = recirculation_id; > } > > /* Translates the 'ofpacts_len' bytes of "struct ofpacts" starting at > 'ofpacts' > @@ -6756,6 +6925,11 @@ xlate_actions(struct action_xlate_ctx *ctx, > > if (tunnel_ecn_ok(ctx) && (!in_port || may_receive(in_port, ctx))) { > do_xlate_actions(ofpacts, ofpacts_len, ctx); > + if (ctx->recirculation_id) { > + commit_odp_actions(&ctx->flow, &ctx->base_flow, > + ctx->odp_actions); > + commit_odp_recirculate_action(odp_actions); > + } > > /* We've let OFPP_NORMAL and the learning action look at the > * packet, so drop it now if forwarding is disabled. */ > @@ -7515,7 +7689,7 @@ packet_out(struct ofproto *ofproto_, struct ofpbuf > *packet, > initial_vals.vlan_tci = flow->vlan_tci; > initial_vals.tunnel_ip_tos = 0; > action_xlate_ctx_init(&ctx, ofproto, flow, &initial_vals, NULL, > - packet_get_tcp_flags(packet, flow), packet); > + packet_get_tcp_flags(packet, flow), packet, 0); > ctx.resubmit_stats = &stats; > > ofpbuf_use_stub(&odp_actions, > @@ -7900,7 +8074,7 @@ ofproto_trace(struct ofproto_dpif *ofproto, const > struct flow *flow, > ofpbuf_use_stub(&odp_actions, > odp_actions_stub, sizeof odp_actions_stub); > action_xlate_ctx_init(&trace.ctx, ofproto, flow, initial_vals, > - rule, tcp_flags, packet); > + rule, tcp_flags, packet, 0); > trace.ctx.resubmit_hook = trace_resubmit; > trace.ctx.report_hook = trace_report; > xlate_actions(&trace.ctx, rule->up.ofpacts, rule->up.ofpacts_len, > -- > 1.7.10.4 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev