On Tue, Jun 11, 2013 at 2:26 PM, Jesse Gross <je...@nicira.com> wrote: > On Mon, Jun 10, 2013 at 1:07 AM, Simon Horman <ho...@verge.net.au> wrote: >> diff --git a/datapath/actions.c b/datapath/actions.c >> index 0dac658..197811a 100644 >> --- a/datapath/actions.c >> +++ b/datapath/actions.c >> +static int push_mpls(struct sk_buff *skb, >> + const struct ovs_action_push_mpls *mpls) >> +{ >> + __be32 *new_mpls_lse; >> + int err; >> + >> + if (skb_cow_head(skb, MPLS_HLEN) < 0) >> + return -ENOMEM; >> + >> + err = make_writable(skb, skb->mac_len); >> + if (unlikely(err)) >> + return err; >> + >> + skb_push(skb, MPLS_HLEN); >> + memmove(skb_mac_header(skb) - MPLS_HLEN, skb_mac_header(skb), >> + skb->mac_len); >> + skb_reset_mac_header(skb); >> + >> + new_mpls_lse = (__be32 *)mac_header_end(skb); >> + *new_mpls_lse = mpls->mpls_lse; >> + >> + if (get_ip_summed(skb) == OVS_CSUM_COMPLETE) >> + skb->csum = csum_add(skb->csum, csum_partial(new_mpls_lse, >> + MPLS_HLEN, 0)); >> + >> + /* If the skb was non-MPLS then record the original protocol >> + * in skb->inner_protocol. This is to allow GSO to perform >> + * segmentation of the inner-packet. >> + */ >> + if (!eth_p_mpls(skb->protocol)) >> + skb_set_inner_protocol(skb, skb->protocol); >> + >> + set_ethertype(skb, mpls->mpls_ethertype); >> + if (!vlan_tx_tag_present(skb) && skb->protocol != >> htons(ETH_P_8021Q)) { >> + skb->protocol = mpls->mpls_ethertype; >> + } > > I don't think you need to check for vlan_tx_tag_present here - when > doing vlan offloading, skb->protocol reflects the packet as if there > was no vlan tag there at all. Also, you shouldn't need the braces > there. > > When looking at this, I had an additional thought: different versions > of OpenFlow define the tag order differently and there is some limited > flexibility in how they get applied. Obviously, we're assuming that > vlans are applied before MPLS here. However, I'm not sure that is > actually required and may even simplify things (for example, your MPLS > GSO code doesn't do anything special with vlans and it should continue > to just work either way). I'm hesitant to suggest changes at this > point but it might actually make things cleaner and work with the > general model that we already have. > >> diff --git a/datapath/datapath.c b/datapath/datapath.c >> index 42af315..7e0e7a1 100644 >> --- a/datapath/datapath.c >> +++ b/datapath/datapath.c >> @@ -551,18 +551,132 @@ static inline void add_nested_action_end(struct >> sw_flow_actions *sfa, int st_off >> a->nla_len = sfa->actions_len - st_offset; >> } >> >> -static int validate_and_copy_actions(const struct nlattr *attr, >> +#define MAX_ETH_TYPES 16 /* Arbitrary Limit */ >> + >> +/* struct eth_types - possible eth types >> + * @types: provides storage for the possible eth types. >> + * @start: is the index of the first entry of types which is possible. >> + * @end: is the index of the last entry of types which is possible. >> + * @cursor: is the index of the entry which should be updated if an action >> + * changes the eth type. >> + * >> + * Due to the sample action there may be multiple possible eth types. >> + * In order to correctly validate actions all possible types are tracked >> + * and verified. This is done using struct eth_types. >> + * >> + * Initially start, end and cursor should be 0, and the first element of >> + * types should be set to the eth type of the flow. >> + * >> + * When an action changes the eth type then the values of start and end are >> + * updated to the value of cursor. The new type is stored at types[cursor]. >> + * >> + * When entering a sample action the start and cursor values are saved. The >> + * value of cursor is set to the value of end plus one. >> + * >> + * When leaving a sample action the start and cursor values are restored to >> + * their saved values. >> + * >> + * An example follows. >> + * >> + * actions: pop_mpls(A),sample(pop_mpls(B)),sample(pop_mpls(C)),pop_mpls(D) >> + * >> + * 0. Initial state: >> + * types = { original_eth_type } >> + * start = end = cursor = 0; >> + * >> + * 1. pop_mpls(A) >> + * a. Check types from start (0) to end (0) inclusive >> + * i.e. Check against original_eth_type >> + * b. Set start = end = cursor >> + * c. Set types[cursor] = A >> + * New state: >> + * types = { A } >> + * start = end = cursor = 0; >> + * >> + * 2. Enter first sample() >> + * a. Save start and cursor >> + * b. Set cursor = end + 1 >> + * New state: >> + * types = { A } >> + * start = end = 0; >> + * cursor = 1; >> + * >> + * 3. pop_mpls(B) >> + * a. Check types from start (0) to end (0) >> + * i.e: Check against A >> + * b. Set start = end = cursor >> + * c. Set types[cursor] = B >> + * New state: >> + * types = { A, B } >> + * start = end = cursor = 1; >> + * >> + * 4. Leave first sample() >> + * a. Restore start and cursor to the values when entering 2. >> + * New state: >> + * types = { A, B } >> + * start = cursor = 0; >> + * end = 1; >> + * >> + * 5. Enter second sample() >> + * a. Save start and cursor >> + * b. Set cursor = end + 1 >> + * New state: >> + * types = { A, B } >> + * start = 0; >> + * end = 1; >> + * cursor = 2; >> + * >> + * 6. pop_mpls(C) >> + * a. Check types from start (0) to end (1) inclusive >> + * i.e: Check against A and B >> + * b. Set start = end = cursor >> + * c. Set types[cursor] = C >> + * New state: >> + * types = { A, B, C } >> + * start = end = cursor = 2; >> + * >> + * 7. Leave second sample() >> + * a. Restore start and cursor to the values when entering 5. >> + * New state: >> + * types = { A, B, C } >> + * start = cursor = 0; >> + * end = 2; >> + * >> + * 8. pop_mpls(D) >> + * a. Check types from start (0) to end (2) inclusive >> + * i.e: Check against A, B and C >> + * b. Set start = end = cursor >> + * c. Set types[cursor] = D >> + * New state: >> + * types = { D } // Trailing entries of type are no longer used end = 0 >> + * start = end = cursor = 0; >> + */ > > The algorithm looks good to me now, thanks for working through it and > for documenting it. > >> +static int validate_and_copy_actions__(const struct nlattr *attr, >> + const struct sw_flow_key *key, int depth, >> + struct sw_flow_actions **sfa, >> + struct eth_types *eth_types) > [...] >> + case OVS_ACTION_ATTR_POP_MPLS: { >> + size_t i; > > size_t seems a little bit odd to me here, maybe just use an int? > >> + for (i = eth_types->start; i <= eth_types->end; i++) >> + if (!eth_p_mpls(eth_types->types[i])) >> + return -EINVAL; >> + >> + /* Disallow subsequent L2.5+ set and mpls_pop actions >> + * as there is no check here to ensure that the new >> + * eth_type is valid and thus set actions could >> + * write off the end of the packet or otherwise >> + * corrupt it. >> + * >> + * Support for these actions that after mpls_pop >> + * using packet recirculation is planned. >> + * are planned to be supported using using packet >> + * recirculation. > > I think there is an editing error in the above comment. > >> diff --git a/datapath/datapath.h b/datapath/datapath.h >> index ad59a3a..e47493a 100644 >> --- a/datapath/datapath.h >> +++ b/datapath/datapath.h >> +#ifdef HAVE_INNER_PROTOCOL >> +static inline void skb_set_inner_protocol(struct sk_buff *skb, __be16 >> ethertype) >> +{ >> + skb->inner_protocol = ethertype; >> +} >> + >> +static inline __be16 skb_get_inner_protocol(struct sk_buff *skb) >> +{ >> + return skb->inner_protocol; >> +} >> + >> +static inline bool kernel_supports_mpls_gso(void) >> +{ >> + return true; >> +} >> +#else >> +static inline void skb_set_inner_protocol(struct sk_buff *skb, __be16 >> ethertype) { >> + OVS_CB(skb)->inner_protocol = ethertype; >> +} >> + >> +static inline __be16 skb_get_inner_protocol(struct sk_buff *skb) >> +{ >> + return OVS_CB(skb)->inner_protocol; >> +} >> + >> +static inline bool kernel_supports_mpls_gso(void) >> +{ >> + return false; >> +} >> +#endif > > Can we move these functions compat.h, prefix them with ovs_, and make > it clear that they are part of OVS and not skbuff.h? > >> diff --git a/datapath/tunnel.c b/datapath/tunnel.c >> index 8c93e18..2d870aa 100644 >> --- a/datapath/tunnel.c >> +++ b/datapath/tunnel.c >> @@ -150,6 +150,16 @@ static struct sk_buff *handle_offloads(struct sk_buff >> *skb) >> >> if (skb_is_gso(skb)) { >> struct sk_buff *nskb; >> + __be16 mpls_protocol = htons(0); >> + >> + /* Swap the protocol so we can reuse the existing >> + * skb_gso_segment() function to handle L3 GSO. We will >> + * restore this later. */ >> + if (!kernel_supports_mpls_gso() && eth_p_mpls(skb->protocol) >> && >> + !eth_p_mpls(skb_get_inner_protocol(skb))) { >> + mpls_protocol = skb->protocol; >> + skb->protocol = skb_get_inner_protocol(skb); >> + } >> >> nskb = __skb_gso_segment(skb, 0, false); >> if (IS_ERR(nskb)) { >> @@ -159,6 +169,13 @@ static struct sk_buff *handle_offloads(struct sk_buff >> *skb) >> >> consume_skb(skb); >> skb = nskb; >> + >> + if (mpls_protocol != htons(0)) { >> + do { >> + nskb->protocol = mpls_protocol; >> + nskb = nskb->next; >> + } while (nskb); >> + } >> } else if (get_ip_summed(skb) == OVS_CSUM_PARTIAL) { >> /* Pages aren't locked and could change at any time. >> * If this happens after we compute the checksum, the > > Can we push this completely into the skb_gso_segment() compatibility > code? It's both nicer and may make the interactions with the vlan code > less confusing. > >> diff --git a/datapath/vport-netdev.c b/datapath/vport-netdev.c >> index 4e7342c..db3284f 100644 >> --- a/datapath/vport-netdev.c >> +++ b/datapath/vport-netdev.c >> @@ -309,8 +310,31 @@ static int netdev_send(struct vport *vport, struct >> sk_buff *skb) >> skb->dev = netdev_vport->dev; >> forward_ip_summed(skb, true); >> >> + vlan = mpls = false; >> + >> + if (!kernel_supports_mpls_gso() && eth_p_mpls(skb->protocol) && >> + !eth_p_mpls(skb_get_inner_protocol(skb))) >> + mpls = true; >> + >> + /* If we are segmenting a VLAN packet, then we don't need to handle >> + * MPLS segmentation, because MPLS is part of the extended L2 header >> + * and the kernel already knows how to handle this. */ >> if (vlan_tx_tag_present(skb) && !dev_supports_vlan_tx(skb->dev)) { >> - int features; >> + mpls = false; >> + vlan = true; >> + } > > Don't we still need to save and restore the MPLS skb->protocol field > even if we are also processing vlans? > >> @@ -341,10 +376,14 @@ static int netdev_send(struct vport *vport, struct >> sk_buff *skb) >> nskb = skb->next; >> skb->next = NULL; >> >> - skb = __vlan_put_tag(skb, >> vlan_tx_tag_get(skb)); >> + if (vlan) >> + skb = __vlan_put_tag(skb, >> vlan_tx_tag_get(skb)); >> if (likely(skb)) { >> len += skb->len; >> - vlan_set_tci(skb, 0); >> + if (mpls) >> + skb->protocol = >> mpls_protocol; >> + if (vlan) >> + vlan_set_tci(skb, 0); >> dev_queue_xmit(skb); >> } > > The order of setting the protocol field is different from the one > below and I think this one isn't right because it means that the MPLS > protocol can override the vlan. > > Pravin, can you take a look at this as well since you are working in > the same area?
I think this compatibility code can be moved to new compat code posted on gre-restructuring code few changes. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev