On May 2, 2012, at 7:41 AM, Simon Horman wrote:
> On Wed, May 02, 2012 at 06:19:58PM +0900, Simon Horman wrote:
>> On Tue, May 01, 2012 at 04:50:49PM -0700, Jesse Gross wrote:
>>> On Mon, Apr 30, 2012 at 5:13 PM, Simon Horman <ho...@verge.net.au> wrote:
>>>> this is a first pass at providing a tun_key which can be used
>>>> as the basis for flow-based tunnelling. The tun_key includes and
>>>> replaces the tun_id in both struct ovs_skb_cb and struct sw_tun_key.
>>> 
>>> Thanks, this looks like a good start to me.
>> 
>> Thanks, its nice to get off to a good start :)
>> 
>>>> A significant problem with the current code, which causes it to fail to
>>>> compile, is that struct ovs_key_ipv6_tunnel is itself 48 bytes and that
>>>> as a result ovs_skb_cb is larger than 48 bytes. This means that it will
>>>> not fit inside skb->cb which is also 48 bytes.
>>> 
>>> I think it's possible to avoid this problem by just storing a pointer
>>> to the appropriate tunnel data since it exists in memory already.  On
>>> the output side, rather than copying the data from the flow you could
>>> just have a pointer into the flow data directly.  The flow is
>>> guaranteed to stick around for the lifetime of the skb due to RCU.  On
>>> the receive side, it could just be a pointer to some data on the stack
>>> that contains the appropriate struct filled with information.
>> 
>> I have attempted to implement that scheme in the revised patch below.
>> I have not actually testsed the patch (partly because it breaks tunnelling
>> in its current form) and I do suspect I may have missed a few things
>> leading to invalid accesses.
>> 
>>>> At this point it may be just as well just to drop the IPv6 portions of this
>>>> change, as far as I know OVS has not supported IPv6 as the outer transport
>>>> protocol for tunnelled frames. However it does seem to be a problem that
>>>> will arise at some point.
>>> 
>>> It's good to think about IPv6 but since OVS doesn't currently support
>>> tunnels over IPv6 I wouldn't worry about it much.  I think the overall
>>> change will be big enough that we don't want to add extra work if we
>>> can avoid it.
>> 
>> Agreed. I think it was a worthwhile exrecise to add the IPv6 code at
>> the time I did as it highlighed the skb-sb size problem. But I agree
>> it is adding extra work at this point. I have dropped all the IPv6 code
>> for now.
>> 
>>>> a pointer and set skb->destructor() to free it as needed. I am, however,
>>>> concerned about the complexity and performance penalty this may introduce.
>>>> Moreover, I'd like some review on the merit of stuffing all the tun_key
>>>> information into skb->cb.
>>> 
>>> I agree that allocating and freeing the metadata for each packet is
>>> something that we probably want to avoid.
>> 
>> I'm glad I didn't code-up my idea :)
>> 
>>>> The patch does make some effort to retain the existing tun_id behaviour.
>>>> I imagine this is required for compatibility reasons.
>>>> 
>>>> The patch makes no attempt to use tun_key other than compatibility
>>>> with the tun_id behaviour.
>>> 
>>> I actually wouldn't worry about kernel side compatibility here too
>>> much.  My general thought is that people should either use the OVS
>>> kernel module with the userspace of the same version or the upstream
>>> module with userspace >= 1.4.  Since none of the tunneling code is
>>> upstream yet we don't have to worry about compatibility and we'll have
>>> to break it anyways if we want to eventually get everything upstream
>>> and have it be clean.
>>> 
>>> The one place where we will need to maintain compatibility is from OVS
>>> userspace to the outside world.  However, I think that everything we
>>> currently do today can be implemented in terms of flow based tunneling
>>> in userspace.  Therefore, I would go ahead and start making the full
>>> changes around how tunnel lookup, for example, is done.
>> 
>> Understood. I have got as far as breaking the existing tun_id behaviour,
>> but not actually providing a replacement in the kernel. I also haven't got
>> as far as touching usespace in any meaningful way. But I agree that
>> userspace is a fine palce for the compatibility code.
>> 
>>>> diff --git a/datapath/flow.h b/datapath/flow.h
>>>> index 5261fa8..c39023a 100644
>>>> --- a/datapath/flow.h
>>>> +++ b/datapath/flow.h
>>>> +struct sw_tun_key {
>>>> +       __kernel_sa_family_t    tun_af;
>>>> +       union {
>>>> +               __be64                          tun_id;
>>>> +               struct ovs_key_ipv4_tunnel      tun_ipv4;
>>>> +               struct ovs_key_ipv6_tunnel      tun_ipv6;
>>>> +       };
>>>> +};
>>> 
>>> I think we probably want to pack this struct a little better at some
>>> point since tun_af actually only needs to be a single bit and will
>>> also have a fair amount of padding after it.
>> 
>> I agree there is some room for improvement. For now I have dropped this
>> structure alltogether in favour of using ovs_key_ipv4_tunnel directly as
>> the code currently only supports IPv4.
>> 
>> This turned out to simplify things a bit. But I wonder if the
>> code is no longer sufficiently generic.
>> 
>>>>  struct sw_flow_key {
>>>>        struct {
>>>> -               __be64  tun_id;         /* Encapsulating tunnel ID. */
>>>> -               u32     priority;       /* Packet QoS priority. */
>>>> -               u16     in_port;        /* Input switch port (or 
>>>> DP_MAX_PORTS). */
>>>> +               struct sw_tun_key       tun_key;        /* Encapsulating 
>>>> tunnel key. */
>>> 
>>> Eventually, at least, I'd like to move this to the end of the struct
>>> so that it doesn't affect performance for non-tunneled traffic.
>> 
>> Sure. Do you mean moving phy (which is what contains tun_key) to the end
>> of sw_flow_key. Or moving tun_key out of phy?
>> 
>>>> diff --git a/datapath/tunnel.c b/datapath/tunnel.c
>>>> index d406dbc..0305fbd 100644
>>>> --- a/datapath/tunnel.c
>>>> +++ b/datapath/tunnel.c
>>>> @@ -613,7 +613,8 @@ static void ipv6_build_icmp(struct sk_buff *skb, 
>>>> struct sk_buff *nskb,
>>>> 
>>>>  bool ovs_tnl_frag_needed(struct vport *vport,
>>>>                         const struct tnl_mutable_config *mutable,
>>>> -                        struct sk_buff *skb, unsigned int mtu, __be64 
>>>> flow_key)
>>>> +                        struct sk_buff *skb, unsigned int mtu,
>>>> +                        struct sw_tun_key *tun_key)
>>>>  {
>>>>        unsigned int eth_hdr_len = ETH_HLEN;
>>>>        unsigned int total_length = 0, header_length = 0, payload_length;
>>>> @@ -706,7 +707,7 @@ bool ovs_tnl_frag_needed(struct vport *vport,
>>>>         */
>>>>        if ((mutable->flags & (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION)) 
>>>> ==
>>>>            (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION))
>>>> -               OVS_CB(nskb)->tun_id = flow_key;
>>>> +               OVS_CB(nskb)->tun_key = *tun_key;
>>> 
>>> I think the paradigm of echoing back the tunnel information on the
>>> PMTUD packet breaks down here.  At least in simple cases (but even
>>> here not all) keys are often symmetric between receive and transmit so
>>> it's possible to use the same key.  However, once you start including
>>> addresses, it's obviously no longer symmetric.  In retrospect, I think
>>> this implementation of PMTUD was a mistake since it's a hack that
>>> works in many circumstances but not all.  Eventually, it would
>>> probably be better to replace it with an implementation of MSS
>>> clamping in userspace (which also has the advantage of being
>>> implementable completely in userspace).
>> 
>> Understood.
>> 
>> For now I have hacked up a solution that swaps the saddr and daddr in the
>> tun_key. However, I am unsure if that is ever valid. If not, perhaps
>> MSS clamping needs to be implemented before switching to flow-based
>> tunneling?
>> 
>>>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>>>> index 0578b5f..da32f7f 100644
>>>> --- a/include/linux/openvswitch.h
>>>> +++ b/include/linux/openvswitch.h
>>>> @@ -278,7 +278,9 @@ enum ovs_key_attr {
>>>>        OVS_KEY_ATTR_ICMPV6,    /* struct ovs_key_icmpv6 */
>>>>        OVS_KEY_ATTR_ARP,       /* struct ovs_key_arp */
>>>>        OVS_KEY_ATTR_ND,        /* struct ovs_key_nd */
>>>> -       OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */
>>>> +       OVS_KEY_ATTR_TUN_ID,    /* be64 tunnel ID */
>>>> +       OVS_KEY_ATTR_IPV4_TUNNEL,      /* struct ovs_key_ipv4_tunnel */
>>>> +       OVS_KEY_ATTR_IPV6_TUNNEL = 63, /* struct ovs_key_ipv6_tunnel */
>>> 
>>> The high valued enum value is just to separate the upstream types from
>>> non-upstream.  Since we plan to push this upstream, you can drop it
>>> (also shrink some data types).
>> 
>> I have droped the = 63, though I'm a little unsure if that is what you meant.
>> 
>>>> +struct ovs_key_ipv4_tunnel {
>>>> +       __be64 tun_id;
>>>> +       __be32 ipv4_src;
>>>> +       __be32 ipv4_dst;
>>>> +       __u8   ipv4_tos;
>>>> +       __u8   ipv4_ttl;
>>>> +       __u8   tun_proto;       /* One of TNL_T_PROTO_* */
>>>> +       __u8   reserved;
>>>> +};
>>>> +
>>>> +struct ovs_key_ipv6_tunnel {
>>>> +       __be64 tun_id;
>>>> +       __be32 ipv6_src[4];
>>>> +       __be32 ipv6_dst[4];
>>>> +       __be32 ipv6_label;      /* 20-bits in least-significant bits. */
>>>> +       __u8   ipv6_tclass;
>>>> +       __u8   ipv6_hlimit;
>>>> +       __u8   tun_proto;       /* One of TNL_T_PROTO_* */
>>>> +       __u8   reserved;
>>>> +};
>>> 
>>> I'm not sure that tun_proto is necessary here - it's implied by the
>>> port that is output to/received from.
>> 
>> Thanks. I have dropped both tun_proto and reserved. Is it necessary
>> to pad the structure to a multiple of 64 bytes (or any other value)?
>> 
>> 
>> Today is the last day before a four-day weekend in Japan, so I wanted
>> to get what I have out for you to see as I find your review particularly
>> valuable. I appologise that the patch is somewhat rough around the edges.
> 
> It seems that some last minute changes lead to some undefined symbols.
> 
> The fresh patch, bellow, should resolve this. Specifically, tun_key_init()
> is provided. I have also removed the tos value from ovs_tnl_rcv() as
> it may now be obtained from OVS_CB(skb)->tun_key.
> 
> [RFC v2] datapath: tunnelling: Replace tun_id with tun_key
> 
> this is a first pass at providing a tun_key which can be used
> as the basis for flow-based tunnelling. The tun_key includes and
> replaces the tun_id in both struct ovs_skb_cb and struct sw_tun_key.
> 
> In ovs_skb_cb tun_key is a pointer as it is envisaged that it will grow
> when support for IPv6 to an extent that inlining the structure will result
> in ovs_skb_cb being larger than the 48 bytes available in skb->cb.
> 
> As OVS does not support IPv6 as the outer transport protocol for tunnels
> the IPv6 portions of this change, which appeared in the previous revision,
> have been dropped in order to limit the scope and size of this patch.
> 
> This patch does not make any effort to retain the existing tun_id behaviour
> nor does it fully implement flow-based tunnels. As such it it is incomplete
> and can't be used in its current form (other than to break OVS tunneling).
> 

This looks good Simon, thanks for pulling pieces out of my larger change set 
and making it easier to apply this. This looks good to me!

Acked-by: Kyle Mestery <kmest...@cisco.com>

> ** Please do not apply **
> 
> Cc: Kyle Mestery <kmest...@cisco.com>
> Signed-off-by: Simon Horman <ho...@verge.net.au>
> 
> diff --git a/datapath/actions.c b/datapath/actions.c
> index 2903801..2a58a0a 100644
> --- a/datapath/actions.c
> +++ b/datapath/actions.c
> @@ -342,8 +342,8 @@ static int execute_set_action(struct sk_buff *skb,
>               skb->priority = nla_get_u32(nested_attr);
>               break;
> 
> -     case OVS_KEY_ATTR_TUN_ID:
> -             OVS_CB(skb)->tun_id = nla_get_be64(nested_attr);
> +     case OVS_KEY_ATTR_IPV4_TUNNEL:
> +             OVS_CB(skb)->tun_key = &OVS_CB(skb)->flow->key.phy.tun_key;
>               break;
> 
>       case OVS_KEY_ATTR_ETHERNET:
> @@ -469,7 +469,7 @@ int ovs_execute_actions(struct datapath *dp, struct 
> sk_buff *skb)
>               goto out_loop;
>       }
> 
> -     OVS_CB(skb)->tun_id = 0;
> +     OVS_CB(skb)->tun_key = NULL;
>       error = do_execute_actions(dp, skb, acts->actions,
>                                        acts->actions_len, false);
> 
> diff --git a/datapath/datapath.c b/datapath/datapath.c
> index 826dc89..6c6cf09 100644
> --- a/datapath/datapath.c
> +++ b/datapath/datapath.c
> @@ -776,7 +776,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, 
> struct genl_info *info)
> 
>       err = ovs_flow_metadata_from_nlattrs(&flow->key.phy.priority,
>                                            &flow->key.phy.in_port,
> -                                          &flow->key.phy.tun_id,
> +                                          &flow->key.phy.tun_key,
>                                            a[OVS_PACKET_ATTR_KEY]);
>       if (err)
>               goto err_flow_put;
> diff --git a/datapath/datapath.h b/datapath/datapath.h
> index 18c8598..8568283 100644
> --- a/datapath/datapath.h
> +++ b/datapath/datapath.h
> @@ -96,7 +96,7 @@ struct datapath {
> /**
>  * struct ovs_skb_cb - OVS data in skb CB
>  * @flow: The flow associated with this packet.  May be %NULL if no flow.
> - * @tun_id: ID of the tunnel that encapsulated this packet.  It is 0 if the
> + * @tun_key: Key for the tunnel that encapsulated this packet.
>  * @ip_summed: Consistently stores L4 checksumming status across different
>  * kernel versions.
>  * @csum_start: Stores the offset from which to start checksumming independent
> @@ -107,7 +107,7 @@ struct datapath {
>  */
> struct ovs_skb_cb {
>       struct sw_flow          *flow;
> -     __be64                  tun_id;
> +     struct ovs_key_ipv4_tunnel  *tun_key;
> #ifdef NEED_CSUM_NORMALIZE
>       enum csum_type          ip_summed;
>       u16                     csum_start;
> @@ -192,4 +192,5 @@ struct sk_buff *ovs_vport_cmd_build_info(struct vport *, 
> u32 pid, u32 seq,
>                                        u8 cmd);
> 
> int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb);
> +
> #endif /* datapath.h */
> diff --git a/datapath/flow.c b/datapath/flow.c
> index 9f93550..ad59003 100644
> --- a/datapath/flow.c
> +++ b/datapath/flow.c
> @@ -629,7 +629,8 @@ int ovs_flow_extract(struct sk_buff *skb, u16 in_port, 
> struct sw_flow_key *key,
>       memset(key, 0, sizeof(*key));
> 
>       key->phy.priority = skb->priority;
> -     key->phy.tun_id = OVS_CB(skb)->tun_id;
> +     if (OVS_CB(skb)->tun_key)
> +             key->phy.tun_key = *OVS_CB(skb)->tun_key;
>       key->phy.in_port = in_port;
> 
>       skb_reset_mac_header(skb);
> @@ -1022,9 +1023,11 @@ int ovs_flow_from_nlattrs(struct sw_flow_key *swkey, 
> int *key_lenp,
>               swkey->phy.in_port = DP_MAX_PORTS;
>       }
> 
> -     if (attrs & (1ULL << OVS_KEY_ATTR_TUN_ID)) {
> -             swkey->phy.tun_id = nla_get_be64(a[OVS_KEY_ATTR_TUN_ID]);
> -             attrs &= ~(1ULL << OVS_KEY_ATTR_TUN_ID);
> +     if (attrs & (1ULL << OVS_KEY_ATTR_IPV4_TUNNEL)) {
> +             struct ovs_key_ipv4_tunnel *tun_key;
> +             tun_key = nla_data(a[OVS_KEY_ATTR_IPV4_TUNNEL]);
> +             swkey->phy.tun_key = *tun_key;
> +             attrs &= ~(1ULL << OVS_KEY_ATTR_IPV4_TUNNEL);
>       }
> 
>       /* Data attributes. */
> @@ -1162,14 +1165,15 @@ int ovs_flow_from_nlattrs(struct sw_flow_key *swkey, 
> int *key_lenp,
>  * get the metadata, that is, the parts of the flow key that cannot be
>  * extracted from the packet itself.
>  */
> -int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, __be64 
> *tun_id,
> +int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port,
> +                                struct ovs_key_ipv4_tunnel *tun_key,
>                                  const struct nlattr *attr)
> {
>       const struct nlattr *nla;
>       int rem;
> 
>       *in_port = DP_MAX_PORTS;
> -     *tun_id = 0;
> +     tun_key->tun_id = 0;
>       *priority = 0;
> 
>       nla_for_each_nested(nla, attr, rem) {
> @@ -1184,8 +1188,9 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 
> *in_port, __be64 *tun_id,
>                               *priority = nla_get_u32(nla);
>                               break;
> 
> -                     case OVS_KEY_ATTR_TUN_ID:
> -                             *tun_id = nla_get_be64(nla);
> +                     case OVS_KEY_ATTR_IPV4_TUNNEL:
> +                             memcpy(tun_key, nla_data(nla),
> +                                    sizeof(*tun_key));
>                               break;
> 
>                       case OVS_KEY_ATTR_IN_PORT:
> @@ -1204,15 +1209,18 @@ int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 
> *in_port, __be64 *tun_id,
> int ovs_flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb)
> {
>       struct ovs_key_ethernet *eth_key;
> +     struct ovs_key_ipv4_tunnel *key;
>       struct nlattr *nla, *encap;
> 
>       if (swkey->phy.priority &&
>           nla_put_u32(skb, OVS_KEY_ATTR_PRIORITY, swkey->phy.priority))
>               goto nla_put_failure;
> 
> -     if (swkey->phy.tun_id != cpu_to_be64(0) &&
> -         nla_put_be64(skb, OVS_KEY_ATTR_TUN_ID, swkey->phy.tun_id))
> +     nla = nla_reserve(skb, OVS_KEY_ATTR_IPV4_TUNNEL, sizeof(*key));
> +     if (!nla)
>               goto nla_put_failure;
> +     key = nla_data(nla);
> +     *key = swkey->phy.tun_key;
> 
>       if (swkey->phy.in_port != DP_MAX_PORTS &&
>           nla_put_u32(skb, OVS_KEY_ATTR_IN_PORT, swkey->phy.in_port))
> diff --git a/datapath/flow.h b/datapath/flow.h
> index 5261fa8..f24237e 100644
> --- a/datapath/flow.h
> +++ b/datapath/flow.h
> @@ -42,7 +42,7 @@ struct sw_flow_actions {
> 
> struct sw_flow_key {
>       struct {
> -             __be64  tun_id;         /* Encapsulating tunnel ID. */
> +             struct ovs_key_ipv4_tunnel tun_key;  /* Encapsulating tunnel 
> key. */
>               u32     priority;       /* Packet QoS priority. */
>               u16     in_port;        /* Input switch port (or DP_MAX_PORTS). 
> */
>       } phy;
> @@ -165,7 +165,8 @@ u64 ovs_flow_used_time(unsigned long flow_jiffies);
> int ovs_flow_to_nlattrs(const struct sw_flow_key *, struct sk_buff *);
> int ovs_flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp,
>                     const struct nlattr *);
> -int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, __be64 
> *tun_id,
> +int ovs_flow_metadata_from_nlattrs(u32 *priority, u16 *in_port,
> +                                struct ovs_key_ipv4_tunnel *tun_key,
>                                  const struct nlattr *);
> 
> #define MAX_ACTIONS_BUFSIZE   (16 * 1024)
> @@ -204,4 +205,21 @@ u32 ovs_flow_hash(const struct sw_flow_key *key, int 
> key_len);
> struct sw_flow *ovs_flow_tbl_next(struct flow_table *table, u32 *bucket, u32 
> *idx);
> extern const int ovs_key_lens[OVS_KEY_ATTR_MAX + 1];
> 
> +static inline void tun_key_swap_addr(struct ovs_key_ipv4_tunnel *tun_key)
> +{
> +     __be32 ndst = tun_key->ipv4_src;
> +     tun_key->ipv4_src = tun_key->ipv4_dst;
> +     tun_key->ipv4_dst = ndst;
> +}
> +
> +static inline void tun_key_init(struct ovs_key_ipv4_tunnel *tun_key,
> +                             struct iphdr *iph, __be64 tun_id)
> +{
> +     tun_key->tun_id = tun_id;
> +     tun_key->ipv4_src = iph->saddr;
> +     tun_key->ipv4_dst = iph->daddr;
> +     tun_key->ipv4_tos = iph->tos;
> +     tun_key->ipv4_ttl = iph->ttl;
> +}
> +
> #endif /* flow.h */
> diff --git a/datapath/tunnel.c b/datapath/tunnel.c
> index d406dbc..18dedee 100644
> --- a/datapath/tunnel.c
> +++ b/datapath/tunnel.c
> @@ -367,9 +367,9 @@ struct vport *ovs_tnl_find_port(struct net *net, __be32 
> saddr, __be32 daddr,
>       return NULL;
> }
> 
> -static void ecn_decapsulate(struct sk_buff *skb, u8 tos)
> +static void ecn_decapsulate(struct sk_buff *skb)
> {
> -     if (unlikely(INET_ECN_is_ce(tos))) {
> +     if (unlikely(INET_ECN_is_ce(OVS_CB(skb)->tun_key->ipv4_tos))) {
>               __be16 protocol = skb->protocol;
> 
>               skb_set_network_header(skb, ETH_HLEN);
> @@ -416,7 +416,7 @@ static void ecn_decapsulate(struct sk_buff *skb, u8 tos)
>  * - skb->csum does not include the inner Ethernet header.
>  * - The layer pointers are undefined.
>  */
> -void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb, u8 tos)
> +void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb)
> {
>       struct ethhdr *eh;
> 
> @@ -433,7 +433,7 @@ void ovs_tnl_rcv(struct vport *vport, struct sk_buff 
> *skb, u8 tos)
>       skb_clear_rxhash(skb);
>       secpath_reset(skb);
> 
> -     ecn_decapsulate(skb, tos);
> +     ecn_decapsulate(skb);
>       vlan_set_tci(skb, 0);
> 
>       if (unlikely(compute_ip_summed(skb, false))) {
> @@ -613,12 +613,14 @@ static void ipv6_build_icmp(struct sk_buff *skb, struct 
> sk_buff *nskb,
> 
> bool ovs_tnl_frag_needed(struct vport *vport,
>                        const struct tnl_mutable_config *mutable,
> -                      struct sk_buff *skb, unsigned int mtu, __be64 flow_key)
> +                      struct sk_buff *skb, unsigned int mtu,
> +                      struct ovs_key_ipv4_tunnel *tun_key)
> {
>       unsigned int eth_hdr_len = ETH_HLEN;
>       unsigned int total_length = 0, header_length = 0, payload_length;
>       struct ethhdr *eh, *old_eh = eth_hdr(skb);
>       struct sk_buff *nskb;
> +     struct ovs_key_ipv4_tunnel ntun_key;
> 
>       /* Sanity check */
>       if (skb->protocol == htons(ETH_P_IP)) {
> @@ -705,8 +707,11 @@ bool ovs_tnl_frag_needed(struct vport *vport,
>        * any way of synthesizing packets.
>        */
>       if ((mutable->flags & (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION)) ==
> -         (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION))
> -             OVS_CB(nskb)->tun_id = flow_key;
> +         (TNL_F_IN_KEY_MATCH | TNL_F_OUT_KEY_ACTION)) {
> +             ntun_key = *tun_key;
> +             ipv4_tunnel_key_swap_addr(&ntun_key);
> +             OVS_CB(nskb)->tun_key = &ntun_key;
> +     }
> 
>       if (unlikely(compute_ip_summed(nskb, false))) {
>               kfree_skb(nskb);
> @@ -761,7 +766,7 @@ static bool check_mtu(struct sk_buff *skb,
> 
>                       if (packet_length > mtu &&
>                           ovs_tnl_frag_needed(vport, mutable, skb, mtu,
> -                                             OVS_CB(skb)->tun_id))
> +                                             OVS_CB(skb)->tun_key))
>                               return false;
>               }
>       }
> @@ -778,7 +783,7 @@ static bool check_mtu(struct sk_buff *skb,
> 
>                       if (packet_length > mtu &&
>                           ovs_tnl_frag_needed(vport, mutable, skb, mtu,
> -                                             OVS_CB(skb)->tun_id))
> +                                             OVS_CB(skb)->tun_key))
>                               return false;
>               }
>       }
> diff --git a/datapath/tunnel.h b/datapath/tunnel.h
> index 33eb63c..6087d05 100644
> --- a/datapath/tunnel.h
> +++ b/datapath/tunnel.h
> @@ -269,14 +269,15 @@ int ovs_tnl_set_addr(struct vport *vport, const 
> unsigned char *addr);
> const char *ovs_tnl_get_name(const struct vport *vport);
> const unsigned char *ovs_tnl_get_addr(const struct vport *vport);
> int ovs_tnl_send(struct vport *vport, struct sk_buff *skb);
> -void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb, u8 tos);
> +void ovs_tnl_rcv(struct vport *vport, struct sk_buff *skb);
> 
> struct vport *ovs_tnl_find_port(struct net *net, __be32 saddr, __be32 daddr,
>                               __be64 key, int tunnel_type,
>                               const struct tnl_mutable_config **mutable);
> bool ovs_tnl_frag_needed(struct vport *vport,
>                        const struct tnl_mutable_config *mutable,
> -                      struct sk_buff *skb, unsigned int mtu, __be64 
> flow_key);
> +                      struct sk_buff *skb, unsigned int mtu,
> +                      struct ovs_key_ipv4_tunnel *tun_key);
> void ovs_tnl_free_linked_skbs(struct sk_buff *skb);
> 
> int ovs_tnl_init(void);
> diff --git a/datapath/vport-capwap.c b/datapath/vport-capwap.c
> index e5b7afb..c5399b1 100644
> --- a/datapath/vport-capwap.c
> +++ b/datapath/vport-capwap.c
> @@ -220,7 +220,7 @@ static struct sk_buff *capwap_update_header(const struct 
> vport *vport,
>               struct capwaphdr_wsi *wsi = (struct capwaphdr_wsi *)(cwh + 1);
>               struct capwaphdr_wsi_key *opt = (struct capwaphdr_wsi_key 
> *)(wsi + 1);
> 
> -             opt->key = OVS_CB(skb)->tun_id;
> +             opt->key = OVS_CB(skb)->tun_key->tun_id;
>       }
> 
>       udph->len = htons(skb->len - skb_transport_offset(skb));
> @@ -316,6 +316,7 @@ static int capwap_rcv(struct sock *sk, struct sk_buff 
> *skb)
>       struct vport *vport;
>       const struct tnl_mutable_config *mutable;
>       struct iphdr *iph;
> +     struct ovs_key_ipv4_tunnel tun_key;
>       __be64 key = 0;
> 
>       if (unlikely(!pskb_may_pull(skb, CAPWAP_MIN_HLEN + ETH_HLEN)))
> @@ -333,12 +334,11 @@ static int capwap_rcv(struct sock *sk, struct sk_buff 
> *skb)
>               goto error;
>       }
> 
> -     if (mutable->flags & TNL_F_IN_KEY_MATCH)
> -             OVS_CB(skb)->tun_id = key;
> -     else
> -             OVS_CB(skb)->tun_id = 0;
> +     tun_key_init(&tun_key, iph,
> +                  mutable->flags & TNL_F_IN_KEY_MATCH ? key : 0);
> +     OVS_CB(skb)->tun_key = &tun_key;
> 
> -     ovs_tnl_rcv(vport, skb, iph->tos);
> +     ovs_tnl_rcv(vport, skb);
>       goto out;
> 
> error:
> diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
> index 3bb55f0..d0b1b70 100644
> --- a/datapath/vport-gre.c
> +++ b/datapath/vport-gre.c
> @@ -101,10 +101,6 @@ static struct sk_buff *gre_update_header(const struct 
> vport *vport,
>       __be32 *options = (__be32 *)(skb_network_header(skb) + 
> mutable->tunnel_hlen
>                                              - GRE_HEADER_SECTION);
> 
> -     /* Work backwards over the options so the checksum is last. */
> -     if (mutable->flags & TNL_F_OUT_KEY_ACTION)
> -             *options = be64_get_low32(OVS_CB(skb)->tun_id);
> -
>       if (mutable->out_key || mutable->flags & TNL_F_OUT_KEY_ACTION)
>               options--;
> 
> @@ -285,7 +281,11 @@ static void gre_err(struct sk_buff *skb, u32 info)
> #endif
> 
>       __skb_pull(skb, tunnel_hdr_len);
> -     ovs_tnl_frag_needed(vport, mutable, skb, mtu, key);
> +     {
> +             struct ovs_key_ipv4_tunnel tun_key;
> +             ipv4_tunnel_key_init(&tun_key, iph, key);
> +             ovs_tnl_frag_needed(vport, mutable, skb, mtu, &tun_key);
> +     }
>       __skb_push(skb, tunnel_hdr_len);
> 
> out:
> @@ -327,6 +327,7 @@ static int gre_rcv(struct sk_buff *skb)
>       const struct tnl_mutable_config *mutable;
>       int hdr_len;
>       struct iphdr *iph;
> +     struct ovs_key_ipv4_tunnel tun_key;
>       __be16 flags;
>       __be64 key;
> 
> @@ -351,15 +352,15 @@ static int gre_rcv(struct sk_buff *skb)
>               goto error;
>       }
> 
> -     if (mutable->flags & TNL_F_IN_KEY_MATCH)
> -             OVS_CB(skb)->tun_id = key;
> -     else
> -             OVS_CB(skb)->tun_id = 0;
> +
> +     ipv4_tunnel_key_init(&tun_key, iph,
> +                          mutable->flags & TNL_F_IN_KEY_MATCH ? key : 0);
> +     OVS_CB(skb)->tun_key = &tun_key;
> 
>       __skb_pull(skb, hdr_len);
>       skb_postpull_rcsum(skb, skb_transport_header(skb), hdr_len + ETH_HLEN);
> 
> -     ovs_tnl_rcv(vport, skb, iph->tos);
> +     ovs_tnl_rcv(vport, skb);
>       return 0;
> 
> error:
> diff --git a/datapath/vport.c b/datapath/vport.c
> index b75a866..20c68c5 100644
> --- a/datapath/vport.c
> +++ b/datapath/vport.c
> @@ -462,7 +462,7 @@ void ovs_vport_receive(struct vport *vport, struct 
> sk_buff *skb)
>               OVS_CB(skb)->flow = NULL;
> 
>       if (!(vport->ops->flags & VPORT_F_TUN_ID))
> -             OVS_CB(skb)->tun_id = 0;
> +             OVS_CB(skb)->tun_key = NULL;
> 
>       ovs_dp_process_received_packet(vport, skb);
> }
> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> index 0578b5f..4485c2f 100644
> --- a/include/linux/openvswitch.h
> +++ b/include/linux/openvswitch.h
> @@ -278,7 +278,8 @@ enum ovs_key_attr {
>       OVS_KEY_ATTR_ICMPV6,    /* struct ovs_key_icmpv6 */
>       OVS_KEY_ATTR_ARP,       /* struct ovs_key_arp */
>       OVS_KEY_ATTR_ND,        /* struct ovs_key_nd */
> -     OVS_KEY_ATTR_TUN_ID = 63, /* be64 tunnel ID */
> +     OVS_KEY_ATTR_TUN_ID,    /* be64 tunnel ID */
> +     OVS_KEY_ATTR_IPV4_TUNNEL,  /* struct ovs_key_ipv4_tunnel */
>       __OVS_KEY_ATTR_MAX
> };
> 
> @@ -360,6 +361,14 @@ struct ovs_key_nd {
>       __u8  nd_tll[6];
> };
> 
> +struct ovs_key_ipv4_tunnel {
> +     __be64 tun_id;
> +     __be32 ipv4_src;
> +     __be32 ipv4_dst;
> +     __u8   ipv4_tos;
> +     __u8   ipv4_ttl;
> +};
> +
> /**
>  * enum ovs_flow_attr - attributes for %OVS_FLOW_* commands.
>  * @OVS_FLOW_ATTR_KEY: Nested %OVS_KEY_ATTR_* attributes specifying the flow
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index a33fe23..ea5bf17 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -1154,6 +1154,7 @@ execute_set_action(struct ofpbuf *packet, const struct 
> nlattr *a)
>     case OVS_KEY_ATTR_TUN_ID:
>     case OVS_KEY_ATTR_PRIORITY:
>     case OVS_KEY_ATTR_IPV6:
> +    case OVS_KEY_ATTR_IPV4_TUNNEL:
>         /* not implemented */
>         break;
> 
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index 8fa3359..65aa938 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> @@ -107,6 +107,7 @@ ovs_key_attr_to_string(enum ovs_key_attr attr)
>     case OVS_KEY_ATTR_ARP: return "arp";
>     case OVS_KEY_ATTR_ND: return "nd";
>     case OVS_KEY_ATTR_TUN_ID: return "tun_id";
> +    case OVS_KEY_ATTR_IPV4_TUNNEL: return "ipv4_tunnel";
> 
>     case __OVS_KEY_ATTR_MAX:
>     default:
> @@ -523,6 +524,7 @@ odp_flow_key_attr_len(uint16_t type)
>     case OVS_KEY_ATTR_ICMPV6: return sizeof(struct ovs_key_icmpv6);
>     case OVS_KEY_ATTR_ARP: return sizeof(struct ovs_key_arp);
>     case OVS_KEY_ATTR_ND: return sizeof(struct ovs_key_nd);
> +    case OVS_KEY_ATTR_IPV4_TUNNEL: return sizeof(struct ovs_key_ipv4_tunnel);
> 
>     case OVS_KEY_ATTR_UNSPEC:
>     case __OVS_KEY_ATTR_MAX:
> @@ -577,6 +579,7 @@ format_odp_key_attr(const struct nlattr *a, struct ds *ds)
>     const struct ovs_key_icmpv6 *icmpv6_key;
>     const struct ovs_key_arp *arp_key;
>     const struct ovs_key_nd *nd_key;
> +    const struct ovs_key_ipv4_tunnel *ipv4_tun_key;
>     enum ovs_key_attr attr = nl_attr_type(a);
>     int expected_len;
> 
> @@ -607,6 +610,15 @@ format_odp_key_attr(const struct nlattr *a, struct ds 
> *ds)
>         ds_put_format(ds, "(%#"PRIx64")", ntohll(nl_attr_get_be64(a)));
>         break;
> 
> +    case OVS_KEY_ATTR_IPV4_TUNNEL:
> +        ipv4_tun_key = nl_attr_get(a);
> +        ds_put_format(ds, "(src="IP_FMT",dst="IP_FMT
> +                      ",tos=%#"PRIx8",ttl=%"PRIu8")",
> +                      IP_ARGS(&ipv4_tun_key->ipv4_src),
> +                      IP_ARGS(&ipv4_tun_key->ipv4_dst),
> +                      ipv4_tun_key->ipv4_tos, ipv4_tun_key->ipv4_ttl);
> +        break;
> +
>     case OVS_KEY_ATTR_IN_PORT:
>         ds_put_format(ds, "(%"PRIu32")", nl_attr_get_u32(a));
>         break;
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev


_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to