On 8/25/14 3:33 AM, Jesse Gross wrote:
On Thu, Aug 21, 2014 at 12:24 PM, Lori Jakab <loja...@cisco.com> wrote:
On 8/6/14 4:37 AM, Jesse Gross wrote:
Besides the fact that it would be nice to unify things, I'm not sure
that it is actually correct to pull off VLAN, MPLS, etc. tags that we
don't understand. Presumably, these are being used to create some kind
of isolation and dropping all the tags that we don't understand would
just blindly merge things together. Also, from an implementation
perspective, we don't really parse beyond what we understand (this is
not quite true for MPLS but it is for VLANs and any unknown protocols)
so we wouldn't necessarily actually get to an L3 header. On the other
hand, if we only deal with tags that we do understand then can't we
have more atomic instructions that pull them off one-by-one?

Atomic instructions make a lot of sense, and we definitely should support
them.  However, if a packet is sent to an L3 port, it should not have any
Ethernet, VLAN or MPLS header. Note that for now these actions are not user
visible, and cannot be added with OpenFlow, they are added automatically by
the user space code, when a packet is sent from an L2->L3 port or L3->L2
port.

I agreed with Ben that this behavior will be changed in the future, based on
discussions in the ONF's EXT-112 proposal, where the consensus was to have
explicit push_eth and pop_eth OpenFlow actions.  However, the current
behavior of the LISP code is to add/remove L2 headers automatically, so we
want to keep that in the first phase, until I implement OpenFlow support.
Once automatic header popping is not used, we can remove the
"extract_l3_packet()" action.

 From the implementation perspective, this action would rely on the
skb->network_header being set correctly.
I'm not sure that this really addresses the concerns that I have from
both policy and implementation perspectives.

 From the policy perspective: Will a Cisco implementation of LISP
accept a packet on an unknown VLAN, throw away the tag when it does L2
processing, and then continue on to do L3 processing and LISP? Somehow
I doubt it.

 From an implementation perspective: There is no mechanism to rely on
skb->network_header being set in the general case. It will only be set
in situations where OVS can parse the packet. OVS doesn't know about
QinQ for example so you the action that you have defined won't
actually give you an L3 packet.

Finally, from a Linux perspective, removing kernel interfaces isn't
something that is OK as a matter of policy.

The fact that the OVS kernel module doesn't know how to inherently
find the L3 is not as bad as it seems. Conveniently, it means that we
can pop off exactly the same headers as we know how to parse so there
is no loss of functionality to doing this individually. As bonus,
since userspace needs to generate these actions, the policy can be
enforced there as well.

All compelling arguments, so I agree, let's not add an "extract_l3_packet()" action and have pop_eth() only remove an Ethernet header. VLAN tags, if any, should be removed by a previous pop_vlan() action before calling pop_eth(). Does this implementation look correct to you?

--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -421,9 +421,9 @@ static int set_eth_addr(struct sk_buff *skb,

 static int pop_eth(struct sk_buff *skb)
 {
-       skb_pull_rcsum(skb, skb_network_offset(skb));
+       skb_pull_rcsum(skb, ETH_HLEN);
        skb_reset_mac_header(skb);
-       vlan_set_tci(skb, 0);
+       skb->mac_len -= ETH_HLEN;

        OVS_CB(skb)->is_layer3 = true;

--- a/datapath/flow_netlink.c
+++ b/datapath/flow_netlink.c
@@ -1831,6 +1831,8 @@ static int __ovs_nla_copy_actions(const struct nlattr *attr,
                case OVS_ACTION_ATTR_POP_ETH:
                        if (noeth)
                                return -EINVAL;
+                       if (vlan_tci != htons(0))
+                               return -EINVAL;
                        noeth = true;
                        break;

I think is_layer3 in the control block should not be used on output anymore, since we cannot guarantee it is correct. So another change I suggest is removing the check for a packet being layer 2 or layer 3 in the various ${vport}-send() functions, and keep its use limited to receiving.

--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -258,6 +258,7 @@ static void push_eth(struct sk_buff *skb, const
struct
ovs_action_push_eth *ethh
            skb_push(skb, ETH_HLEN);
            skb_reset_mac_header(skb);
            skb_reset_mac_len(skb);
+       vlan_set_tci(skb, 0);
I don't think it's right to throw away the vlan tag. My assumption is
that it should be placed into the packet before we push on the
Ethernet header (this actually should never happen since we disallow
pushing multiple Ethernet headers but it still seems like the right
thing to do).

That's not what I would expect from a simple push_eth() action.  If
someone
wanted to push a VLAN tag, that should be an explicit action.  So I
would
just throw away whatever there is in skb->vlan_tci.
The contents of skb->vlan_tci were already pushed explicitly (look at
the implementation of push_vlan()). From a user's perspective, there
is no difference - it is just an internal optimization to pull the tag
out of the packet, so I can't see any justification for throwing away
that tag.

OK.  I will remove that line.
But just to be clear, I think we still need to explicitly push the
vlan tag on the packet. Otherwise, it will migrate from the inner
Ethernet header to the outer.

Still not sure what I need to do here.  If skb->vlan_tci was set by a
push_vlan action, then the vlan tag should already have been pushed on the
packet, right?
Yes, the user has requested that a vlan tag be applied. skb->vlan_tci
is a form of offloading that should be transparent and will no longer
work in this case, so it needs to be de-offloaded. Please take a look
at the OVS vlan operations and other vlan code in the kernel if this
isn't clear.

I wasn't aware of how hardware offloading of VLAN tags worked. How about the following incremental?

--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -430,21 +430,38 @@ static int pop_eth(struct sk_buff *skb)
        return 0;
 }

-static void push_eth(struct sk_buff *skb, const struct ovs_action_push_eth *ethh) +static int push_eth(struct sk_buff *skb, const struct ovs_action_push_eth *ethh)
 {
+ /* De-accelerate any hardware accelerated VLAN tag added to a previous
+        * Ethernet header */
+       if (unlikely(vlan_tx_tag_present(skb))) {
+               u16 tx_tag = vlan_tx_tag_get(skb);
+
+               if (!__vlan_put_tag(skb, skb->vlan_proto, tx_tag))
+                       return -ENOMEM;
+
+               if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->csum = csum_add(skb->csum, csum_partial(skb->data
+                                       + (2 * ETH_ALEN), VLAN_HLEN, 0));
+       }
+
+       /* Add the new Ethernet header */
+       if (skb_cow_head(skb, ETH_HLEN) < 0)
+               return -ENOMEM;
+
        skb_push(skb, ETH_HLEN);
        skb_reset_mac_header(skb);
        skb_reset_mac_len(skb);

        ether_addr_copy(eth_hdr(skb)->h_source, ethh->addresses.eth_src);
        ether_addr_copy(eth_hdr(skb)->h_dest, ethh->addresses.eth_dst);
-
        eth_hdr(skb)->h_proto = ethh->eth_type;
-       skb->protocol = ethh->eth_type;

        ovs_skb_postpush_rcsum(skb, skb->data, ETH_HLEN);

+       skb->protocol = ethh->eth_type;
        OVS_CB(skb)->is_layer3 = false;
+       return 0;
 }

 static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh,
@@ -995,7 +1012,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
                        break;

                case OVS_ACTION_ATTR_PUSH_ETH:
-                       push_eth(skb, nla_data(a));
+                       err = push_eth(skb, nla_data(a));
                        break;

                case OVS_ACTION_ATTR_POP_ETH:


BTW, for a packet with multiple Ethernet headers, is there already
support
for matching on anything after the outer header?  Does that need
recirculation?
There isn't currently any support on matching on multiple headers but
it seems like a natural possibility if we introduce the capability to
push/pop Ethernet headers. Once those are there, it seems like you
could do it with recirculation.

Somewhat related, what happens if someone specifies just an EtherType
but no Ethernet addresses? Maybe we should just reject this?

In a flow key?  I don't have a strong opinion on this, but I think I
wouldn't reject it.
Is there any situation where it could be used (what meaning would it
have)?

Not that I can think of right now.


Normally we reject things that can never make sense.

If that's the policy, I'll change the implementation.
OK, I think that probably makes sense (we can loosen it later if we
come up with a use case and add support for it).

I just want to note that the push_eth action has a single struct with all
the data.  So the netlink message will always have the Ethertype AND the
Ethernet addresses.  The only thing I can think of validating here is if the
Ethernet addresses are not all zero, but I think we should allow even that.
I'm not sure that I understand this comment. When pushing on a new
Ethernet header, wouldn't you expect that both the addresses and the
EtherType is present?

Before I was talking about validating the match, not the action part.

Sorry, I got confused in my previous response. Is this the Right Way(tm) to do what you propose?

--- a/datapath/flow_netlink.c
+++ b/datapath/flow_netlink.c
@@ -136,6 +136,10 @@ static bool match_validate(const struct sw_flow_match *match,
                       | (1ULL << OVS_KEY_ATTR_IN_PORT)
                       | (1ULL << OVS_KEY_ATTR_ETHERTYPE));

+       /* If Ethertype is present, expect MAC addresses */
+       if (key_attrs & (1ULL << OVS_KEY_ATTR_ETHERTYPE))
+               key_expected |= 1ULL << OVS_KEY_ATTR_ETHERNET;
+
        /* Check key attributes. */
        if (match->key->eth.type == htons(ETH_P_ARP)
                        || match->key->eth.type == htons(ETH_P_RARP)) {

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to