Hi Jesse,
I hope you had a great vacation.
Now that you're back, please take a look at suggested fixes before I
formally submit v3 of the L3 patch for review.
On 1/6/14, 7:55 PM, Jesse Gross wrote:
On Tue, Dec 24, 2013 at 9:02 AM, Lorand Jakab<loja...@cisco.com> wrote:
Implementation of the pop_eth and push_eth actions in the kernel, also
layer 3 flow support. Jesse Gross provided feedback on a previous
version of this RFC patch, all of those comments are resolved here.
Signed-off-by: Lorand Jakab<loja...@cisco.com>
Minor thing on the commit message: the history of the review process
doesn't usually belong there since nobody really cares once the patch
has been committed. You can put things like that using three dashes
after the commit message, which will cause git to automatically trim
them out when the patch is applied.
Right, I forgot that, will rephrase.
diff --git a/datapath/actions.c b/datapath/actions.c
index 30ea1d2..b90e715 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
+static int pop_eth(struct sk_buff *skb)
+{
+ skb_pull(skb, skb_network_offset(skb));
+ return 0;
+}
I think there needs to be some additional validation in this area. If
there's no Ethernet header then this will be a no-op but generally we
reject such actions at flow installation time rather than ignoring at
runtime. Furthermore, what does it mean for the validation of existing
L2 actions like set Ethernet or push vlan, which assume that an
Ethernet header is always present?
Shouldn't we also update skb->csum?
How about this incremental?
diff --git a/datapath/actions.c b/datapath/actions.c
index 802abce..adddadc 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -145,6 +145,9 @@ static int set_eth_addr(struct sk_buff *skb,
static int pop_eth(struct sk_buff *skb)
{
+ if (skb->ip_summed == CHECKSUM_COMPLETE)
+ skb->csum = csum_sub(skb->csum, csum_partial(skb->data,
+ skb_network_offset(skb), 0));
skb_pull(skb, skb_network_offset(skb));
return 0;
}
diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
index 7d193d8..8f0d5ab 100644
--- a/datapath/flow_netlink.c
+++ b/datapath/flow_netlink.c
@@ -1289,6 +1289,14 @@ static int validate_tp_port(const struct
sw_flow_key *flow_key)
return -EINVAL;
}
+static int validate_eth_present(const struct sw_flow_key *flow_key)
+{
+ if (flow_key->phy.noeth)
+ return -EINVAL;
+
+ return 0;
+}
+
void ovs_match_init(struct sw_flow_match *match,
struct sw_flow_key *key,
struct sw_flow_mask *mask)
@@ -1352,7 +1360,12 @@ static int validate_set(const struct nlattr *a,
case OVS_KEY_ATTR_PRIORITY:
case OVS_KEY_ATTR_SKB_MARK:
+ break;
+
case OVS_KEY_ATTR_ETHERNET:
+ err = validate_eth_present(flow_key);
+ if (err)
+ return err;
break;
case OVS_KEY_ATTR_TUNNEL:
@@ -1507,6 +1520,9 @@ int ovs_nla_copy_actions(const struct nlattr *attr,
case OVS_ACTION_ATTR_POP_ETH:
+ err = validate_eth_present(key);
+ if (err)
+ return err;
break;
case OVS_ACTION_ATTR_PUSH_ETH:
@@ -1516,6 +1532,9 @@ int ovs_nla_copy_actions(const struct nlattr *attr,
break;
case OVS_ACTION_ATTR_PUSH_VLAN:
+ err = validate_eth_present(key);
+ if (err)
+ return err;
vlan = nla_data(a);
if (vlan->vlan_tpid != htons(ETH_P_8021Q))
return -EINVAL;
+static int push_eth(struct sk_buff *skb, const struct ovs_action_push_eth
*ethh)
+{
+ int err;
+
+ skb_push(skb, ETH_HLEN);
+ skb_reset_mac_header(skb);
+
+ err = set_eth_addr(skb, ðh->addresses);
+ if (unlikely(err))
+ return err;
I would just do the memcpy directly here. set_eth_addr() has a bunch
of extra things that make it not very efficient for what we are trying
to do. memcpy would also remove the error path.
OK.
Should we update skb->protocol here?
Right, will do that, thanks!
diff --git a/datapath/flow.h b/datapath/flow.h
index eafcfd8..df2fb05 100644
--- a/datapath/flow.h
+++ b/datapath/flow.h
struct sw_flow_key {
struct ovs_key_ipv4_tunnel tun_key; /* Encapsulating tunnel key. */
+ bool noeth; /* Packet has no Ethernet header */
struct {
u32 priority; /* Packet QoS priority. */
u32 skb_mark; /* SKB mark. */
I would put the noeth bool inside of struct phy so that we don't
introduce additional holes in the flow structure. I think it also
logically fits in there.
Will fix.
diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
index 9b26528..8f71c49 100644
--- a/datapath/flow_netlink.c
+++ b/datapath/flow_netlink.c
@@ -521,6 +522,8 @@ static int ovs_key_from_nlattrs(struct sw_flow_match
*match, bool *exact_5tuple
SW_FLOW_KEY_MEMCPY(match, eth.dst,
eth_key->eth_dst, ETH_ALEN, is_mask);
attrs &= ~(1ULL << OVS_KEY_ATTR_ETHERNET);
+ } else {
+ SW_FLOW_KEY_PUT(match, noeth, true, is_mask);
}
This macro sets not only the value but also ranges of the key that are
significant. Therefore, we should explicitly set the value in all
cases, rather than assuming the default is false.
I think there's a potential for ambiguity here when megaflows are
used. If there is no Ethernet attribute does that mean that it must be
an L3 packet or does it mean that it is wildcarded as 'don-t care'? As
currently implemented, I believe that it would result in the former
but that would change behavior.
How about the following instead:
SW_FLOW_KEY_MEMCPY(match, eth.dst,
eth_key->eth_dst, ETH_ALEN, is_mask);
+ SW_FLOW_KEY_PUT(match, phy.noeth, false, is_mask);
attrs &= ~(1ULL << OVS_KEY_ATTR_ETHERNET);
- } else {
+ } else if (!is_mask)
SW_FLOW_KEY_PUT(match, phy.noeth, true, is_mask);
- }
@@ -951,7 +954,7 @@ int ovs_nla_put_flow(const struct sw_flow_key *swkey,
const struct sw_flow_key *output, struct sk_buff *skb)
{
struct ovs_key_ethernet *eth_key;
- struct nlattr *nla, *encap;
+ struct nlattr *nla, *encap = NULL;
I think we can remove the explicit setting of encap to NULL later on
in the code now, it's a little cleaner.
OK.
+noethernet:
if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE, output->eth.type))
goto nla_put_failure;
Does it still make sense to send the EtherType? It's not present in
the packet and I believe that the information is contained in the
attributes that we do send (i.e. IPv4 or v6 attributes).
We had a discussion about this in August last year:
One such decision is how to handle the flow key. I set all fields in
key->eth to 0, except the type, because we still need to know what kind
of L3 packet do we have. Since a lot of code is accessing
key->eth.type, this is easier than having this information in a
different place, although it would be more elegant to set this field to
0 as well.
Can we use skb->protocol for the cases where we currently access the
EtherType? Are there cases that you are particularly concerned about?
The only case I'm concerned about is in the action validation code,
particularly the validate_set() and validate_tp_port() functions, since
they
don't have access to the skb, and need to know the layer 3 protocol of
the
flow. In validate_set() we could relax the check for eth.type for
OVS_KEY_ATTR_IPV4 and OVS_KEY_ATTR_IPV6, but I'm not sure what to do
about
validate_tp_port().
In general, I think it would be a good idea for the flow key to have a
field
specifying the layer 3 protocol, when an ethernet header is not present.
That makes sense to me.
You mean that we keep eth.type as the L3 protocol field, or define a new
one, separate from the eth union?
I think it's fine to keep using eth.type as the protocol field. I
think we can probably some holes to move the is_layer3 flag into the
non-tunnel portion of the flow though.
Should we revisit this discussion?
@@ -1543,6 +1552,13 @@ int ovs_nla_copy_actions(const struct nlattr *attr,
break;
+ case OVS_ACTION_ATTR_POP_ETH:
+ break;
+
+ case OVS_ACTION_ATTR_PUSH_ETH:
+ /* TODO May need to validate eth_type? */
I think it's not strictly required for correctness of operation in OVS
(meaning it won't cause us to crash and anything else that looks at
the packet should do validation first like any packet coming off the
wire). Simon did some work in this area with MPLS (where it is
required for correctness) and it's pretty complicated so I would be
inclined to not do anything here unless we find a compelling reason.
I'll remove the comment.
diff --git a/datapath/vport-gre.c b/datapath/vport-gre.c
index 8737b63..2737cd2 100644
--- a/datapath/vport-gre.c
+++ b/datapath/vport-gre.c
@@ -112,6 +112,7 @@ static int gre_rcv(struct sk_buff *skb,
key = key_to_tunnel_id(tpi->key, tpi->seq);
ovs_flow_tun_key_init(&tun_key, ip_hdr(skb), key,
filter_tnl_flags(tpi->flags));
+ OVS_CB(skb)->is_layer3 = false;
ovs_vport_receive(vport, skb, &tun_key);
I would make is_layer3 an argument to ovs_vport_receive(). That way
it's impossible to forget to initialize it (I think it's missing from
vport-internal_dev.c).
Yeap, it was missing. I'll follow your recommendation.
diff --git a/datapath/vport-lisp.c b/datapath/vport-lisp.c
index c2698ae..b56eec7 100644
--- a/datapath/vport-lisp.c
+++ b/datapath/vport-lisp.c
On the transmit path, we still assume that there might be an L2 header
and pull it off. Can we enforce a restriction that it must be an L3
packet at this point?
Good point, we should definitely do that. I was thinking that we can
use OVS_CB(skb)->pkt_key.phy.noeth in lisp_send() (and lisp_rcv()) for
detecting packet type, is that ok?
Conversely, what about L2 ports that get packets
without an L2 header?
That should not normally happen, as I replied to Ben in the user space
part. However, in case it does, since it's unexpected behavior, should
we drop?
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev