On Aug 11, 2014, at 5:39 PM, Jesse Gross <je...@nicira.com> wrote:

> On Mon, Aug 11, 2014 at 9:15 AM, Jarno Rajahalme <jrajaha...@nicira.com> 
> wrote:
>> Masked set action allows more megaflow wildcarding.  Masked set action
>> is now supported for all writeable key types, except for the tunnel
>> key.
>> 
>> The set tunnel action is an exception as any input tunnel info is
>> cleared before action processing starts, so there is no tunnel info to
>> mask.
>> 
>> The kernel module converts all (non-tunnel) set actions to masked set
>> actions.  This makes action processing more uniform, and results in
>> less branching and duplicating the action processing code.
>> 
>> Signed-off-by: Jarno Rajahalme <jrajaha...@nicira.com>
> 
> This triggers some sparse warnings that would be nice to avoid:
> 
>  CHECK   /home/jesse/openvswitch/datapath/linux/actions.c
> /home/jesse/openvswitch/datapath/linux/actions.c:498:44: warning:
> incorrect type in return expression (different base types)
> /home/jesse/openvswitch/datapath/linux/actions.c:498:44:    expected bool
> /home/jesse/openvswitch/datapath/linux/actions.c:498:44:    got
> restricted __be32
> 

Thanks!

>> diff --git a/datapath/actions.c b/datapath/actions.c
>> index 25c5d77..d57d779 100644
>> --- a/datapath/actions.c
>> +++ b/datapath/actions.c
>> @@ -432,35 +462,47 @@ static int set_ipv4(struct sk_buff *skb, const struct 
>> ovs_key_ipv4 *ipv4_key)
>> 
>>        nh = ip_hdr(skb);
>> 
>> -       if (ipv4_key->ipv4_src != nh->saddr) {
>> -               set_ip_addr(skb, nh, &nh->saddr, ipv4_key->ipv4_src);
>> -               flow_key_set_ipv4_src(skb, ipv4_key->ipv4_src);
>> -       }
>> +       /* Setting an IP addresses is typically only a side effect of
>> +        * matching on them in the current userspace implementation, so it
>> +        * makes sense to check if the value actually changed. */
>> +       if (mask->ipv4_src) {
>> +               new_addr = MASKED(nh->saddr, key->ipv4_src, mask->ipv4_src);
>> 
>> -       if (ipv4_key->ipv4_dst != nh->daddr) {
> 
> Here we are checking first that the mask is non-zero and then later
> that the values are different. In other places we do only the second
> step. What is the rationale?
> 

If checksumming is required, then it makes sense to check if the value actually 
changed. If masking is not very cheap, then it makes sense to check if the mask 
is non-zero before doing the masking, especially if it can be expected that the 
mask is typically zero.

>From the perspective of masked set operation the cleanest thing to do is to 
>just check the mask, as that carries the intent to change the field value. 
>However, in the current userspace implementation we set the mask also when 
>matching on the field, not just when it is set, which makes it rather likely 
>that the value does not actually change for fields like IP destination 
>address. Therefore that additional cheek for the actual change.

In other places (like in when setting ports) there are fewer fields, so the 
likelihood that a mask for a field is zero is lower so I figured there is no 
need to check for the mask values, as masking with a zero mask will keep the 
current value. I’ll add comment explaining this.

>> +/* Mask is at the midpoint of the data. */
>> +#define get_mask(a, type) \
>> +       ((const type *)((const char *)nla_data(a) + nla_len(a)))
> 
> This doesn't seem right to me. From a netlink perspective, the
> attribute length covers the whole thing, both the value and mask. So
> won't this always find the end of the attribute, not the middle?

Oops :-) This should work better:

#define get_mask(a, type) ((const type *)nla_data(a) + 1)

> 
>> diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
>> index e525c9d..937f4d6 100644
>> --- a/datapath/flow_netlink.c
>> +++ b/datapath/flow_netlink.c
>> @@ -1543,42 +1563,62 @@ static int validate_set(const struct nlattr *a,
>>                break;
>> 
>>        case OVS_KEY_ATTR_TUNNEL:
>> -               *set_tun = true;
>> +               if (masked)
>> +                       return -EINVAL; /* Masked tunnel set not supported. 
>> */
>> +               *skip_copy = true;
>>                err = validate_and_copy_set_tun(a, sfa);
>>                if (err)
>>                        return err;
>> -               break;
>> +               return 0;
> 
> Returning 0 here makes me a little nervous - if we ever add more
> validation at the end of the function, it might be silently skipped.

Reverted, and added a check below so that we do not convert set tunnels to 
masked sets.

> 
>>        case OVS_KEY_ATTR_IPV4:
> [...]
>> -               if (ipv4_key->ipv4_frag != flow_key->ip.frag)
>> -                       return -EINVAL;
>> +                       /* Non-writeable fields. */
>> +                       if (mask->ipv4_proto || mask->ipv4_frag)
>> +                               return -EINVAL;
>> +               } else {
>> +                       if (!flow_key->ip.proto)
>> +                               return -EINVAL;
> 
> I believe that this check on ip.proto is being used to verify that the
> IP header is actually present, so this would mean that we can write
> off the end of the packet in the masked case.

But this is at flow set-up time, and the mask could still wildcard the ip.proto 
field, so it could be zero or missing in an actual packet. Doesn’t the 
make_writeable take care of the verification?:

static int set_ipv4(struct sk_buff *skb, const struct ovs_key_ipv4 *key,
                    const struct ovs_key_ipv4 *mask)
{
        struct iphdr *nh;
        __be32 new_addr;
        int err;

        err = make_writable(skb, skb_network_offset(skb) +
                                 sizeof(struct iphdr));
        if (unlikely(err))
                return err;

        nh = ip_hdr(skb);


> 
>>        case OVS_KEY_ATTR_IPV6:
> [...]
>> -               if (ipv6_key->ipv6_frag != flow_key->ip.frag)
>> -                       return -EINVAL;
>> +                       /* Non-writeable fields. */
>> +                       if (mask->ipv6_proto || mask->ipv6_frag)
>> +                               return -EINVAL;
>> +
>> +                       /* Invalid bits in the flow label mask? */
>> +                       if (ntohl(mask->ipv6_label) & 0xFFF00000)
>> +                               return -EINVAL;
>> +               } else {
>> +                       if (!flow_key->ip.proto)
>> +                               return -EINVAL;
> 
> Same issue with header validation here.
> 
>> @@ -1605,12 +1649,43 @@ static int validate_set(const struct nlattr *a,
>> +       /* Convert non-masked set actions to masked set actions.
>> +        * Tunnel set action returns before getting here. */
>> +       if (!masked) {
>> +               int start, len = key_len * 2;
>> +               struct nlattr *at;
>> +
>> +               *skip_copy = true;
>> +
>> +               start = add_nested_action_start(sfa,
>> +                                               OVS_ACTION_ATTR_SET_MASKED);
>> +               if (start < 0)
>> +                       return start;
>> +
>> +               at = reserve_sfa_size(sfa, nla_attr_size(len));
>> +               if (IS_ERR(at))
>> +                       return PTR_ERR(at);
>> +
>> +               at->nla_type = key_type;
>> +               at->nla_len = nla_attr_size(len);
> 
> I think this can be simplified by using add_action() or __add_action().
> 

Done.

> What about the reverse direction? I think this will return masked
> actions to old userspace, which presumable will choke on them.
> 

We should convert the actions back, but we also need to reject fully masked set 
actions to not confuse newer userspaces.

>> diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
>> b/datapath/linux/compat/include/linux/openvswitch.h
>> index 9ea1f37..a8b318a 100644
>> --- a/datapath/linux/compat/include/linux/openvswitch.h
>> +++ b/datapath/linux/compat/include/linux/openvswitch.h
>> @@ -590,6 +590,12 @@ struct ovs_action_hash {
>>  * indicate the new packet contents. This could potentially still be
>>  * %ETH_P_MPLS if the resulting MPLS label stack is not empty.  If there
>>  * is no MPLS label stack, as determined by ethertype, no action is taken.
>> + * @OVS_ACTION_ATTR_SET_MASKED: Replaces the contents of an existing 
>> header.  A
>> + * nested %OVS_KEY_ATTR_* attribute specifies a header to modify, its value,
>> + * and a mask.  For every bit set in the mask, the corresponding bit value
>> + * is copied from the value to the packet header field, rest of the bits are
>> + * left unchanged.  The non-masked value bits must be passed in as zeroes.
>> + * Masking is not supported for the %OVS_KEY_ATTR_TUNNEL attribute.
> 
> It is probably a good idea to add the attribute and the comment in the
> same patch. Also, I don't know how Pravin feels but in general it is a
> little easier if the kernel code is in a single patch. Otherwise, this
> will have to be transformed before upstreaming.

OK.

  Jarno

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to