Re: [ovs-dev] [PATCH] RFC: Geneve OpenFlow support.

Jesse Gross Wed, 04 Feb 2015 23:22:08 -0800

On Fri, Jan 23, 2015 at 10:10 AM, Madhu Challa <cha...@noironetworks.com> wrote:
> This commit adds support for Geneve flow mods via two nicira extensions.
> NXM_NX_TUN_METADATA for being able to match on tun_metadata nxms and
> NXAST_RAW_TUN_METADATA for tun_metadata actions.


This is great, thanks!

> The other side affect of keeping a table is that when flows are deleted
> we cannot necesarily remove and resuse the holes in the option space without
> additional complexity. I have not yet figured out how to maintain a reference
> count of the flows that are using a particular entry.

Hmm, this is both nasty and important. I don't a good idea on how to
solve this at the moment, hopefully someone else does...

> ovs-ofctl is expanded to be able to pass multiple tun_metadata options.
>
> example:
> ovs-ofctl add-flow br0 
> "tun_id=0x32,tun_src=12.1.1.5,tun_metadata=1234561122334455667788,tun_metadata=123478aabbccdd,in_port=1
>  actions=output:2"

My guess is that we probably need some additional parsing/formatting
support in order to make this usable to humans. Parsing is definitely
useful for small scale testing (i.e. so we can write a flow that
matches on class=0xa, type=0xb, value=0xc without breaking out a
calculator). However, probably even more important is formatting
because while these flows can obviously be generated by a controller
without too much issue, they will be really hard to debug.

> Actions
> =======
> A new action tun_metadata is added. Multiple of these actions can appear in
> a flowmod. The actions consult the same table to figure out mappings.

The biggest concern that I have with this patch is how to handle
actions. I think it's probably important to be able to load non-static
data into a field - something that is derived from a register or
protocol field. Do you have any ideas how this might be possible?

I'm not sure that I fully understand the reason that actions need to
consult the table since there's no lookup involved. I vaguely remember
discussing sorting them to avoid different orders but a table seems a
little bit like an overkill. Can you describe it a little more?

Can you also update the man pages and add some unit tests?

I hope that Ben or Jarno will comment on the metaflow and NXM parts
when the time is right, so I mostly focused my comments on the ODP and
tun metadata areas.

> diff --git a/lib/meta-flow.h b/lib/meta-flow.h
> index 4a6c443..4646650 100644
> --- a/lib/meta-flow.h
> +++ b/lib/meta-flow.h
> @@ -1377,6 +1377,27 @@ enum OVS_PACKED_ENUM mf_field_id {
>       */
>      MFF_ND_TLL,
>
> +    /*
> +     * "tun_metadata".
> +     *
> +     * Encap metadata for tunnels.
> +     *
> +     * Each NXM can carry upto 255 bytes of encap metadata when not masked or
> +     * upto 127 bytes of encap metadata followed by equal length mask when
> +     * masked.

It seems vaguely weird to allow different lengths depending on whether
there is a mask or not. I know that this is due to NXM encoding
limitations and not something that we're explicitly enforcing but it
still seems a little odd.

> +     * Type: bytestring.
> +     * Maskable: bitwise.
> +     * Formatting: tun_metadata.
> +     * Prerequisites: none.
> +     * Access: read/write.

My guess is that we probably can't usefully write to the field at
moment (I think).

> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index 37d73a9..37e2ec3 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
>  #define GENEVE_OPT(class, type) ((OVS_FORCE uint32_t)(class) << 8 | (type))
> -static int
> +static int OVS_UNUSED
>  parse_geneve_opts(const struct nlattr *attr)

If parse_geneve_opts() isn't actually useful, we can just dump it. It
was just a placeholder, really.

> @@ -1354,12 +1355,11 @@ odp_tun_key_from_attr(const struct nlattr *attr, 
> struct flow_tnl *tun)
>              tun->flags |= FLOW_TNL_F_OAM;
>              break;
>          case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS: {
> -            if (parse_geneve_opts(a)) {
> -                return ODP_FIT_ERROR;
> -            }
>              /* It is necessary to reproduce options exactly (including order)
>               * so it's easiest to just echo them back. */

This comment is referring to setting unknown = true, so we can remove
it. In the current OVS implementation, everything is echoed back (this
wasn't true before).

> diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
> index e694fd9..70ecf92 100644
> --- a/lib/ofp-actions.c
> +++ b/lib/ofp-actions.c
> +/* Action structure for NXAST_TUN_METADATA. */
> +struct nx_action_tun_metadata {
> +    ovs_be16 type;
> +    ovs_be16 len;
> +    ovs_be32 vendor;
> +    ovs_be16 subtype;
> +    uint8_t data_len;
> +    uint8_t zero[4];
> +    uint8_t data[256];
> +};

Do we need the separate data_len field? Isn't it implied by the earlier len?

> diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c
> new file mode 100644
> index 0000000..3748bdf
> --- /dev/null
> +++ b/lib/tun-metadata.c

> +/* callers responsibility to verify entry->len is same as len.
> + * returns an entry or null. */
> +static struct tun_meta_entry *
> +tun_meta_add(struct tun_meta_table *table, uint32_t key, uint16_t len)
> +{
> +    struct tun_meta_entry *entry = tun_meta_find_key(table, key);
> +    if (entry == NULL) {
> +        ovs_mutex_lock(&table->mutex);

This seems potentially race-y to me. Can't two threads do a lookup and
find nothing and then both try to insert?

> +static void
> +tun_meta_remove_all(struct unixctl_conn *conn, int argc OVS_UNUSED,
> +                   const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
> +{
> +    struct tun_meta_table *table = &tun_meta_table;
> +    struct tun_meta_entry *entry;
> +
> +    if (!initialized) {
> +        tun_meta_init();

I guess there's no real point in initializing things now.

> +static void
> +tun_meta_print_keys(struct unixctl_conn *conn, int argc OVS_UNUSED,
> +                    const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
> +{
> +    struct ds ds = DS_EMPTY_INITIALIZER;
> +    const struct tun_meta_table *table = &tun_meta_table;
> +    struct cmap_cursor cursor;
> +    struct tun_meta_entry *entry;
> +    const struct cmap *key_cmap = &table->key_cmap;
> +    ds_put_cstr(&ds,"     Key      Len   Offset\n");
> +    ds_put_cstr(&ds,"==========================\n");
> +    CMAP_CURSOR_FOR_EACH(entry, key_node, &cursor, key_cmap) {
> +        ds_put_format(&ds, "%8x %8d %8d\n", entry->key, entry->len, 
> entry->ofs);

We might want to decode the key a little further so it's possible to
see what's going on by looking at it.

> +void
> +tun_meta_init(void)
> +{
> +    struct tun_meta_table *table = &tun_meta_table;
> +    cmap_init(&table->key_cmap);
> +    cmap_init(&table->ofs_cmap);
> +    ovs_mutex_init(&table->mutex);
> +    table->next_ofs = 0;
> +    table->count = 0;
> +    unixctl_command_register("tnl/meta/show", "", 0, 0, tun_meta_print_keys, 
> NULL);
> +    unixctl_command_register("tnl/meta/flush", "", 0, 0, 
> tun_meta_remove_all, NULL);

It's a little annoying that the unixctl commands aren't available
until the first tunnel flow is added. I would guess that the behavior
is non-obvious to the user.

> +void
> +tun_meta_destroy(void)
> +{

I don't think that this ever gets called. At a minimum, it could
probably be used by tun_meta_remove_all().

> +#define TUN_META_KEY(metadata) \
> +    ((metadata)[0] << 16 | (metadata)[1] << 8 | (metadata)[2])

I think we probably should make the parsing of the metadata (which
part is the option class, type, value) a little more explicit or at
least document it more clearly.

> +/* copies a single tun_meta entry at the correct offset in
> + * metadata and updates the map if needed. */
> +void match_set_tun_metadata(struct match *match,
> +                            const uint8_t metadata[TUN_METADATA_LEN],
> +                            int len)

I think we probably need a more graceful way of handling table full
conditions. Ideally, we would reject flow installation.

> +int
> +geneve_nlattr_to_tun_metadata(const struct nlattr *attr,
> +                              uint8_t metadata[TUN_METADATA_LEN])
> +{
> +    int opts_len = nl_attr_get_size(attr);
> +    const struct geneve_opt *opt = nl_attr_get(attr);
> +
> +    while (opts_len > 0) {
> +        int len;
> +        struct tun_meta_entry *entry;
> +
> +        if (opts_len < sizeof(*opt)) {
> +            return -EINVAL;
> +        }
> +
> +        len = sizeof(*opt) + opt->length * 4;
> +        if (len > opts_len) {
> +            return -EINVAL;
> +        }
> +
> +        entry = find_tun_meta_entry((uint8_t *)opt);
> +        if (entry && entry->len == (len - 1)) {

If there is an entry and the length doesn't match should we drop
rather than ignore the option?

> +int tun_metadata_to_geneve_nlattr(const uint8_t metadata[TUN_METADATA_LEN],
> +                                  uint8_t nlattr[TUN_METADATA_LEN])
> +{

I wonder if it makes more sense to do this processing at OpenFlow
installation time? It would seem to be more efficient than kernel flow
setup time. But I guess this is still pending based on the larger
question of how to best handle actions.

> +            if ((len - 3) % 4) {
> +                mofs += len;
> +                ovs_assert(mofs < TUN_METADATA_LEN);
> +                continue;
> +            }

It seems like we should complain more in this case. I think ideally we
would reject the flow installation.

> diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h
> new file mode 100644
> index 0000000..4a58232
> --- /dev/null
> +++ b/lib/tun-metadata.h
> @@ -0,0 +1,44 @@
> +/*
> + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.

Wrong company :)

> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 0786513..5eb8f74 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -4113,6 +4115,17 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t 
> ofpacts_len,
>          case OFPACT_SAMPLE:
>              xlate_sample_action(ctx, ofpact_get_SAMPLE(a));
>              break;
> +
> +        case OFPACT_TUN_METADATA: {
> +            struct ofpact_tun_metadata *otm = ofpact_get_TUN_METADATA(a);
> +            int ofs = action_find_or_add_tun_meta_entry(otm->data,
> +                                                        otm->data_len);
> +            if (ofs != -1) {
> +                memcpy(flow->tunnel.metadata + ofs, otm->data, 
> otm->data_len);
> +                memset(wc->masks.tunnel.metadata + ofs, 0xff, otm->data_len);

I don't think that we need to set the wildcards here, they don't
really have any effect for tunneling metadata.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH] RFC: Geneve OpenFlow support.

Reply via email to