On Fri, Jan 23, 2015 at 10:10 AM, Madhu Challa <cha...@noironetworks.com> wrote: > This commit adds support for Geneve flow mods via two nicira extensions. > NXM_NX_TUN_METADATA for being able to match on tun_metadata nxms and > NXAST_RAW_TUN_METADATA for tun_metadata actions.
This is great, thanks! > The other side affect of keeping a table is that when flows are deleted > we cannot necesarily remove and resuse the holes in the option space without > additional complexity. I have not yet figured out how to maintain a reference > count of the flows that are using a particular entry. Hmm, this is both nasty and important. I don't a good idea on how to solve this at the moment, hopefully someone else does... > ovs-ofctl is expanded to be able to pass multiple tun_metadata options. > > example: > ovs-ofctl add-flow br0 > "tun_id=0x32,tun_src=12.1.1.5,tun_metadata=1234561122334455667788,tun_metadata=123478aabbccdd,in_port=1 > actions=output:2" My guess is that we probably need some additional parsing/formatting support in order to make this usable to humans. Parsing is definitely useful for small scale testing (i.e. so we can write a flow that matches on class=0xa, type=0xb, value=0xc without breaking out a calculator). However, probably even more important is formatting because while these flows can obviously be generated by a controller without too much issue, they will be really hard to debug. > Actions > ======= > A new action tun_metadata is added. Multiple of these actions can appear in > a flowmod. The actions consult the same table to figure out mappings. The biggest concern that I have with this patch is how to handle actions. I think it's probably important to be able to load non-static data into a field - something that is derived from a register or protocol field. Do you have any ideas how this might be possible? I'm not sure that I fully understand the reason that actions need to consult the table since there's no lookup involved. I vaguely remember discussing sorting them to avoid different orders but a table seems a little bit like an overkill. Can you describe it a little more? Can you also update the man pages and add some unit tests? I hope that Ben or Jarno will comment on the metaflow and NXM parts when the time is right, so I mostly focused my comments on the ODP and tun metadata areas. > diff --git a/lib/meta-flow.h b/lib/meta-flow.h > index 4a6c443..4646650 100644 > --- a/lib/meta-flow.h > +++ b/lib/meta-flow.h > @@ -1377,6 +1377,27 @@ enum OVS_PACKED_ENUM mf_field_id { > */ > MFF_ND_TLL, > > + /* > + * "tun_metadata". > + * > + * Encap metadata for tunnels. > + * > + * Each NXM can carry upto 255 bytes of encap metadata when not masked or > + * upto 127 bytes of encap metadata followed by equal length mask when > + * masked. It seems vaguely weird to allow different lengths depending on whether there is a mask or not. I know that this is due to NXM encoding limitations and not something that we're explicitly enforcing but it still seems a little odd. > + * Type: bytestring. > + * Maskable: bitwise. > + * Formatting: tun_metadata. > + * Prerequisites: none. > + * Access: read/write. My guess is that we probably can't usefully write to the field at moment (I think). > diff --git a/lib/odp-util.c b/lib/odp-util.c > index 37d73a9..37e2ec3 100644 > --- a/lib/odp-util.c > +++ b/lib/odp-util.c > #define GENEVE_OPT(class, type) ((OVS_FORCE uint32_t)(class) << 8 | (type)) > -static int > +static int OVS_UNUSED > parse_geneve_opts(const struct nlattr *attr) If parse_geneve_opts() isn't actually useful, we can just dump it. It was just a placeholder, really. > @@ -1354,12 +1355,11 @@ odp_tun_key_from_attr(const struct nlattr *attr, > struct flow_tnl *tun) > tun->flags |= FLOW_TNL_F_OAM; > break; > case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS: { > - if (parse_geneve_opts(a)) { > - return ODP_FIT_ERROR; > - } > /* It is necessary to reproduce options exactly (including order) > * so it's easiest to just echo them back. */ This comment is referring to setting unknown = true, so we can remove it. In the current OVS implementation, everything is echoed back (this wasn't true before). > diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c > index e694fd9..70ecf92 100644 > --- a/lib/ofp-actions.c > +++ b/lib/ofp-actions.c > +/* Action structure for NXAST_TUN_METADATA. */ > +struct nx_action_tun_metadata { > + ovs_be16 type; > + ovs_be16 len; > + ovs_be32 vendor; > + ovs_be16 subtype; > + uint8_t data_len; > + uint8_t zero[4]; > + uint8_t data[256]; > +}; Do we need the separate data_len field? Isn't it implied by the earlier len? > diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c > new file mode 100644 > index 0000000..3748bdf > --- /dev/null > +++ b/lib/tun-metadata.c > +/* callers responsibility to verify entry->len is same as len. > + * returns an entry or null. */ > +static struct tun_meta_entry * > +tun_meta_add(struct tun_meta_table *table, uint32_t key, uint16_t len) > +{ > + struct tun_meta_entry *entry = tun_meta_find_key(table, key); > + if (entry == NULL) { > + ovs_mutex_lock(&table->mutex); This seems potentially race-y to me. Can't two threads do a lookup and find nothing and then both try to insert? > +static void > +tun_meta_remove_all(struct unixctl_conn *conn, int argc OVS_UNUSED, > + const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED) > +{ > + struct tun_meta_table *table = &tun_meta_table; > + struct tun_meta_entry *entry; > + > + if (!initialized) { > + tun_meta_init(); I guess there's no real point in initializing things now. > +static void > +tun_meta_print_keys(struct unixctl_conn *conn, int argc OVS_UNUSED, > + const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED) > +{ > + struct ds ds = DS_EMPTY_INITIALIZER; > + const struct tun_meta_table *table = &tun_meta_table; > + struct cmap_cursor cursor; > + struct tun_meta_entry *entry; > + const struct cmap *key_cmap = &table->key_cmap; > + ds_put_cstr(&ds," Key Len Offset\n"); > + ds_put_cstr(&ds,"==========================\n"); > + CMAP_CURSOR_FOR_EACH(entry, key_node, &cursor, key_cmap) { > + ds_put_format(&ds, "%8x %8d %8d\n", entry->key, entry->len, > entry->ofs); We might want to decode the key a little further so it's possible to see what's going on by looking at it. > +void > +tun_meta_init(void) > +{ > + struct tun_meta_table *table = &tun_meta_table; > + cmap_init(&table->key_cmap); > + cmap_init(&table->ofs_cmap); > + ovs_mutex_init(&table->mutex); > + table->next_ofs = 0; > + table->count = 0; > + unixctl_command_register("tnl/meta/show", "", 0, 0, tun_meta_print_keys, > NULL); > + unixctl_command_register("tnl/meta/flush", "", 0, 0, > tun_meta_remove_all, NULL); It's a little annoying that the unixctl commands aren't available until the first tunnel flow is added. I would guess that the behavior is non-obvious to the user. > +void > +tun_meta_destroy(void) > +{ I don't think that this ever gets called. At a minimum, it could probably be used by tun_meta_remove_all(). > +#define TUN_META_KEY(metadata) \ > + ((metadata)[0] << 16 | (metadata)[1] << 8 | (metadata)[2]) I think we probably should make the parsing of the metadata (which part is the option class, type, value) a little more explicit or at least document it more clearly. > +/* copies a single tun_meta entry at the correct offset in > + * metadata and updates the map if needed. */ > +void match_set_tun_metadata(struct match *match, > + const uint8_t metadata[TUN_METADATA_LEN], > + int len) I think we probably need a more graceful way of handling table full conditions. Ideally, we would reject flow installation. > +int > +geneve_nlattr_to_tun_metadata(const struct nlattr *attr, > + uint8_t metadata[TUN_METADATA_LEN]) > +{ > + int opts_len = nl_attr_get_size(attr); > + const struct geneve_opt *opt = nl_attr_get(attr); > + > + while (opts_len > 0) { > + int len; > + struct tun_meta_entry *entry; > + > + if (opts_len < sizeof(*opt)) { > + return -EINVAL; > + } > + > + len = sizeof(*opt) + opt->length * 4; > + if (len > opts_len) { > + return -EINVAL; > + } > + > + entry = find_tun_meta_entry((uint8_t *)opt); > + if (entry && entry->len == (len - 1)) { If there is an entry and the length doesn't match should we drop rather than ignore the option? > +int tun_metadata_to_geneve_nlattr(const uint8_t metadata[TUN_METADATA_LEN], > + uint8_t nlattr[TUN_METADATA_LEN]) > +{ I wonder if it makes more sense to do this processing at OpenFlow installation time? It would seem to be more efficient than kernel flow setup time. But I guess this is still pending based on the larger question of how to best handle actions. > + if ((len - 3) % 4) { > + mofs += len; > + ovs_assert(mofs < TUN_METADATA_LEN); > + continue; > + } It seems like we should complain more in this case. I think ideally we would reject the flow installation. > diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h > new file mode 100644 > index 0000000..4a58232 > --- /dev/null > +++ b/lib/tun-metadata.h > @@ -0,0 +1,44 @@ > +/* > + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc. Wrong company :) > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > index 0786513..5eb8f74 100644 > --- a/ofproto/ofproto-dpif-xlate.c > +++ b/ofproto/ofproto-dpif-xlate.c > @@ -4113,6 +4115,17 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t > ofpacts_len, > case OFPACT_SAMPLE: > xlate_sample_action(ctx, ofpact_get_SAMPLE(a)); > break; > + > + case OFPACT_TUN_METADATA: { > + struct ofpact_tun_metadata *otm = ofpact_get_TUN_METADATA(a); > + int ofs = action_find_or_add_tun_meta_entry(otm->data, > + otm->data_len); > + if (ofs != -1) { > + memcpy(flow->tunnel.metadata + ofs, otm->data, > otm->data_len); > + memset(wc->masks.tunnel.metadata + ofs, 0xff, otm->data_len); I don't think that we need to set the wildcards here, they don't really have any effect for tunneling metadata. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev