On Fri, Jan 23, 2015 at 10:10 AM, Madhu Challa <[email protected]> wrote:
> This commit adds support for Geneve flow mods via two nicira extensions.
> NXM_NX_TUN_METADATA for being able to match on tun_metadata nxms and
> NXAST_RAW_TUN_METADATA for tun_metadata actions.
This is great, thanks!
> The other side affect of keeping a table is that when flows are deleted
> we cannot necesarily remove and resuse the holes in the option space without
> additional complexity. I have not yet figured out how to maintain a reference
> count of the flows that are using a particular entry.
Hmm, this is both nasty and important. I don't a good idea on how to
solve this at the moment, hopefully someone else does...
> ovs-ofctl is expanded to be able to pass multiple tun_metadata options.
>
> example:
> ovs-ofctl add-flow br0
> "tun_id=0x32,tun_src=12.1.1.5,tun_metadata=1234561122334455667788,tun_metadata=123478aabbccdd,in_port=1
> actions=output:2"
My guess is that we probably need some additional parsing/formatting
support in order to make this usable to humans. Parsing is definitely
useful for small scale testing (i.e. so we can write a flow that
matches on class=0xa, type=0xb, value=0xc without breaking out a
calculator). However, probably even more important is formatting
because while these flows can obviously be generated by a controller
without too much issue, they will be really hard to debug.
> Actions
> =======
> A new action tun_metadata is added. Multiple of these actions can appear in
> a flowmod. The actions consult the same table to figure out mappings.
The biggest concern that I have with this patch is how to handle
actions. I think it's probably important to be able to load non-static
data into a field - something that is derived from a register or
protocol field. Do you have any ideas how this might be possible?
I'm not sure that I fully understand the reason that actions need to
consult the table since there's no lookup involved. I vaguely remember
discussing sorting them to avoid different orders but a table seems a
little bit like an overkill. Can you describe it a little more?
Can you also update the man pages and add some unit tests?
I hope that Ben or Jarno will comment on the metaflow and NXM parts
when the time is right, so I mostly focused my comments on the ODP and
tun metadata areas.
> diff --git a/lib/meta-flow.h b/lib/meta-flow.h
> index 4a6c443..4646650 100644
> --- a/lib/meta-flow.h
> +++ b/lib/meta-flow.h
> @@ -1377,6 +1377,27 @@ enum OVS_PACKED_ENUM mf_field_id {
> */
> MFF_ND_TLL,
>
> + /*
> + * "tun_metadata".
> + *
> + * Encap metadata for tunnels.
> + *
> + * Each NXM can carry upto 255 bytes of encap metadata when not masked or
> + * upto 127 bytes of encap metadata followed by equal length mask when
> + * masked.
It seems vaguely weird to allow different lengths depending on whether
there is a mask or not. I know that this is due to NXM encoding
limitations and not something that we're explicitly enforcing but it
still seems a little odd.
> + * Type: bytestring.
> + * Maskable: bitwise.
> + * Formatting: tun_metadata.
> + * Prerequisites: none.
> + * Access: read/write.
My guess is that we probably can't usefully write to the field at
moment (I think).
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index 37d73a9..37e2ec3 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> #define GENEVE_OPT(class, type) ((OVS_FORCE uint32_t)(class) << 8 | (type))
> -static int
> +static int OVS_UNUSED
> parse_geneve_opts(const struct nlattr *attr)
If parse_geneve_opts() isn't actually useful, we can just dump it. It
was just a placeholder, really.
> @@ -1354,12 +1355,11 @@ odp_tun_key_from_attr(const struct nlattr *attr,
> struct flow_tnl *tun)
> tun->flags |= FLOW_TNL_F_OAM;
> break;
> case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS: {
> - if (parse_geneve_opts(a)) {
> - return ODP_FIT_ERROR;
> - }
> /* It is necessary to reproduce options exactly (including order)
> * so it's easiest to just echo them back. */
This comment is referring to setting unknown = true, so we can remove
it. In the current OVS implementation, everything is echoed back (this
wasn't true before).
> diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
> index e694fd9..70ecf92 100644
> --- a/lib/ofp-actions.c
> +++ b/lib/ofp-actions.c
> +/* Action structure for NXAST_TUN_METADATA. */
> +struct nx_action_tun_metadata {
> + ovs_be16 type;
> + ovs_be16 len;
> + ovs_be32 vendor;
> + ovs_be16 subtype;
> + uint8_t data_len;
> + uint8_t zero[4];
> + uint8_t data[256];
> +};
Do we need the separate data_len field? Isn't it implied by the earlier len?
> diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c
> new file mode 100644
> index 0000000..3748bdf
> --- /dev/null
> +++ b/lib/tun-metadata.c
> +/* callers responsibility to verify entry->len is same as len.
> + * returns an entry or null. */
> +static struct tun_meta_entry *
> +tun_meta_add(struct tun_meta_table *table, uint32_t key, uint16_t len)
> +{
> + struct tun_meta_entry *entry = tun_meta_find_key(table, key);
> + if (entry == NULL) {
> + ovs_mutex_lock(&table->mutex);
This seems potentially race-y to me. Can't two threads do a lookup and
find nothing and then both try to insert?
> +static void
> +tun_meta_remove_all(struct unixctl_conn *conn, int argc OVS_UNUSED,
> + const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
> +{
> + struct tun_meta_table *table = &tun_meta_table;
> + struct tun_meta_entry *entry;
> +
> + if (!initialized) {
> + tun_meta_init();
I guess there's no real point in initializing things now.
> +static void
> +tun_meta_print_keys(struct unixctl_conn *conn, int argc OVS_UNUSED,
> + const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
> +{
> + struct ds ds = DS_EMPTY_INITIALIZER;
> + const struct tun_meta_table *table = &tun_meta_table;
> + struct cmap_cursor cursor;
> + struct tun_meta_entry *entry;
> + const struct cmap *key_cmap = &table->key_cmap;
> + ds_put_cstr(&ds," Key Len Offset\n");
> + ds_put_cstr(&ds,"==========================\n");
> + CMAP_CURSOR_FOR_EACH(entry, key_node, &cursor, key_cmap) {
> + ds_put_format(&ds, "%8x %8d %8d\n", entry->key, entry->len,
> entry->ofs);
We might want to decode the key a little further so it's possible to
see what's going on by looking at it.
> +void
> +tun_meta_init(void)
> +{
> + struct tun_meta_table *table = &tun_meta_table;
> + cmap_init(&table->key_cmap);
> + cmap_init(&table->ofs_cmap);
> + ovs_mutex_init(&table->mutex);
> + table->next_ofs = 0;
> + table->count = 0;
> + unixctl_command_register("tnl/meta/show", "", 0, 0, tun_meta_print_keys,
> NULL);
> + unixctl_command_register("tnl/meta/flush", "", 0, 0,
> tun_meta_remove_all, NULL);
It's a little annoying that the unixctl commands aren't available
until the first tunnel flow is added. I would guess that the behavior
is non-obvious to the user.
> +void
> +tun_meta_destroy(void)
> +{
I don't think that this ever gets called. At a minimum, it could
probably be used by tun_meta_remove_all().
> +#define TUN_META_KEY(metadata) \
> + ((metadata)[0] << 16 | (metadata)[1] << 8 | (metadata)[2])
I think we probably should make the parsing of the metadata (which
part is the option class, type, value) a little more explicit or at
least document it more clearly.
> +/* copies a single tun_meta entry at the correct offset in
> + * metadata and updates the map if needed. */
> +void match_set_tun_metadata(struct match *match,
> + const uint8_t metadata[TUN_METADATA_LEN],
> + int len)
I think we probably need a more graceful way of handling table full
conditions. Ideally, we would reject flow installation.
> +int
> +geneve_nlattr_to_tun_metadata(const struct nlattr *attr,
> + uint8_t metadata[TUN_METADATA_LEN])
> +{
> + int opts_len = nl_attr_get_size(attr);
> + const struct geneve_opt *opt = nl_attr_get(attr);
> +
> + while (opts_len > 0) {
> + int len;
> + struct tun_meta_entry *entry;
> +
> + if (opts_len < sizeof(*opt)) {
> + return -EINVAL;
> + }
> +
> + len = sizeof(*opt) + opt->length * 4;
> + if (len > opts_len) {
> + return -EINVAL;
> + }
> +
> + entry = find_tun_meta_entry((uint8_t *)opt);
> + if (entry && entry->len == (len - 1)) {
If there is an entry and the length doesn't match should we drop
rather than ignore the option?
> +int tun_metadata_to_geneve_nlattr(const uint8_t metadata[TUN_METADATA_LEN],
> + uint8_t nlattr[TUN_METADATA_LEN])
> +{
I wonder if it makes more sense to do this processing at OpenFlow
installation time? It would seem to be more efficient than kernel flow
setup time. But I guess this is still pending based on the larger
question of how to best handle actions.
> + if ((len - 3) % 4) {
> + mofs += len;
> + ovs_assert(mofs < TUN_METADATA_LEN);
> + continue;
> + }
It seems like we should complain more in this case. I think ideally we
would reject the flow installation.
> diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h
> new file mode 100644
> index 0000000..4a58232
> --- /dev/null
> +++ b/lib/tun-metadata.h
> @@ -0,0 +1,44 @@
> +/*
> + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
Wrong company :)
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 0786513..5eb8f74 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -4113,6 +4115,17 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t
> ofpacts_len,
> case OFPACT_SAMPLE:
> xlate_sample_action(ctx, ofpact_get_SAMPLE(a));
> break;
> +
> + case OFPACT_TUN_METADATA: {
> + struct ofpact_tun_metadata *otm = ofpact_get_TUN_METADATA(a);
> + int ofs = action_find_or_add_tun_meta_entry(otm->data,
> + otm->data_len);
> + if (ofs != -1) {
> + memcpy(flow->tunnel.metadata + ofs, otm->data,
> otm->data_len);
> + memset(wc->masks.tunnel.metadata + ofs, 0xff, otm->data_len);
I don't think that we need to set the wildcards here, they don't
really have any effect for tunneling metadata.
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev