On Fri, Jun 19, 2015 at 04:13:24PM -0700, Jesse Gross wrote:
> The current support for Geneve in OVS is exactly equivalent to VXLAN:
> it is possible to set and match on the VNI but not on any options
> contained in the header. This patch enables the use of options.
> 
> The goal for Geneve support is not to add support for any particular option
> but to allow end users or controllers to specify what they would like to
> match. That is, the full range of Geneve's capabilities should be exposed
> without modifying the code (the one exception being options that require
> per-packet computation in the fast path).
> 
> The main issue with supporting Geneve options is how to integrate the
> fields into the existing OpenFlow pipeline. All existing operations
> are referred to by their NXM/OXM field name - matches, action generation,
> arithmetic operations (i.e. tranfer to a register). However, the Geneve
> option space is exactly the same as the OXM space, so a direct mapping
> is not feasible. Instead, we create a pool of 64 NXMs that are then
> dynamically mapped on Geneve option TLVs using OpenFlow. Once mapped,
> these fields become first-class citizens in the OpenFlow pipeline.
> 
> An example of how to use Geneve options:
> ovs-ofctl add-geneve-map br0 {class=0xffff,type=0,len=4}->tun_metadata0
> ovs-ofctl add-flow br0 
> in_port=LOCAL,actions=set_field:0xffffffff->tun_metadata0,1
> 
> This will add a 4 bytes option (filled will all 1's) to all packets
> coming from the LOCAL port and then send then out to port 1.
> 
> A limitation of this patch is that although the option table is specified
> for a particular switch over OpenFlow, it is currently global to all
> switches. This will be addressed in a future patch.

Out of curiosity, when you say that do you mean "This is left as future
work" or "I'm cleaning up a patch as we speak" or somewhere in between?

> Based on work originally done by Madhu Challa.
> 
> Signed-off-by: Jesse Gross <je...@nicira.com>

Among this series, this patch is clearly the money shot.


Bugs
----

Building on i386, this makes the compiler really angry at me because
struct tun_metadata is an odd multiple of 32 bits in size, which makes
struct flow an odd multiple of 32 bits in size and fails the flow.h
assertion
BUILD_ASSERT_DECL(sizeof(struct flow) % sizeof(uint64_t) == 0);

This fixes it but obviously it's not the right fix:

diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h
index 8ba0757..4e1e84a 100644
--- a/lib/tun-metadata.h
+++ b/lib/tun-metadata.h
@@ -37,6 +37,7 @@ struct tun_metadata {
     uint8_t opts[TUN_METADATA_TOT_OPT_SIZE];
     uint64_t opt_map;
     struct tun_table *tab;
+    uint8_t pad[4];
 };
 BUILD_ASSERT_DECL(sizeof(((struct tun_metadata *)0)->opt_map) * 8 >=
                   TUN_METADATA_NUM_OPTS);

I see a lot of use of "1 << (something)" in tun-metadata, but I believe
that these need to use 1ULL or UINT64_C(1) instead.

I'm a little confused about the alloc_offset field in struct
tun_metadata_allocation.  It seems to be a byte offset but
metadata_loc_from_match() compares it against TUN_METADATA_NUM_OPTS.  Is
that a typo for TUN_METADATA_TOT_OPT_SIZE?


Other
-----

tun-metadata.c has a few hard tabs.

Fragmented tunnel metadata is nasty.  Is there a way we can avoid it?

I like the implementation approach used in ULONG_FOR_EACH_1 better than
the one in PRESENT_OPT_FOR_EACH, since the user doesn't have to provide
a scratch variable.  Also it protects the macro arguments better.
Actually ULONG_FOR_EACH_1 could be made to work with 32- or 64-bit maps
just by changing "unsigned long" to "uint64_t" in its definition; maybe
we should.

I'm not certain whether the tun_metadata layer intends to limit itself
to option values that are a multiple of 4 bytes (as Geneve has) or
whether it's supposed to be an abstraction that doesn't care.

tun_meta_find_key() could probably be a tiny bit faster by switching
from HMAP_FOR_EACH_WITH_HASH to HMAP_FOR_EACH_IN_BUCKET.

I spent some of my time studying the code by annotating it with
comments.  I'm appending a patch, hope you'll adopt a lot of it.

Acked-by: Ben Pfaff <b...@nicira.com>

--8<--------------------------cut here-------------------------->8--

diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c
index b09085e..eb7724c 100644
--- a/lib/tun-metadata.c
+++ b/lib/tun-metadata.c
@@ -31,16 +31,22 @@
 #include "tun-metadata.h"
 
 struct tun_meta_entry {
-    struct hmap_node node;
-    uint32_t key; /* Class and Type */
+    struct hmap_node node;      /* In struct tun_table's key_hmap. */
+    uint32_t key;               /* (class << 16) | type. */
     struct tun_metadata_loc loc;
-    bool valid;
+    bool valid;                 /* True if allocated to a class and type. */
 };
 
+/* Maps from Geneve option class+type to positions in a struct tun_metadata's
+ * 'opts' array.  */
 struct tun_table {
+    /* TUN_METADATA<i> is stored in element <i>. */
     struct tun_meta_entry entries[TUN_METADATA_NUM_OPTS];
-    /* Each bit represents 4 bytes of allocated space */
+
+    /* Each bit represents 4 bytes of space, 0-bits are free space. */
     unsigned long alloc_map[BITMAP_N_LONGS(TUN_METADATA_TOT_OPT_SIZE / 4)];
+
+    /* The valid elements in entries[], indexed by class+type. */
     struct hmap key_hmap;
 };
 BUILD_ASSERT_DECL(TUN_METADATA_TOT_OPT_SIZE % 4 == 0);
@@ -81,6 +87,8 @@ tun_key_type(uint32_t key)
     return key & 0xff;
 }
 
+/* Returns a newly allocated tun_table.  If 'old_map' is nonnull then the new
+ * tun_table is a deep copy of the old one. */
 static struct tun_table *
 table_alloc(const struct tun_table *old_map) OVS_REQUIRES(tab_mutex)
 {
@@ -114,6 +122,7 @@ table_alloc(const struct tun_table *old_map) 
OVS_REQUIRES(tab_mutex)
     return new_map;
 }
 
+/* Frees 'map' and all the memory it owns. */
 static void
 table_free(struct tun_table *map) OVS_REQUIRES(tab_mutex)
 {
@@ -130,6 +139,7 @@ table_free(struct tun_table *map) OVS_REQUIRES(tab_mutex)
     free(map);
 }
 
+/* Creates a global tunnel metadata mapping table, if none already exists. */
 void
 tun_metadata_init(void)
 {
@@ -221,6 +231,13 @@ tun_metadata_table_request(struct 
ofputil_geneve_table_reply *gtr)
     }
 }
 
+/* Copies the value of field 'mf' from 'metadata' into 'value'.
+ *
+ * 'mf' must be an MFF_TUN_METADATA* field.
+ *
+ * This uses the global tunnel metadata mapping table created by
+ * tun_metadata_init().  If no such table has been created or if 'mf' hasn't
+ * been allocated in it yet, this just zeros 'value'. */
 void
 tun_metadata_read(const struct tun_metadata *metadata,
                   const struct mf_field *mf, union mf_value *value)
@@ -241,6 +258,13 @@ tun_metadata_read(const struct tun_metadata *metadata,
                          metadata, loc);
 }
 
+/* Copies 'value' into field 'mf' in 'metadata'.
+ *
+ * 'mf' must be an MFF_TUN_METADATA* field.
+ *
+ * This uses the global tunnel metadata mapping table created by
+ * tun_metadata_init().  If no such table has been created or if 'mf' hasn't
+ * been allocated in it yet, this function does nothing. */
 void
 tun_metadata_write(struct tun_metadata *metadata,
                    const struct mf_field *mf, const union mf_value *value)
@@ -289,6 +313,21 @@ metadata_loc_from_match(struct tun_table *map, struct 
match *match,
     return &match->tun_md.loc[idx];
 }
 
+/* Makes 'match' match 'value'/'mask' on field 'mf'.
+ *
+ * 'mf' must be an MFF_TUN_METADATA* field.
+ *
+ * If there is global tunnel metadata matching table, this function is
+ * effective only if there is already a mapping for 'mf'.  Otherwise, the
+ * metadata mapping table integrated into 'match' is used, adding 'mf' to its
+ * mapping table if it isn't already mapped (and if there is room).  If 'mf'
+ * isn't or can't be mapped, this function returns without modifying 'match'.
+ *
+ * 'value' may be NULL; if so, then 'mf' is made to match on an all-zeros
+ * value.
+ *
+ * 'mask' may be NULL; if so, then 'mf' is made exact-match.
+ */
 void
 tun_metadata_set_match(const struct mf_field *mf, const union mf_value *value,
                        const union mf_value *mask, struct match *match)
@@ -333,6 +372,7 @@ tun_metadata_set_match(const struct mf_field *mf, const 
union mf_value *value,
     memcpy_to_metadata(&match->wc.masks.tunnel.metadata, data.tun_metadata, 
loc);
 }
 
+/* Copies all MFF_TUN_METADATA* fields from 'metadata' to 'flow_metadata'. */
 void
 tun_metadata_get_fmd(const struct tun_metadata *metadata,
                      struct match *flow_metadata)
diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h
index 8ba0757..28ad9e4 100644
--- a/lib/tun-metadata.h
+++ b/lib/tun-metadata.h
@@ -33,10 +33,18 @@ struct tun_table;
 
 #define TUN_METADATA_NUM_OPTS 64
 #define TUN_METADATA_TOT_OPT_SIZE 256
+
+/* Geneve tunnel option data, plus metadata to aid in their interpretation.
+ *
+ * 'opt_map' is indexed by type, that is, by the <i> in TUN_METADATA<i>, so
+ * that e.g. TUN_METADATA5 is present if 'opt_map & (1ULL << 5)' is nonzero.
+ * The actual data for TUN_METADATA5, if present, might be anywhere in 'opts'
+ * (not necessarily even contiguous), and finding it requires referring to
+ * 'tab'. */
 struct tun_metadata {
-    uint8_t opts[TUN_METADATA_TOT_OPT_SIZE];
-    uint64_t opt_map;
-    struct tun_table *tab;
+    uint8_t opts[TUN_METADATA_TOT_OPT_SIZE]; /* Values from tunnel TLVs. */
+    uint64_t opt_map;                        /* 1-bit for each present TLV. */
+    struct tun_table *tab;      /* Types & lengths for 'opts' and 'opt_map'. */
 };
 BUILD_ASSERT_DECL(sizeof(((struct tun_metadata *)0)->opt_map) * 8 >=
                   TUN_METADATA_NUM_OPTS);
@@ -46,16 +54,16 @@ BUILD_ASSERT_DECL(sizeof(((struct tun_metadata 
*)0)->opt_map) * 8 >=
  * linked list of these blocks. */
 struct tun_metadata_loc_chain {
     struct tun_metadata_loc_chain *next;
-    uint8_t offset;
-    uint8_t len;
+    uint8_t offset;       /* In bytes, from start of 'opts', multiple of 4.  */
+    uint8_t len;          /* In bytes, multiple of 4. */
 };
 
 struct tun_metadata_loc {
-    int len;
+    int len;                    /* Sum of 'len' over elements in chain. */
     struct tun_metadata_loc_chain c;
 };
 
-/* Allocation of options to be used inside a match structure. This is
+/* Allocation of options inside struct match.  This is
  * important if we don't have access to a global allocation table - either
  * because there isn't one (ovs-ofctl) or if we need to keep the allocation
  * outside of packet processing context (Packet-In). These structures never
@@ -63,8 +71,8 @@ struct tun_metadata_loc {
  * fragmented. */
 struct tun_metadata_allocation {
     struct tun_metadata_loc loc[TUN_METADATA_NUM_OPTS];
-    uint8_t alloc_offset;
-    bool valid;
+    uint8_t alloc_offset;       /* Byte offset into 'opts', multiple of 4.  */
+    bool valid;                 /* Set to true after any allocation occurs. */
 };
 
 void tun_metadata_init(void);
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to