This commit adds support for Geneve flow mods via two Nicira extensions: NXM_NX_TUN_METADATA, for matching on tun_metadata NXMs, and NXAST_RAW_TUN_METADATA, for tun_metadata actions.

Matching
========

On the match side, multiple NXM_NX_TUN_METADATA NXMs can appear in a flow mod in any order. tun_metadata options are learned as flows get created. The option space is shared between all the flows for both match and actions. Since a flow can, in the worst case, contain all possible options, the maximum space reserved for the options is limited to 255 bytes.

tun_metadata options are hashed into two tables:

1. maps (class, type) -> (len, offset)
   Duplicates are not allowed, thus the len is unique to a particular (class, type).
2. maps (offset) -> (class, type, len)

The flow stores the OXM data as it appears (without the length). Given this, and the fact that the metadata in the flow can contain holes, the second table is used to figure out how much to skip to decode the next option.

One side effect of maintaining the tables is that commands like ovs-dpctl that talk directly to the datapath do not have access to the table and cannot use it to dump flows. Since the datapath flow was already created and the netlink message contains the length of the option, this issue can be handled without needing to consult the table. I plan to implement this via a separate commit.

Two new commands are added to ovs-appctl to dump and flush the table:

    tnl/meta/flush
    tnl/meta/show

The other side effect of keeping a table is that when flows are deleted we cannot necessarily remove and reuse the holes in the option space without additional complexity. I have not yet figured out how to maintain a reference count of the flows that are using a particular entry.

ovs-ofctl is expanded to be able to pass multiple tun_metadata options.

example:
ovs-ofctl add-flow br0 "tun_id=0x32,tun_src=12.1.1.5,tun_metadata=1234561122334455667788,tun_metadata=123478aabbccdd,in_port=1 actions=output:2"

Actions
=======

A new action, tun_metadata, is added. Multiple tun_metadata actions can appear in a flow mod. The actions consult the same table to figure out mappings.
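
To illustrate the two mappings described under Matching, which the actions also consult, here is a minimal sketch. It is not the code in lib/tun-metadata.c: it uses fixed arrays and linear scans in place of the cmaps and mutex, and every name in it (opt_entry, opt_table, opt_learn and so on) is hypothetical. It only shows how learning an option assigns it an offset, and how the offset mapping lets a decoder step over holes.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_SPACE_LEN 255  /* option space limit from the description */
#define MAX_OPTS 32        /* hypothetical cap on learned options */

/* One learned option; all names here are illustrative. */
struct opt_entry {
    uint16_t opt_class;
    uint8_t type;
    uint8_t len;    /* bytes stored in the flow for this option */
    uint16_t ofs;   /* offset within the shared option space */
};

struct opt_table {
    struct opt_entry entries[MAX_OPTS];
    int n;          /* number of learned options */
    int next_ofs;   /* first unused byte of the option space */
};

/* Mapping 1: (class, type) -> entry.  A linear scan stands in for a hash. */
static const struct opt_entry *
opt_find_key(const struct opt_table *t, uint16_t opt_class, uint8_t type)
{
    for (int i = 0; i < t->n; i++) {
        const struct opt_entry *e = &t->entries[i];
        if (e->opt_class == opt_class && e->type == type) {
            return e;
        }
    }
    return NULL;
}

/* Mapping 2: offset -> entry, used to find how far to skip while decoding. */
static const struct opt_entry *
opt_find_ofs(const struct opt_table *t, uint16_t ofs)
{
    for (int i = 0; i < t->n; i++) {
        if (t->entries[i].ofs == ofs) {
            return &t->entries[i];
        }
    }
    return NULL;
}

/* Learns an option on first use.  Returns its offset, or -1 if the length
 * conflicts with an earlier entry or the option space is exhausted. */
static int
opt_learn(struct opt_table *t, uint16_t opt_class, uint8_t type, uint8_t len)
{
    const struct opt_entry *old = opt_find_key(t, opt_class, type);
    if (old) {
        return old->len == len ? old->ofs : -1;
    }
    if (len == 0 || t->n == MAX_OPTS || t->next_ofs + len > OPT_SPACE_LEN) {
        return -1;
    }
    struct opt_entry *e = &t->entries[t->n++];
    e->opt_class = opt_class;
    e->type = type;
    e->len = len;
    e->ofs = (uint16_t) t->next_ofs;
    t->next_ofs += len;
    return e->ofs;
}

/* Walks the option space by offset and counts options whose bytes are not
 * all zero, skipping holes left by options this flow does not use. */
static int
opt_count_present(const struct opt_table *t, const uint8_t *space)
{
    int present = 0;
    const struct opt_entry *e;
    for (uint16_t ofs = 0; (e = opt_find_ofs(t, ofs)) != NULL; ofs += e->len) {
        for (int i = 0; i < e->len; i++) {
            if (space[ofs + i]) {
                present++;
                break;
            }
        }
    }
    return present;
}
```

In this sketch a second flow that reuses a known (class, type) gets the same offset back, and a flow that uses only the second learned option leaves the first option's bytes as a zero-filled hole that the walk skips by its recorded length. Entries are never freed here, which mirrors the open question above about reclaiming holes when flows are deleted.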

example:
ovs-ofctl add-flow br0 "in_port=2 actions=set_tunnel:0x32,load:0xc010105->NXM_NX_TUN_IPV4_DST[],tun_metadata=1234561122334455667788,tun_metadata=123478aabbccdd,output:1"

Usage
=====

tun_metadata options are intended to be used for Geneve and are designed as such. However, they provide an alternate way to configure other types of tunnels. Consumers of this API need to provide:

1. datapath (like the geneve kmod)
2. nlattr_to_tun_metadata: function to convert netlink attributes provided by the datapath into tun metadata used for matching.
3. tun_metadata_to_nlattr: function to convert tun metadata into netlink attributes for the particular datapath.

TODO list:
1. add a flag to flow to optimize the check for is_all_zeros
2. handle dpctl dump-flows
3. remove the hardcoded
   if (nxm_class(header) == 1 && nxm_field(header) == 38) {
   in nxm_field_by_header().

Signed-off-by: Madhu Challa <cha...@noironetworks.com>
---
 build-aux/extract-ofp-fields | 6 +-
 lib/automake.mk | 2 +
 lib/flow.c | 13 +-
 lib/flow.h | 4 +-
 lib/match.c | 22 ++-
 lib/meta-flow.c | 152 +++++++++++++++--
 lib/meta-flow.h | 48 ++++--
 lib/nx-match.c | 64 ++++++-
 lib/odp-util.c | 20 ++-
 lib/odp-util.h | 2 +-
 lib/ofp-actions.c | 78 ++++++++-
 lib/ofp-actions.h | 10 ++
 lib/ofp-parse.c | 7 +-
 lib/ofp-util.c | 2 +-
 lib/packets.h | 3 +
 lib/tun-metadata.c | 384 ++++++++++++++++++++++++++++++++++++++++++
 lib/tun-metadata.h | 44 +++++
 ofproto/ofproto-dpif-xlate.c | 15 +-
 tests/ofproto.at | 7 +-
 19 files changed, 820 insertions(+), 63 deletions(-)
 create mode 100644 lib/tun-metadata.c
 create mode 100644 lib/tun-metadata.h

diff --git a/build-aux/extract-ofp-fields b/build-aux/extract-ofp-fields
index b15b01d..415b9e3 100755
--- a/build-aux/extract-ofp-fields
+++ b/build-aux/extract-ofp-fields
@@ -19,7 +19,8 @@ TYPES = {"u8": 1,
          "be32": 4,
          "MAC": 6,
          "be64": 8,
-         "IPv6": 16}
+         "IPv6": 16,
+         "bytestring": 255}
 
 FORMATTING = {"decimal": ("MFS_DECIMAL", 1, 8),
               "hexadecimal": ("MFS_HEXADECIMAL", 1, 8),
@@ -30,7 +31,8 @@ FORMATTING = 
{"decimal": ("MFS_DECIMAL", 1, 8), "OpenFlow 1.1+ port": ("MFS_OFP_PORT_OXM", 4, 4), "frag": ("MFS_FRAG", 1, 1), "tunnel flags": ("MFS_TNL_FLAGS", 2, 2), - "TCP flags": ("MFS_TCP_FLAGS", 2, 2)} + "TCP flags": ("MFS_TCP_FLAGS", 2, 2), + "tun_metadata": ("MFS_TUN_METADATA", 1, 255)} PREREQS = {"none": "MFP_NONE", "ARP": "MFP_ARP", diff --git a/lib/automake.mk b/lib/automake.mk index 2a5844b..c112c59 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -245,6 +245,8 @@ lib_libopenvswitch_la_SOURCES = \ lib/tnl-ports.c \ lib/tnl-ports.h \ lib/token-bucket.c \ + lib/tun-metadata.c \ + lib/tun-metadata.h \ lib/type-props.h \ lib/unaligned.h \ lib/unicode.c \ diff --git a/lib/flow.c b/lib/flow.c index 43bb003..d963e0b 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -119,7 +119,7 @@ struct mf_ctx { * away. Some GCC versions gave warnings on ALWAYS_INLINE, so these are * defined as macros. */ -#if (FLOW_WC_SEQ != 30) +#if (FLOW_WC_SEQ != 31) #define MINIFLOW_ASSERT(X) ovs_assert(X) BUILD_MESSAGE("FLOW_WC_SEQ changed: miniflow_extract() will have runtime " "assertions enabled. Consider updating FLOW_WC_SEQ after " @@ -762,7 +762,7 @@ flow_unwildcard_tp_ports(const struct flow *flow, struct flow_wildcards *wc) void flow_get_metadata(const struct flow *flow, struct flow_metadata *fmd) { - BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); + BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); fmd->dp_hash = flow->dp_hash; fmd->recirc_id = flow->recirc_id; @@ -909,7 +909,7 @@ void flow_wildcards_init_for_packet(struct flow_wildcards *wc, memset(&wc->masks, 0x0, sizeof wc->masks); /* Update this function whenever struct flow changes. 
*/ - BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); + BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); if (flow->tunnel.ip_dst) { if (flow->tunnel.flags & FLOW_TNL_F_KEY) { @@ -922,6 +922,7 @@ void flow_wildcards_init_for_packet(struct flow_wildcards *wc, WC_MASK_FIELD(wc, tunnel.ip_ttl); WC_MASK_FIELD(wc, tunnel.tp_src); WC_MASK_FIELD(wc, tunnel.tp_dst); + WC_MASK_FIELD(wc, tunnel.metadata); } else if (flow->tunnel.tun_id) { WC_MASK_FIELD(wc, tunnel.tun_id); } @@ -1006,7 +1007,7 @@ uint64_t flow_wc_map(const struct flow *flow) { /* Update this function whenever struct flow changes. */ - BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); + BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); uint64_t map = (flow->tunnel.ip_dst) ? MINIFLOW_MAP(tunnel) : 0; @@ -1058,7 +1059,7 @@ void flow_wildcards_clear_non_packet_fields(struct flow_wildcards *wc) { /* Update this function whenever struct flow changes. */ - BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); + BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); memset(&wc->masks.metadata, 0, sizeof wc->masks.metadata); memset(&wc->masks.regs, 0, sizeof wc->masks.regs); @@ -1617,7 +1618,7 @@ flow_push_mpls(struct flow *flow, int n, ovs_be16 mpls_eth_type, flow->mpls_lse[0] = set_mpls_lse_values(ttl, tc, 1, htonl(label)); /* Clear all L3 and L4 fields and dp_hash. */ - BUILD_ASSERT(FLOW_WC_SEQ == 30); + BUILD_ASSERT(FLOW_WC_SEQ == 31); memset((char *) flow + FLOW_SEGMENT_2_ENDS_AT, 0, sizeof(struct flow) - FLOW_SEGMENT_2_ENDS_AT); flow->dp_hash = 0; diff --git a/lib/flow.h b/lib/flow.h index dd989ee..8c42eeb 100644 --- a/lib/flow.h +++ b/lib/flow.h @@ -38,7 +38,7 @@ struct pkt_metadata; /* This sequence number should be incremented whenever anything involving flows * or the wildcarding of flows changes. This will cause build assertion * failures in places which likely need to be updated. */ -#define FLOW_WC_SEQ 30 +#define FLOW_WC_SEQ 31 /* Number of Open vSwitch extension 32-bit registers. 
*/ #define FLOW_N_REGS 8 @@ -156,7 +156,7 @@ BUILD_ASSERT_DECL(sizeof(struct flow) % sizeof(uint64_t) == 0); /* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */ BUILD_ASSERT_DECL(offsetof(struct flow, igmp_group_ip4) + sizeof(uint32_t) == sizeof(struct flow_tnl) + 192 - && FLOW_WC_SEQ == 30); + && FLOW_WC_SEQ == 31); /* Incremental points at which flow classification may be performed in * segments. diff --git a/lib/match.c b/lib/match.c index 76ccb43..34a76e6 100644 --- a/lib/match.c +++ b/lib/match.c @@ -21,6 +21,7 @@ #include "dynamic-string.h" #include "ofp-util.h" #include "packets.h" +#include "tun-metadata.h" /* Converts the flow in 'flow' into a match in 'match', with the given * 'wildcards'. */ @@ -862,6 +863,25 @@ format_flow_tunnel(struct ds *s, const struct match *match) format_flags(s, flow_tun_flag_to_string, tnl->flags, '|'); ds_put_char(s, ','); } + if (!is_all_zeros(wc->masks.tunnel.metadata, TUN_METADATA_LEN)) { + uint16_t len, ofs = 0; + int i; + while (tun_metadata_get_len(ofs, &len)) { + if (tun_metadata_valid(tnl->metadata + ofs, len, ofs)) { + ds_put_format(s, "tun_metadata="); + for (i = 0; i < len; i++) { + ds_put_format(s, "%02"SCNx8, tnl->metadata[ofs + i]); + } + ds_put_char(s, '/'); + for (i = 0; i < len; i++) { + ds_put_format(s, "%02"SCNx8, wc->masks.tunnel.metadata[ofs + i]); + } + ds_put_char(s, ','); + } + ofs += len; + ovs_assert(ofs <= TUN_METADATA_LEN); + } + } } /* Appends a string representation of 'match' to 's'. 
If 'priority' is @@ -877,7 +897,7 @@ match_format(const struct match *match, struct ds *s, int priority) int i; - BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); + BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); if (priority != OFP_DEFAULT_PRIORITY) { ds_put_format(s, "priority=%d,", priority); diff --git a/lib/meta-flow.c b/lib/meta-flow.c index 9ce4cfe..6a404ca 100644 --- a/lib/meta-flow.c +++ b/lib/meta-flow.c @@ -35,6 +35,7 @@ #include "socket-util.h" #include "unaligned.h" #include "util.h" +#include "tun-metadata.h" #include "openvswitch/vlog.h" VLOG_DEFINE_THIS_MODULE(meta_flow); @@ -60,6 +61,10 @@ static struct shash mf_by_name; static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); static void nxm_init(void); +static void mf_initialize_exact_mask(union mf_value *value) +{ + memset(value, 0xff, sizeof *value); +} /* Returns the field with the given 'name', or a null pointer if no field has * that name. */ @@ -217,6 +222,8 @@ mf_is_all_wild(const struct mf_field *mf, const struct flow_wildcards *wc) return !wc->masks.tp_dst; case MFF_TCP_FLAGS: return !wc->masks.tcp_flags; + case MFF_TUN_METADATA: + return is_all_zeros(wc->masks.tunnel.metadata, TUN_METADATA_LEN); case MFF_N_IDS: default: @@ -314,8 +321,8 @@ mf_are_prereqs_ok(const struct mf_field *mf, const struct flow *flow) void mf_mask_field_and_prereqs(const struct mf_field *mf, struct flow *mask) { - static const union mf_value exact_match_mask = MF_EXACT_MASK_INITIALIZER; - + static union mf_value exact_match_mask; + mf_initialize_exact_mask(&exact_match_mask); mf_set_flow_value(mf, &exact_match_mask, mask); switch (mf->prereqs) { @@ -405,6 +412,7 @@ mf_is_value_valid(const struct mf_field *mf, const union mf_value *value) case MFF_ND_TARGET: case MFF_ND_SLL: case MFF_ND_TLL: + case MFF_TUN_METADATA: return true; case MFF_IN_PORT_OXM: @@ -655,6 +663,10 @@ mf_get_value(const struct mf_field *mf, const struct flow *flow, value->ipv6 = flow->nd_target; break; + case MFF_TUN_METADATA: + memcpy(value->tun_metadata, 
flow->tunnel.metadata, TUN_METADATA_LEN); + break; + case MFF_N_IDS: default: OVS_NOT_REACHED(); @@ -666,7 +678,7 @@ mf_get_value(const struct mf_field *mf, const struct flow *flow, * prerequisites. */ void mf_set_value(const struct mf_field *mf, - const union mf_value *value, struct match *match) + const union mf_value *value, struct match *match, int len) { switch (mf->id) { case MFF_DP_HASH: @@ -866,6 +878,10 @@ mf_set_value(const struct mf_field *mf, match_set_icmp_code(match, value->u8); break; + case MFF_TUN_METADATA: + match_set_tun_metadata(match, value->tun_metadata, len); + break; + case MFF_ND_TARGET: match_set_nd_target(match, &value->ipv6); break; @@ -881,7 +897,8 @@ mf_set_value(const struct mf_field *mf, void mf_mask_field(const struct mf_field *mf, struct flow *mask) { - static const union mf_value exact_match_mask = MF_EXACT_MASK_INITIALIZER; + static union mf_value exact_match_mask; + mf_initialize_exact_mask(&exact_match_mask); /* For MFF_DL_VLAN, we cannot send a all 1's to flow_set_dl_vlan() * as that will be considered as OFP10_VLAN_NONE. So consider it as a @@ -900,6 +917,7 @@ void mf_set_flow_value(const struct mf_field *mf, const union mf_value *value, struct flow *flow) { + uint16_t len, ofs; switch (mf->id) { case MFF_DP_HASH: flow->dp_hash = ntohl(value->be32); @@ -1099,6 +1117,12 @@ mf_set_flow_value(const struct mf_field *mf, flow->nd_target = value->ipv6; break; + case MFF_TUN_METADATA: + if (tun_metadata_get_lenofs(value->tun_metadata, &len, &ofs)) { + memcpy(flow->tunnel.metadata + ofs, value->tun_metadata, len); + } + break; + case MFF_N_IDS: default: OVS_NOT_REACHED(); @@ -1153,7 +1177,9 @@ mf_is_zero(const struct mf_field *mf, const struct flow *flow) * The caller is responsible for ensuring that 'match' meets 'mf''s * prerequisites. 
*/ void -mf_set_wild(const struct mf_field *mf, struct match *match) +mf_set_wild(const struct mf_field *mf, + const union mf_value *value OVS_UNUSED, + struct match *match, int len OVS_UNUSED) { switch (mf->id) { case MFF_DP_HASH: @@ -1357,6 +1383,11 @@ mf_set_wild(const struct mf_field *mf, struct match *match) memset(&match->flow.nd_target, 0, sizeof match->flow.nd_target); break; + case MFF_TUN_METADATA: + memset(match->flow.tunnel.metadata, 0, TUN_METADATA_LEN); + memset(match->wc.masks.tunnel.metadata, 0, TUN_METADATA_LEN); + break; + case MFF_N_IDS: default: OVS_NOT_REACHED(); @@ -1377,13 +1408,19 @@ mf_set_wild(const struct mf_field *mf, struct match *match) enum ofputil_protocol mf_set(const struct mf_field *mf, const union mf_value *value, const union mf_value *mask, - struct match *match) + struct match *match, int len) { - if (!mask || is_all_ones(mask, mf->n_bytes)) { - mf_set_value(mf, value, match); + if (mf->id == MFF_TUN_METADATA) { + ovs_assert(len <= mf->n_bytes); + } else { + ovs_assert(len == mf->n_bytes); + } + + if (!mask || is_all_ones(mask, len)) { + mf_set_value(mf, value, match, len); return mf->usable_protocols_exact; - } else if (is_all_zeros(mask, mf->n_bytes)) { - mf_set_wild(mf, match); + } else if (is_all_zeros(mask, len)) { + mf_set_wild(mf, value, match, len); return OFPUTIL_P_ANY; } @@ -1498,7 +1535,7 @@ mf_set(const struct mf_field *mf, case MFF_IPV6_LABEL: if ((mask->be32 & htonl(IPV6_LABEL_MASK)) == htonl(IPV6_LABEL_MASK)) { - mf_set_value(mf, value, match); + mf_set_value(mf, value, match, mf->n_bytes); } else { match_set_ipv6_label_masked(match, value->be32, mask->be32); } @@ -1536,6 +1573,11 @@ mf_set(const struct mf_field *mf, match_set_tcp_flags_masked(match, value->be16, mask->be16); break; + case MFF_TUN_METADATA: + match_set_tun_metadata_masked(match, value->tun_metadata, + mask->tun_metadata, len); + break; + case MFF_N_IDS: default: OVS_NOT_REACHED(); @@ -1951,14 +1993,66 @@ mf_from_tcp_flags_string(const char *s, 
ovs_be16 *flagsp, ovs_be16 *maskp) return NULL; } +static char * +mf_from_tun_metadata_string(const struct mf_field *mf, const char *s, + int *len, uint8_t tun_metadata[TUN_METADATA_LEN], + uint8_t mask[TUN_METADATA_LEN]) +{ + unsigned int i,j; + + ovs_assert(mf->n_bytes == TUN_METADATA_LEN - 1); + + for (i = 0; i < TUN_METADATA_LEN; i++) { + if (*s == '/' || *s == '\0') { + break; + } + if (sscanf(s, "%2"SCNx8, &tun_metadata[i]) != 1) { + if (len) { + *len = 0; + } + return xasprintf("tunnel metadata invalid"); + } else { + s += 2; + } + } + + if (len) { + *len = i; + } + + if (i < 3) { + return xasprintf("tunnel metadata too short"); + } + + if (*s == '\0') { + memset(mask, 0xff, i); + return NULL; + } + + s++; + + for (j = 0; j < i; j++) { + if (*s == '\0') { + memset(mask + j, 0, i - j); + break; + } + sscanf(s, "%2"SCNx8, &mask[j]); + s += 2 * sizeof(char); + } + + return NULL; +} /* Parses 's', a string value for field 'mf', into 'value' and 'mask'. Returns * NULL if successful, otherwise a malloc()'d string describing the error. 
*/ char * mf_parse(const struct mf_field *mf, const char *s, - union mf_value *value, union mf_value *mask) + union mf_value *value, union mf_value *mask, int *len) { char *error; + if (len) { + *len = mf->n_bytes; + } if (!strcmp(s, "*")) { memset(value, 0, mf->n_bytes); @@ -2007,6 +2101,11 @@ mf_parse(const struct mf_field *mf, const char *s, error = mf_from_tcp_flags_string(s, &value->be16, &mask->be16); break; + case MFS_TUN_METADATA: + error = mf_from_tun_metadata_string(mf, s, len, value->tun_metadata, + mask->tun_metadata); + break; + default: OVS_NOT_REACHED(); } @@ -2024,13 +2123,14 @@ mf_parse_value(const struct mf_field *mf, const char *s, union mf_value *value) { union mf_value mask; char *error; + int len; - error = mf_parse(mf, s, value, &mask); + error = mf_parse(mf, s, value, &mask, &len); if (error) { return error; } - if (!is_all_ones((const uint8_t *) &mask, mf->n_bytes)) { + if (!is_all_ones((const uint8_t *) &mask, len)) { return xasprintf("%s: wildcards not allowed here", s); } return NULL; @@ -2100,6 +2200,23 @@ mf_format_tcp_flags_string(ovs_be16 value, ovs_be16 mask, struct ds *s) TCP_FLAGS(mask)); } +static void +mf_format_tun_metadata_string(const uint8_t tun_metadata[TUN_METADATA_LEN], + const uint8_t mask[TUN_METADATA_LEN], + struct ds *s) +{ + unsigned int i; + for (i = 0; i < TUN_METADATA_LEN; i++) { + ds_put_format(s, "%02x", tun_metadata[i]); + } + if (mask == NULL) + return; + ds_put_char(s, '/'); + for (i = 0; i < TUN_METADATA_LEN; i++) { + ds_put_format(s, "%02x", mask[i]); + } +} + /* Appends to 's' a string representation of field 'mf' whose value is in * 'value' and 'mask'. 'mask' may be NULL to indicate an exact match. */ void @@ -2161,6 +2278,11 @@ mf_format(const struct mf_field *mf, mask ? 
mask->be16 : OVS_BE16_MAX, s); break; + case MFS_TUN_METADATA: + mf_format_tun_metadata_string(value->tun_metadata, mask->tun_metadata, + s); + break; + default: OVS_NOT_REACHED(); } @@ -2195,7 +2317,7 @@ mf_write_subfield(const struct mf_subfield *sf, const union mf_subvalue *x, mf_get(field, match, &value, &mask); bitwise_copy(x, sizeof *x, 0, &value, field->n_bytes, sf->ofs, sf->n_bits); bitwise_one ( &mask, field->n_bytes, sf->ofs, sf->n_bits); - mf_set(field, &value, &mask, match); + mf_set(field, &value, &mask, match, field->n_bytes); } /* Initializes 'x' to the value of 'sf' within 'flow'. 'sf' must be valid for diff --git a/lib/meta-flow.h b/lib/meta-flow.h index 4a6c443..4646650 100644 --- a/lib/meta-flow.h +++ b/lib/meta-flow.h @@ -1377,6 +1377,27 @@ enum OVS_PACKED_ENUM mf_field_id { */ MFF_ND_TLL, + /* + * "tun_metadata". + * + * Encap metadata for tunnels. + * + * Each NXM can carry upto 255 bytes of encap metadata when not masked or + * upto 127 bytes of encap metadata followed by equal length mask when + * masked. + * + * Type: bytestring. + * Maskable: bitwise. + * Formatting: tun_metadata. + * Prerequisites: none. + * Access: read/write. + * NXM: NXM_NX_TUN_METADATA(38) since v2.4. + * OXM: none. + * Prefix lookup member: tunnel.metadata. + * + */ + MFF_TUN_METADATA, + MFF_N_IDS }; @@ -1462,6 +1483,7 @@ enum OVS_PACKED_ENUM mf_string { MFS_FRAG, /* no, yes, first, later, not_later */ MFS_TNL_FLAGS, /* FLOW_TNL_F_* flags */ MFS_TCP_FLAGS, /* TCP_* flags */ + MFS_TUN_METADATA, /* tunnel metadata upto 255 bytes */ }; struct mf_field { @@ -1519,12 +1541,10 @@ union mf_value { ovs_be32 be32; ovs_be16 be16; uint8_t u8; + uint8_t tun_metadata[TUN_METADATA_LEN]; }; -BUILD_ASSERT_DECL(sizeof(union mf_value) == 16); - -/* An all-1-bits mf_value. 
Needs to be updated if struct mf_value grows.*/ -#define MF_EXACT_MASK_INITIALIZER { IN6ADDR_EXACT_INIT } -BUILD_ASSERT_DECL(sizeof(union mf_value) == sizeof(struct in6_addr)); +BUILD_ASSERT_DECL(sizeof(union mf_value) == TUN_METADATA_LEN); +BUILD_ASSERT_DECL(TUN_METADATA_LEN >= 16); /* Part of a field. */ struct mf_subfield { @@ -1539,12 +1559,13 @@ struct mf_subfield { * value" contains NXM_OF_VLAN_TCI[0..11], then one could access the * corresponding data in value.be16[7] as the bits in the mask htons(0xfff). */ union mf_subvalue { - uint8_t u8[16]; - ovs_be16 be16[8]; - ovs_be32 be32[4]; - ovs_be64 be64[2]; + uint8_t u8[TUN_METADATA_LEN]; + ovs_be16 be16[TUN_METADATA_LEN >> 1]; + ovs_be32 be32[TUN_METADATA_LEN >> 2]; + ovs_be64 be64[TUN_METADATA_LEN >> 3]; }; BUILD_ASSERT_DECL(sizeof(union mf_value) == sizeof (union mf_subvalue)); +BUILD_ASSERT_DECL(TUN_METADATA_LEN % 8 == 0); /* Finding mf_fields. */ const struct mf_field *mf_from_name(const char *name); @@ -1580,7 +1601,7 @@ bool mf_is_value_valid(const struct mf_field *, const union mf_value *value); void mf_get_value(const struct mf_field *, const struct flow *, union mf_value *value); void mf_set_value(const struct mf_field *, const union mf_value *value, - struct match *); + struct match *, int); void mf_set_flow_value(const struct mf_field *, const union mf_value *value, struct flow *); void mf_set_flow_value_masked(const struct mf_field *, @@ -1597,9 +1618,10 @@ void mf_get(const struct mf_field *, const struct match *, enum ofputil_protocol mf_set(const struct mf_field *, const union mf_value *value, const union mf_value *mask, - struct match *); + struct match *, int); -void mf_set_wild(const struct mf_field *, struct match *); +void mf_set_wild(const struct mf_field *, const union mf_value *value, + struct match *, int); /* Subfields. 
*/ void mf_write_subfield_flow(const struct mf_subfield *, @@ -1617,7 +1639,7 @@ enum ofperr mf_check_dst(const struct mf_subfield *, const struct flow *); /* Parsing and formatting. */ char *mf_parse(const struct mf_field *, const char *, - union mf_value *value, union mf_value *mask); + union mf_value *value, union mf_value *mask, int *); char *mf_parse_value(const struct mf_field *, const char *, union mf_value *); void mf_format(const struct mf_field *, const union mf_value *value, const union mf_value *mask, diff --git a/lib/nx-match.c b/lib/nx-match.c index 114c35b..6af5784 100644 --- a/lib/nx-match.c +++ b/lib/nx-match.c @@ -33,6 +33,7 @@ #include "shash.h" #include "unaligned.h" #include "util.h" +#include "tun-metadata.h" #include "openvswitch/vlog.h" VLOG_DEFINE_THIS_MODULE(nx_match); @@ -415,14 +416,13 @@ nx_pull_header(struct ofpbuf *b, const struct mf_field **field, bool *masked) } static enum ofperr -nx_pull_match_entry(struct ofpbuf *b, bool allow_cookie, +nx_pull_match_entry(struct ofpbuf *b, bool allow_cookie, uint64_t *header, const struct mf_field **field, union mf_value *value, union mf_value *mask) { enum ofperr error; - uint64_t header; - error = nx_pull_entry__(b, allow_cookie, &header, field, value, mask); + error = nx_pull_entry__(b, allow_cookie, header, field, value, mask); if (error) { return error; } @@ -459,8 +459,10 @@ nx_pull_raw(const uint8_t *p, unsigned int match_len, bool strict, union mf_value value; union mf_value mask; enum ofperr error; + uint64_t header; - error = nx_pull_match_entry(&b, cookie != NULL, &field, &value, &mask); + error = nx_pull_match_entry(&b, cookie != NULL, &header, + &field, &value, &mask); if (error) { if (error == OFPERR_OFPBMC_BAD_FIELD && !strict) { continue; @@ -476,10 +478,12 @@ nx_pull_raw(const uint8_t *p, unsigned int match_len, bool strict, } } else if (!mf_are_prereqs_ok(field, &match->flow)) { error = OFPERR_OFPBMC_BAD_PREREQ; - } else if (!mf_is_all_wild(field, &match->wc)) { + } else if 
(field->id != MFF_TUN_METADATA && + !mf_is_all_wild(field, &match->wc)) { error = OFPERR_OFPBMC_DUP_FIELD; } else { - mf_set(field, &value, &mask, match); + mf_set(field, &value, &mask, match, + MIN(field->n_bytes, nxm_field_bytes(header))); } if (error) { @@ -619,6 +623,25 @@ nxm_put(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version, } } +/* Behaves same as nxm_put except builds a header of length n_bytes */ +static void +nxm_put_variable_len(struct ofpbuf *b, enum mf_field_id field, + enum ofp_version version, const void *value, + const void *mask, size_t n_bytes) +{ + uint64_t header = mf_oxm_header(field, version); + if (!is_all_zeros(mask, n_bytes)) { + bool masked = !is_all_ones(mask, n_bytes); + header = NXM_HEADER(nxm_vendor(header), nxm_class(header), + nxm_field(header), masked, n_bytes); + nx_put_header__(b, header, masked); + ofpbuf_put(b, value, n_bytes); + if (masked) { + ofpbuf_put(b, mask, n_bytes); + } + } +} + static void nxm_put_8m(struct ofpbuf *b, enum mf_field_id field, enum ofp_version version, uint8_t value, uint8_t mask) @@ -795,6 +818,25 @@ nxm_put_ip(struct ofpbuf *b, const struct match *match, enum ofp_version oxm) } } +static void +nxm_put_tun_metadata(struct ofpbuf *b, enum ofp_version oxm, + const struct match *match) +{ + uint16_t len, ofs = 0; + const struct flow *flow = &match->flow; + const uint8_t *metadata = flow->tunnel.metadata; + const uint8_t *mask = match->wc.masks.tunnel.metadata; + + while (tun_metadata_get_len(ofs, &len)) { + if (tun_metadata_valid(metadata + ofs, len, ofs)) { + nxm_put_variable_len(b, MFF_TUN_METADATA, oxm, + metadata + ofs, mask + ofs, len); + } + ofs += len; + } + ovs_assert(ofs <= TUN_METADATA_LEN); +} + /* Appends to 'b' the nx_match format that expresses 'match'. For Flow Mod and * Flow Stats Requests messages, a 'cookie' and 'cookie_mask' may be supplied. * Otherwise, 'cookie_mask' should be zero. 
@@ -817,7 +859,7 @@ nx_put_raw(struct ofpbuf *b, enum ofp_version oxm, const struct match *match, int match_len; int i; - BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); + BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); /* Metadata. */ if (match->wc.masks.dp_hash) { @@ -926,6 +968,7 @@ nx_put_raw(struct ofpbuf *b, enum ofp_version oxm, const struct match *match, flow->tunnel.ip_src, match->wc.masks.tunnel.ip_src); nxm_put_32m(b, MFF_TUN_DST, oxm, flow->tunnel.ip_dst, match->wc.masks.tunnel.ip_dst); + nxm_put_tun_metadata(b, oxm, match); /* Registers. */ if (oxm < OFP15_VERSION) { @@ -1778,7 +1821,12 @@ nxm_field_by_header(uint64_t header) const struct nxm_field_index *nfi; nxm_init(); - if (nxm_hasmask(header)) { + /* tun metadata is variable length and we need to correct + * the length before we can do a hash lookup */ + if (nxm_class(header) == 1 && nxm_field(header) == 38) { + header = NXM_HEADER(nxm_vendor(header), nxm_class(header), + nxm_field(header), 0, TUN_METADATA_LEN - 1); + } else if (nxm_hasmask(header)) { header = nxm_make_exact_header(header); } diff --git a/lib/odp-util.c b/lib/odp-util.c index 37d73a9..37e2ec3 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -38,6 +38,7 @@ #include "unaligned.h" #include "util.h" #include "openvswitch/vlog.h" +#include "tun-metadata.h" VLOG_DEFINE_THIS_MODULE(odp_util); @@ -1271,7 +1272,7 @@ tunnel_key_attr_len(int type) } #define GENEVE_OPT(class, type) ((OVS_FORCE uint32_t)(class) << 8 | (type)) -static int +static int OVS_UNUSED parse_geneve_opts(const struct nlattr *attr) { int opts_len = nl_attr_get_size(attr); @@ -1354,12 +1355,11 @@ odp_tun_key_from_attr(const struct nlattr *attr, struct flow_tnl *tun) tun->flags |= FLOW_TNL_F_OAM; break; case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS: { - if (parse_geneve_opts(a)) { - return ODP_FIT_ERROR; - } /* It is necessary to reproduce options exactly (including order) * so it's easiest to just echo them back. 
*/ - unknown = true; + if (geneve_nlattr_to_tun_metadata(a, tun->metadata)) { + return ODP_FIT_ERROR; + } break; } default: @@ -1383,6 +1383,8 @@ static void tun_key_to_attr(struct ofpbuf *a, const struct flow_tnl *tun_key) { size_t tun_key_ofs; + uint8_t tun_metadata[TUN_METADATA_LEN]; + int len; tun_key_ofs = nl_msg_start_nested(a, OVS_KEY_ATTR_TUNNEL); @@ -1415,7 +1417,13 @@ tun_key_to_attr(struct ofpbuf *a, const struct flow_tnl *tun_key) if (tun_key->flags & FLOW_TNL_F_OAM) { nl_msg_put_flag(a, OVS_TUNNEL_KEY_ATTR_OAM); } - + if (!is_all_zeros(tun_key->metadata, TUN_METADATA_LEN)) { + len = tun_metadata_to_geneve_nlattr(tun_key->metadata, tun_metadata); + if (len) { + nl_msg_put_unspec(a, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS, + tun_metadata, len); + } + } nl_msg_end_nested(a, tun_key_ofs); } diff --git a/lib/odp-util.h b/lib/odp-util.h index 178fa11..08dc6cf 100644 --- a/lib/odp-util.h +++ b/lib/odp-util.h @@ -133,7 +133,7 @@ void odp_portno_names_destroy(struct hmap *portno_names); * add another field and forget to adjust this value. */ #define ODPUTIL_FLOW_KEY_BYTES 512 -BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); +BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); /* A buffer with sufficient size and alignment to hold an nlattr-formatted flow * key. An array of "struct nlattr" might not, in theory, be sufficiently diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c index e694fd9..70ecf92 100644 --- a/lib/ofp-actions.c +++ b/lib/ofp-actions.c @@ -284,6 +284,9 @@ enum ofp_raw_action_type { /* NX1.0+(34): struct nx_action_conjunction. */ NXAST_RAW_CONJUNCTION, + + /* NX1.0+(35): struct nx_action_tun_metadata, ... */ + NXAST_RAW_TUN_METADATA, }; /* OpenFlow actions are always a multiple of 8 bytes in length. 
*/ @@ -2482,7 +2485,7 @@ set_field_parse__(char *arg, struct ofpbuf *ofpacts, } sf->field = mf; delim[0] = '\0'; - error = mf_parse(mf, value, &sf->value, &sf->mask); + error = mf_parse(mf, value, &sf->value, &sf->mask, NULL); if (error) { return error; } @@ -2692,6 +2695,72 @@ format_STACK_POP(const struct ofpact_stack *a, struct ds *s) nxm_format_stack_pop(a, s); } +/* Action structure for NXAST_TUN_METADATA. */ +struct nx_action_tun_metadata { + ovs_be16 type; + ovs_be16 len; + ovs_be32 vendor; + ovs_be16 subtype; + uint8_t data_len; + uint8_t zero[4]; + uint8_t data[256]; +}; +OFP_ASSERT(sizeof(struct nx_action_tun_metadata) == 272); + +static enum ofperr +decode_NXAST_RAW_TUN_METADATA(const struct nx_action_tun_metadata *natm, + struct ofpbuf *out) +{ + struct ofpact_tun_metadata *otm; + otm = ofpact_put_TUN_METADATA(out); + otm->data_len = natm->data_len; + memcpy(otm->data, natm->data, natm->data_len); + return 0; +} + +static void +encode_TUN_METADATA(const struct ofpact_tun_metadata *otm, + enum ofp_version ofp_version OVS_UNUSED, struct ofpbuf *out) +{ + struct nx_action_tun_metadata *natm = put_NXAST_TUN_METADATA(out); + natm->data_len = otm->data_len; + memcpy(natm->data, otm->data, otm->data_len); +} + +static char * +parse_TUN_METADATA(char *arg, struct ofpbuf *ofpacts, + enum ofputil_protocol *usable_protocols OVS_UNUSED) +{ + struct ofpact_tun_metadata *otm; + const struct mf_field *mf = mf_from_name("tun_metadata"); + union mf_value metadata, mask; + char *error; + int len; + + error = mf_parse(mf, arg, &metadata, &mask, &len); + if (error) { + return error; + } + otm = ofpact_put_TUN_METADATA(ofpacts); + otm->data_len = len; + memcpy(otm->data, metadata.tun_metadata, len); + return NULL; +} + +static void +format_TUN_METADATA(const struct ofpact_tun_metadata *otm, + struct ds *s) +{ + int i; + if (otm->data_len) { + ds_put_format(s, "tun_metadata="); + for (i = 0; i < otm->data_len; i++) { + ds_put_format(s, "%02"SCNx8, otm->data[i]); + } + } +} + 
+ /* Action structure for NXAST_DEC_TTL_CNT_IDS. * * If the packet is not IPv4 or IPv6, does nothing. For IPv4 or IPv6, if the @@ -4747,6 +4816,7 @@ ofpact_is_set_or_move_action(const struct ofpact *a) case OFPACT_STRIP_VLAN: case OFPACT_WRITE_ACTIONS: case OFPACT_WRITE_METADATA: + case OFPACT_TUN_METADATA: return false; default: OVS_NOT_REACHED(); @@ -4806,6 +4876,7 @@ ofpact_is_allowed_in_actions_set(const struct ofpact *a) case OFPACT_SAMPLE: case OFPACT_STACK_POP: case OFPACT_STACK_PUSH: + case OFPACT_TUN_METADATA: /* The action set may only include actions and thus * may not include any instructions */ @@ -5018,6 +5089,7 @@ ovs_instruction_type_from_ofpact_type(enum ofpact_type type) case OFPACT_NOTE: case OFPACT_EXIT: case OFPACT_SAMPLE: + case OFPACT_TUN_METADATA: default: return OVSINST_OFPIT11_APPLY_ACTIONS; } @@ -5607,6 +5679,9 @@ ofpact_check__(enum ofputil_protocol *usable_protocols, struct ofpact *a, case OFPACT_GROUP: return 0; + case OFPACT_TUN_METADATA: + return 0; + default: OVS_NOT_REACHED(); } @@ -6006,6 +6081,7 @@ ofpact_outputs_to_port(const struct ofpact *ofpact, ofp_port_t port) case OFPACT_GOTO_TABLE: case OFPACT_METER: case OFPACT_GROUP: + case OFPACT_TUN_METADATA: default: return false; } diff --git a/lib/ofp-actions.h b/lib/ofp-actions.h index a1a5bb1..29ef969 100644 --- a/lib/ofp-actions.h +++ b/lib/ofp-actions.h @@ -92,6 +92,7 @@ OFPACT(SET_QUEUE, ofpact_queue, ofpact, "set_queue") \ OFPACT(POP_QUEUE, ofpact_null, ofpact, "pop_queue") \ OFPACT(FIN_TIMEOUT, ofpact_fin_timeout, ofpact, "fin_timeout") \ + OFPACT(TUN_METADATA, ofpact_tun_metadata,ofpact, "tun_metadata") \ \ /* Flow table interaction. */ \ OFPACT(RESUBMIT, ofpact_resubmit, ofpact, "resubmit") \ @@ -435,6 +436,15 @@ struct ofpact_fin_timeout { uint16_t fin_hard_timeout; }; +/* OFPACT_TUN_METADATA. + * + * Used for NXAST_TUN_METADATA. */ +struct ofpact_tun_metadata { + struct ofpact ofpact; + uint8_t data_len; + uint8_t data[255]; +}; + /* OFPACT_WRITE_METADATA. 
* * Used for NXAST_WRITE_METADATA. */ diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c index 9acf6a4..dfded7a 100644 --- a/lib/ofp-parse.c +++ b/lib/ofp-parse.c @@ -218,10 +218,11 @@ parse_field(const struct mf_field *mf, const char *s, struct match *match, { union mf_value value, mask; char *error; + int len; - error = mf_parse(mf, s, &value, &mask); + error = mf_parse(mf, s, &value, &mask, &len); if (!error) { - *usable_protocols &= mf_set(mf, &value, &mask, match); + *usable_protocols &= mf_set(mf, &value, &mask, match, len); } return error; } @@ -1040,7 +1041,7 @@ parse_ofp_exact_flow(struct flow *flow, struct flow *mask, const char *s, goto exit; } - if (!mf_is_zero(mf, flow)) { + if (mf->id != MFF_TUN_METADATA && !mf_is_zero(mf, flow)) { error = xasprintf("%s: field %s set multiple times", s, key); goto exit; } diff --git a/lib/ofp-util.c b/lib/ofp-util.c index 2270a93..6a89e5e 100644 --- a/lib/ofp-util.c +++ b/lib/ofp-util.c @@ -186,7 +186,7 @@ ofputil_netmask_to_wcbits(ovs_be32 netmask) void ofputil_wildcard_from_ofpfw10(uint32_t ofpfw, struct flow_wildcards *wc) { - BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); + BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); /* Initialize most of wc. */ flow_wildcards_init_catchall(wc); diff --git a/lib/packets.h b/lib/packets.h index 0fb0352..be44aa8 100644 --- a/lib/packets.h +++ b/lib/packets.h @@ -31,6 +31,8 @@ struct ofpbuf; struct ds; +#define TUN_METADATA_LEN 256 + /* Tunnel information used in flow key and metadata. */ struct flow_tnl { ovs_be64 tun_id; @@ -41,6 +43,7 @@ struct flow_tnl { uint8_t ip_ttl; ovs_be16 tp_src; ovs_be16 tp_dst; + uint8_t metadata[TUN_METADATA_LEN]; }; /* Unfortunately, a "struct flow" sometimes has to handle OpenFlow port diff --git a/lib/tun-metadata.c b/lib/tun-metadata.c new file mode 100644 index 0000000..3748bdf --- /dev/null +++ b/lib/tun-metadata.c @@ -0,0 +1,384 @@ +/* + * Copyright (c) 2015 Cisco systems, Inc. 
+ * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include <config.h> +#include "cmap.h" +#include "compiler.h" +#include "dynamic-string.h" +#include "match.h" +#include "ovs-thread.h" +#include "packets.h" +#include "nx-match.h" +#include "tun-metadata.h" +#include "unixctl.h" +#include "openvswitch/vlog.h" +#include <errno.h> + +VLOG_DEFINE_THIS_MODULE(tun_metadata); +static bool initialized; + +/* table used to map tunnel metadata to a particular offset within + * flow_tnl.metadata */ +struct tun_meta_table { + struct cmap key_cmap; /* lookup based on key */ + struct cmap ofs_cmap; /* lookup based on offset */ + struct ovs_mutex mutex; /* protect simultaneous writers */ + int next_ofs; /* next offset in flow_tnl.metadata */ + int count; /* number of entries in table */ +}; + +struct tun_meta_entry { + struct cmap_node key_node; /* node in key_cmap */ + struct cmap_node ofs_node; /* node in ofs_cmap */ + uint32_t key; /* unique key */ + uint16_t len; /* len of this metadata */ + uint16_t ofs; /* offset in flow_tnl.metadata */ +}; + +static struct tun_meta_table tun_meta_table; + +static inline uint32_t tun_meta_hash(uint32_t key) +{ + return hash_int(key, 0); +} + +static struct tun_meta_entry * +tun_meta_find_key(const struct tun_meta_table *table, uint32_t key) +{ + struct tun_meta_entry *entry; + const struct cmap *key_cmap = &table->key_cmap; + CMAP_FOR_EACH_WITH_HASH (entry, key_node, tun_meta_hash(key), key_cmap) { + if (entry->key == key) { 
+ return entry; + } + } + return NULL; +} + +static struct tun_meta_entry * +tun_meta_find_ofs(const struct tun_meta_table *table, uint16_t ofs) +{ + struct tun_meta_entry *entry; + const struct cmap *ofs_cmap = &table->ofs_cmap; + CMAP_FOR_EACH_WITH_HASH (entry, ofs_node, tun_meta_hash(ofs), ofs_cmap) { + if (entry->ofs == ofs) { + return entry; + } + } + return NULL; +} + +/* callers responsibility to verify entry->len is same as len. + * returns an entry or null. */ +static struct tun_meta_entry * +tun_meta_add(struct tun_meta_table *table, uint32_t key, uint16_t len) +{ + struct tun_meta_entry *entry = tun_meta_find_key(table, key); + if (entry == NULL) { + ovs_mutex_lock(&table->mutex); + if (table->next_ofs + len <= TUN_METADATA_LEN) { + entry = xmalloc(sizeof *entry); + entry->key = key; + entry->len = len; + entry->ofs = table->next_ofs; + table->next_ofs += len; + cmap_insert(&table->key_cmap, &entry->key_node, + tun_meta_hash(key)); + cmap_insert(&table->ofs_cmap, &entry->ofs_node, + tun_meta_hash(entry->ofs)); + table->count++; + } + ovs_mutex_unlock(&table->mutex); + } + return entry; +} + +static void +tun_meta_remove_all(struct unixctl_conn *conn, int argc OVS_UNUSED, + const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED) +{ + struct tun_meta_table *table = &tun_meta_table; + struct tun_meta_entry *entry; + + if (!initialized) { + tun_meta_init(); + return; + } + + ovs_mutex_lock(&table->mutex); + CMAP_FOR_EACH(entry, key_node, &table->key_cmap) { + cmap_remove(&table->key_cmap, &entry->key_node, + tun_meta_hash(entry->key)); + cmap_remove(&table->ofs_cmap, &entry->ofs_node, + tun_meta_hash(entry->ofs)); + table->count--; + ovsrcu_postpone(free, entry); + } + ovs_assert(table->count == 0); + table->next_ofs = 0; + ovs_mutex_unlock(&table->mutex); + unixctl_command_reply(conn, "OK"); +} + +static void +tun_meta_print_keys(struct unixctl_conn *conn, int argc OVS_UNUSED, + const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED) +{ + struct ds ds = 
DS_EMPTY_INITIALIZER; + const struct tun_meta_table *table = &tun_meta_table; + struct cmap_cursor cursor; + struct tun_meta_entry *entry; + const struct cmap *key_cmap = &table->key_cmap; + ds_put_cstr(&ds," Key Len Offset\n"); + ds_put_cstr(&ds,"==========================\n"); + CMAP_CURSOR_FOR_EACH(entry, key_node, &cursor, key_cmap) { + ds_put_format(&ds, "%8x %8d %8d\n", entry->key, entry->len, entry->ofs); + } + unixctl_command_reply(conn, ds_cstr(&ds)); + ds_destroy(&ds); +} + +void +tun_meta_init(void) +{ + struct tun_meta_table *table = &tun_meta_table; + cmap_init(&table->key_cmap); + cmap_init(&table->ofs_cmap); + ovs_mutex_init(&table->mutex); + table->next_ofs = 0; + table->count = 0; + unixctl_command_register("tnl/meta/show", "", 0, 0, tun_meta_print_keys, NULL); + unixctl_command_register("tnl/meta/flush", "", 0, 0, tun_meta_remove_all, NULL); + initialized = true; +} + +void +tun_meta_destroy(void) +{ + struct tun_meta_table *table = &tun_meta_table; + struct tun_meta_entry *entry; + + if (!initialized) { + return; + } + + ovs_mutex_lock(&table->mutex); + CMAP_FOR_EACH(entry, key_node, &table->key_cmap) { + cmap_remove(&table->key_cmap, &entry->key_node, + tun_meta_hash(entry->key)); + cmap_remove(&table->ofs_cmap, &entry->ofs_node, + tun_meta_hash(entry->ofs)); + table->count--; + ovsrcu_postpone(free, entry); + } + ovs_assert(table->count == 0); + table->next_ofs = 0; + cmap_destroy(&table->key_cmap); + cmap_destroy(&table->ofs_cmap); + ovs_mutex_unlock(&table->mutex); + ovs_mutex_destroy(&table->mutex); + initialized = false; +} + +#define TUN_META_KEY(metadata) \ + ((metadata)[0] << 16 | (metadata)[1] << 8 | (metadata)[2]) + +static struct tun_meta_entry * +find_or_add_tun_meta_entry(const uint8_t metadata[TUN_METADATA_LEN], int len) +{ + uint32_t key = TUN_META_KEY(metadata); + struct tun_meta_entry *e; + + if (!initialized) { + tun_meta_init(); + } + + e = tun_meta_add(&tun_meta_table, key, len); + if (e && (e->len != len)) { + 
VLOG_ERR("duplicate metadata (key %x, len %d), new len %d", + key, e->len, len); + return NULL; + } else { + return e; + } +} + +static struct tun_meta_entry * +find_tun_meta_entry(const uint8_t metadata[TUN_METADATA_LEN]) +{ + uint32_t key = TUN_META_KEY(metadata); + + if (!initialized) { + tun_meta_init(); + return NULL; + } else { + return tun_meta_find_key(&tun_meta_table, key); + } +} + +static struct tun_meta_entry * +find_tun_meta_ofs(uint16_t ofs) +{ + if (!initialized) { + tun_meta_init(); + return NULL; + } else { + return tun_meta_find_ofs(&tun_meta_table, ofs); + } +} + +bool +tun_metadata_get_lenofs(const uint8_t metadata[TUN_METADATA_LEN], + uint16_t *len, uint16_t *ofs) +{ + const struct tun_meta_entry *e = find_tun_meta_entry(metadata); + if (e) { + *len = e->len; + *ofs = e->ofs; + return true; + } else { + return false; + } +} + +bool +tun_metadata_get_len(uint16_t ofs, uint16_t *len) +{ + const struct tun_meta_entry *e = find_tun_meta_ofs(ofs); + if (e) { + *len = e->len; + return true; + } else { + return false; + } +} + +bool +tun_metadata_valid(const uint8_t metadata[TUN_METADATA_LEN], + uint16_t len, uint16_t ofs) +{ + const struct tun_meta_entry *e = find_tun_meta_entry(metadata); + return (e && e->len == len && e->ofs == ofs) ? true : false; +} + +/* copies a single tun_meta entry at the correct offset in + * metadata and updates the map if needed. */ +void match_set_tun_metadata(struct match *match, + const uint8_t metadata[TUN_METADATA_LEN], + int len) +{ + struct tun_meta_entry *entry = find_or_add_tun_meta_entry(metadata, len); + if (entry) { + uint16_t ofs = entry->ofs; + uint16_t len = entry->len; + memcpy(match->flow.tunnel.metadata + ofs, metadata, len); + memset(match->wc.masks.tunnel.metadata + ofs, 0xff, len); + } +} + +/* copies a single tun_meta entry at the correct offset in + * metadata and updates the map if needed. 
*/ +void match_set_tun_metadata_masked(struct match *match, + const uint8_t metadata[TUN_METADATA_LEN], + const uint8_t mask[TUN_METADATA_LEN], + int len) +{ + struct tun_meta_entry *entry = find_or_add_tun_meta_entry(metadata, len); + if (entry) { + uint16_t ofs = entry->ofs; + uint16_t len = entry->len; + size_t i; + for (i = 0; i < len; i++) { + match->flow.tunnel.metadata[i + ofs] = metadata[i] & mask[i]; + match->wc.masks.tunnel.metadata[i + ofs] = mask[i]; + } + } +} + +/* xlate actions use this api to get an offset into flow.tun_metadata */ +int action_find_or_add_tun_meta_entry(const uint8_t metadata[TUN_METADATA_LEN], + int len) +{ + struct tun_meta_entry *entry = find_or_add_tun_meta_entry(metadata, len); + if (entry) { + return entry->ofs; + } else { + return -1; + } +} + +int +geneve_nlattr_to_tun_metadata(const struct nlattr *attr, + uint8_t metadata[TUN_METADATA_LEN]) +{ + int opts_len = nl_attr_get_size(attr); + const struct geneve_opt *opt = nl_attr_get(attr); + + while (opts_len > 0) { + int len; + struct tun_meta_entry *entry; + + if (opts_len < sizeof(*opt)) { + return -EINVAL; + } + + len = sizeof(*opt) + opt->length * 4; + if (len > opts_len) { + return -EINVAL; + } + + entry = find_tun_meta_entry((uint8_t *)opt); + if (entry && entry->len == (len - 1)) { + memcpy(metadata + entry->ofs, opt, 3); + memcpy(metadata + entry->ofs + 3, opt->opt_data, entry->len - 3); + } else if (opt->type & GENEVE_CRIT_OPT_TYPE) { + return -EINVAL; + } + + opt = opt + len / sizeof(*opt); + opts_len -= len; + }; + + return 0; +} + +int tun_metadata_to_geneve_nlattr(const uint8_t metadata[TUN_METADATA_LEN], + uint8_t nlattr[TUN_METADATA_LEN]) +{ + uint16_t len, mofs = 0, nofs = 0; + struct geneve_opt *opt; + + while (tun_metadata_get_len(mofs, &len)) { + if (tun_metadata_valid(metadata + mofs, len, mofs)) { + if ((len - 3) % 4) { + mofs += len; + ovs_assert(mofs < TUN_METADATA_LEN); + continue; + } + opt = (struct geneve_opt *)(nlattr + nofs); + memcpy(nlattr + nofs, 
metadata + mofs, 3); + opt->length = (len - 3) / 4; + opt->r1 = 0; + opt->r2 = 0; + opt->r3 = 0; + memcpy(opt->opt_data, metadata + mofs + 3, len - 3); + nofs += (len + 1); + } + mofs += len; + ovs_assert(mofs < TUN_METADATA_LEN); + } + return nofs; +} diff --git a/lib/tun-metadata.h b/lib/tun-metadata.h new file mode 100644 index 0000000..4a58232 --- /dev/null +++ b/lib/tun-metadata.h @@ -0,0 +1,44 @@ +/* + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#ifndef TUN_METADATA_H +#define TUN_METADATA_H + +#include "netlink.h" +#include "match.h" +#include "ofp-actions.h" + +void tun_meta_init(void); +void tun_meta_destroy(void); +bool tun_metadata_get_lenofs(const uint8_t metadata[TUN_METADATA_LEN], + uint16_t *len, uint16_t *ofs); +bool tun_metadata_get_len(uint16_t ofs, uint16_t *len); +bool tun_metadata_valid(const uint8_t metadata[TUN_METADATA_LEN], + uint16_t len, uint16_t ofs); +void match_set_tun_metadata(struct match *match, + const uint8_t metadata[TUN_METADATA_LEN], + int len); +void match_set_tun_metadata_masked(struct match *match, + const uint8_t metadata[TUN_METADATA_LEN], + const uint8_t mask[TUN_METADATA_LEN], + int len); +int action_find_or_add_tun_meta_entry(const uint8_t metadata[TUN_METADATA_LEN], + int len); +int geneve_nlattr_to_tun_metadata(const struct nlattr *attr, + uint8_t metadata[TUN_METADATA_LEN]); +int tun_metadata_to_geneve_nlattr(const uint8_t metadata[TUN_METADATA_LEN], + uint8_t nlattr[TUN_METADATA_LEN]); +#endif /* tun-metadata.h */ diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 0786513..5eb8f74 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -56,6 +56,7 @@ #include "ovs-router.h" #include "tnl-ports.h" #include "tunnel.h" +#include "tun-metadata.h" #include "openvswitch/vlog.h" COVERAGE_DEFINE(xlate_actions); @@ -2635,7 +2636,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, /* If 'struct flow' gets additional metadata, we'll need to zero it out * before traversing a patch port. 
*/ - BUILD_ASSERT_DECL(FLOW_WC_SEQ == 30); + BUILD_ASSERT_DECL(FLOW_WC_SEQ == 31); memset(&flow_tnl, 0, sizeof flow_tnl); if (!xport) { @@ -3751,6 +3752,7 @@ ofpact_needs_recirculation_after_mpls(const struct ofpact *a, struct xlate_ctx * case OFPACT_WRITE_ACTIONS: case OFPACT_CLEAR_ACTIONS: case OFPACT_SAMPLE: + case OFPACT_TUN_METADATA: return false; case OFPACT_POP_MPLS: @@ -4113,6 +4115,17 @@ do_xlate_actions(const struct ofpact *ofpacts, size_t ofpacts_len, case OFPACT_SAMPLE: xlate_sample_action(ctx, ofpact_get_SAMPLE(a)); break; + + case OFPACT_TUN_METADATA: { + struct ofpact_tun_metadata *otm = ofpact_get_TUN_METADATA(a); + int ofs = action_find_or_add_tun_meta_entry(otm->data, + otm->data_len); + if (ofs != -1) { + memcpy(flow->tunnel.metadata + ofs, otm->data, otm->data_len); + memset(wc->masks.tunnel.metadata + ofs, 0xff, otm->data_len); + } + break; + } } } } diff --git a/tests/ofproto.at b/tests/ofproto.at index 2a2111f..d47825b 100644 --- a/tests/ofproto.at +++ b/tests/ofproto.at @@ -1480,7 +1480,7 @@ OVS_VSWITCHD_START instructions: meter,apply_actions,clear_actions,write_actions,write_metadata$goto Write-Actions and Apply-Actions features: actions: output group set_field strip_vlan push_vlan mod_nw_ttl dec_ttl set_mpls_ttl dec_mpls_ttl push_mpls pop_mpls set_queue - supported on Set-Field: tun_id tun_src tun_dst metadata in_port in_port_oxm pkt_mark reg0 reg1 reg2 reg3 reg4 reg5 reg6 reg7 xreg0 xreg1 xreg2 xreg3 eth_src eth_dst vlan_tci vlan_vid vlan_pcp mpls_label mpls_tc ip_src ip_dst ipv6_src ipv6_dst ipv6_label nw_tos ip_dscp nw_ecn nw_ttl arp_op arp_spa arp_tpa arp_sha arp_tha tcp_src tcp_dst udp_src udp_dst sctp_src sctp_dst nd_target nd_sll nd_tll + supported on Set-Field: tun_id tun_src tun_dst metadata in_port in_port_oxm pkt_mark reg0 reg1 reg2 reg3 reg4 reg5 reg6 reg7 xreg0 xreg1 xreg2 xreg3 eth_src eth_dst vlan_tci vlan_vid vlan_pcp mpls_label mpls_tc ip_src ip_dst ipv6_src ipv6_dst ipv6_label nw_tos ip_dscp nw_ecn nw_ttl arp_op arp_spa 
arp_tpa arp_sha arp_tha tcp_src tcp_dst udp_src udp_dst sctp_src sctp_dst nd_target nd_sll nd_tll tun_metadata matching: dp_hash: arbitrary mask recirc_id: exact match or wildcard @@ -1543,7 +1543,8 @@ OVS_VSWITCHD_START icmpv6_code: exact match or wildcard nd_target: arbitrary mask nd_sll: arbitrary mask - nd_tll: arbitrary mask" + nd_tll: arbitrary mask + tun_metadata: arbitrary mask" x=$y name=table$x done) > expout @@ -1562,7 +1563,7 @@ AT_CHECK( # Check that the configuration was updated. mv expout orig-expout sed 's/classifier/main/ -75s/1000000/1024/' < orig-expout > expout +76s/1000000/1024/' < orig-expout > expout AT_CHECK([ovs-ofctl -O OpenFlow13 dump-table-features br0 | sed '/^$/d /^OFPST_TABLE_FEATURES/d'], [0], [expout]) OVS_VSWITCHD_STOP -- 1.7.9.5 _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
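For review purposes, the two-table scheme from the commit message (key -> (len, offset) and offset -> entry) can be reduced to a small standalone sketch. This is not the code in lib/tun-metadata.c: it swaps the cmaps, the mutex and RCU for a plain array, and the names META_LEN, MAX_ENTRIES, learn() and count_present() are hypothetical, chosen only for this illustration. It assumes, as the patch does, that the key is the first three bytes of an option (class and type) and that the stored length includes those three bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define META_LEN 256     /* mirrors TUN_METADATA_LEN in the patch */
#define MAX_ENTRIES 64   /* hypothetical cap, only for this sketch */

/* One learned option: key is (class << 8 | type), len includes the
 * 3-byte class/type header, ofs is the slot in the metadata buffer. */
struct entry { uint32_t key; uint16_t len; uint16_t ofs; };

static struct entry entries[MAX_ENTRIES];
static int n_entries;
static int next_ofs;

static uint32_t opt_key(const uint8_t *opt)
{
    return (uint32_t) opt[0] << 16 | (uint32_t) opt[1] << 8 | opt[2];
}

/* Table 1: (class, type) -> (len, ofs). */
static const struct entry *find_key(uint32_t key)
{
    for (int i = 0; i < n_entries; i++) {
        if (entries[i].key == key) {
            return &entries[i];
        }
    }
    return NULL;
}

/* Table 2: ofs -> entry, used to step over holes in the buffer. */
static const struct entry *find_ofs(uint16_t ofs)
{
    for (int i = 0; i < n_entries; i++) {
        if (entries[i].ofs == ofs) {
            return &entries[i];
        }
    }
    return NULL;
}

/* Returns the offset assigned to the option, or -1 if the key is already
 * known with a different length or the option space is exhausted. */
static int learn(const uint8_t *opt, uint16_t len)
{
    uint32_t key = opt_key(opt);
    const struct entry *e = find_key(key);

    if (e) {
        return e->len == len ? e->ofs : -1;
    }
    if (len < 3 || n_entries == MAX_ENTRIES || next_ofs + len > META_LEN) {
        return -1;
    }
    entries[n_entries] = (struct entry) { key, len, (uint16_t) next_ofs };
    next_ofs += len;
    return entries[n_entries++].ofs;
}

/* Walks the buffer slot by slot and counts the slots whose first three
 * bytes match the key learned for that offset. Slots that do not match
 * are treated as holes and skipped using the learned length. */
static int count_present(const uint8_t meta[META_LEN])
{
    const struct entry *e;
    uint16_t ofs = 0;
    int n = 0;

    while ((e = find_ofs(ofs)) != NULL) {
        assert(e->ofs + e->len <= META_LEN);
        if (opt_key(meta + ofs) == e->key) {
            n++;
        }
        ofs += e->len;
    }
    return n;
}
```

With the two options from the ovs-ofctl example in the commit message, the first (11 bytes) would land at offset 0 and the second (7 bytes) at offset 11. A flow that carries only the second option leaves a hole at offset 0, which the walk skips by looking up the length learned for that offset. One limitation of the sketch is that an option whose class and type are all zero would be indistinguishable from a hole.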