OpenFlow 1.4 introduces the ability to turn on flow table eviction with an OFPT_TABLE_MOD message specifying OFPTC_EVICTION. It also adds related machinery to other messages that mention OFPTC_* fields. This commit adds support for the new feature, implementing it as a second, parallel way to enable flow table eviction, alongside the existing configuration through the client (vswitchd). It takes more work than one would expect because the treatment of OFPTC_* flags has changed inconsistently over the evolution of OpenFlow; please refer to the explanation in DESIGN.md for more information.
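As a rough illustration of the "second, parallel way" idea, the sketch below models the two enable sources as independent bits, so that eviction stays on as long as either source wants it. The bit names EVICTION_CLIENT and EVICTION_OPENFLOW come from this patch; the helper functions eviction_set_source() and eviction_is_enabled() are hypothetical, written only for this example, and are not part of ofproto.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit names as in the patch's ofproto-provider.h. */
#define EVICTION_CLIENT   (1u << 0)  /* Client (vswitchd) enabled eviction. */
#define EVICTION_OPENFLOW (1u << 1)  /* OpenFlow controller enabled eviction. */

/* Hypothetical helper (not in the patch): returns 'mask' with the single
 * 'source' bit set if 'enable' is true, cleared otherwise.  The other
 * source's bit is left untouched. */
static unsigned int
eviction_set_source(unsigned int mask, unsigned int source, bool enable)
{
    assert(source == EVICTION_CLIENT || source == EVICTION_OPENFLOW);
    return enable ? (mask | source) : (mask & ~source);
}

/* Hypothetical helper (not in the patch): eviction is active if at least one
 * source has enabled it, that is, if the mask is nonzero. */
static bool
eviction_is_enabled(unsigned int mask)
{
    return mask != 0;
}
```

With this model, a controller that sends "noevict" clears only EVICTION_OPENFLOW, so a table whose eviction was also enabled by the client keeps evicting until the client disables it as well.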
This commit also adds related support to ovs-ofctl, plus tests. Signed-off-by: Ben Pfaff <b...@nicira.com> Co-authored-by: Saloni Jain <saloni.j...@tcs.com> Signed-off-by: Saloni Jain <saloni.j...@tcs.com> --- DESIGN.md | 85 +++++++++++++- NEWS | 1 + include/openflow/openflow-1.3.h | 9 +- lib/ofp-parse.c | 52 ++++++--- lib/ofp-parse.h | 2 +- lib/ofp-print.c | 90 ++++++++++---- lib/ofp-util.c | 220 ++++++++++++++++++++++++++--------- lib/ofp-util.h | 52 ++++++++- ofproto/ofproto-provider.h | 9 +- ofproto/ofproto.c | 164 ++++++++++++++++---------- ofproto/ofproto.h | 18 +-- tests/ofp-print.at | 2 +- tests/ofproto.at | 23 +--- utilities/ovs-ofctl.8.in | 29 ++--- utilities/ovs-ofctl.c | 34 +++--- vswitchd/bridge.c | 7 +- vswitchd/vswitch.xml | 252 ++++++++++++++++++++++------------------ 17 files changed, 704 insertions(+), 345 deletions(-) diff --git a/DESIGN.md b/DESIGN.md index e533b7c..38413d7 100644 --- a/DESIGN.md +++ b/DESIGN.md @@ -277,13 +277,19 @@ The table for 1.3 is the same as the one shown above for 1.2. OpenFlow 1.4 ------------- +----------- + +OpenFlow 1.4 makes these changes: + + - Adds the "importance" field to flow_mods, but it does not + explicitly specify which kinds of flow_mods set the importance. + For consistency, Open vSwitch uses the same rule for importance + as for idle_timeout and hard_timeout, that is, only an "ADD" + flow_mod sets the importance. (This issue has been filed with + the ONF as EXT-496.) -OpenFlow 1.4 adds the "importance" field to flow_mods, but it does not -explicitly specify which kinds of flow_mods set the importance. For -consistency, Open vSwitch uses the same rule for importance as for -idle_timeout and hard_timeout, that is, only an "ADD" flow_mod sets -the importance. (This issue has been filed with the ONF as EXT-496.) + - Eviction Mechanism to automatically delete entries of lower + importance to make space for newer entries. 
OpenFlow 1.4 Bundles @@ -606,6 +612,73 @@ Tables 128 and above are reserved for use by the switch itself. Controllers should use only tables 0 through 127. +OFPTC_* Table Configuration +=========================== + +This section covers the history of the OFPTC_* table configuration +bits across OpenFlow versions. + +OpenFlow 1.0 flow tables had fixed configurations. + +OpenFlow 1.1 enabled controllers to configure behavior upon flow table +miss and added the OFPTC_MISS_* constants for that purpose. OFPTC_* +did not control anything else but it was nevertheless conceptualized +as a set of bit-fields instead of an enum. OF1.1 added the +OFPT_TABLE_MOD message to set OFPTC_MISS_* for a flow table and added +the 'config' field to the OFPST_TABLE reply to report the current +setting. + +OpenFlow 1.2 did not change anything in this regard. + +OpenFlow 1.3 switched to another means of changing flow table miss +behavior and deprecated OFPTC_MISS_* without adding any more OFPTC_* +constants. This meant that OFPT_TABLE_MOD now had no purpose at all, +but OF1.3 kept it around "for backward compatibility with older and +newer versions of the specification." At the same time, OF1.3 +introduced a new message OFPMP_TABLE_FEATURES that included a field +'config' documented as reporting the OFPTC_* values set with +OFPT_TABLE_MOD; of course this served no real purpose because no +OFPTC_* values are defined. OF1.3 did remove the OFPTC_* field from +OFPMP_TABLE (previously named OFPST_TABLE). + +OpenFlow 1.4 defined two new OFPTC_* constants, OFPTC_EVICTION and +OFPTC_VACANCY_EVENTS, using bits that did not overlap with +OFPTC_MISS_* even though those bits had not been defined since OF1.2. +OFPT_TABLE_MOD still controlled these settings. The field for OFPTC_* +values in OFPMP_TABLE_FEATURES was renamed from 'config' to +'capabilities' and documented as reporting the flags that are +supported in an OFPT_TABLE_MOD message. 
The OFPMP_TABLE_DESC message +newly added in OF1.4 reported the OFPTC_* setting. + +OpenFlow 1.5 did not change anything in this regard. + +The following table summarizes. The columns say: + + - OpenFlow version(s). + + - The OFPTC_* flags defined in those versions. + + - Whether OFPT_TABLE_MOD can modify OFPTC_* flags. + + - Whether OFPST_TABLE/OFPMP_TABLE reports the OFPTC_* flags. + + - What OFPMP_TABLE_FEATURES reports (if it exists): either the + current configuration or the switch's capabilities. + + - Whether OFPMP_TABLE_DESC reports the current configuration. + +OpenFlow OFPTC_* flags TABLE_MOD stats? TABLE_FEATURES TABLE_DESC +--------- ----------------------- --------- ------ -------------- ---------- +OF1.0 none no[*][+] no[*] nothing[*][+] no[*][+] +OF1.1/1.2 MISS_* yes yes nothing[+] no[+] +OF1.3 none yes[*] no[*] config[*] no[*][+] +OF1.4/1.5 EVICTION/VACANCY_EVENTS yes no capabilities yes + + [*] Nothing to report/change anyway. + + [+] No such message. + + IPv6 ==== diff --git a/NEWS b/NEWS index 363bd8c..8dccdcd 100644 --- a/NEWS +++ b/NEWS @@ -7,6 +7,7 @@ Post-v2.4.0 * Group chaining (where one OpenFlow group triggers another) is now supported. * OpenFlow 1.4+ "importance" is now considered for flow eviction. + * OpenFlow 1.4+ OFPTC_EVICTION is now implemented. v2.4.0 - xx xxx xxxx diff --git a/include/openflow/openflow-1.3.h b/include/openflow/openflow-1.3.h index 142d32c..cf93429 100644 --- a/include/openflow/openflow-1.3.h +++ b/include/openflow/openflow-1.3.h @@ -232,7 +232,14 @@ struct ofp13_table_features { char name[OFP_MAX_TABLE_NAME_LEN]; ovs_be64 metadata_match; /* Bits of metadata table can match. */ ovs_be64 metadata_write; /* Bits of metadata table can write. */ - ovs_be32 config; /* Bitmap of OFPTC_* values */ + + /* In OF1.3 this field was named 'config' and it was useless because OF1.3 + * did not define any OFPTC_* bits. + * + * OF1.4 renamed this field to 'capabilities' and added OFPTC14_EVICTION + * and OFPTC14_VACANCY_EVENTS. 
*/ + ovs_be32 capabilities; /* Bitmap of OFPTC_* values */ + ovs_be32 max_entries; /* Max number of entries supported. */ /* Table Feature Property list */ diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c index 210feed..e2a06b1 100644 --- a/lib/ofp-parse.c +++ b/lib/ofp-parse.c @@ -869,20 +869,20 @@ parse_ofp_flow_mod_str(struct ofputil_flow_mod *fm, const char *string, return error; } -/* Convert 'table_id' and 'flow_miss_handling' (as described for the - * "mod-table" command in the ovs-ofctl man page) into 'tm' for sending the - * specified table_mod 'command' to a switch. +/* Convert 'table_id' and 'setting' (as described for the "mod-table" command + * in the ovs-ofctl man page) into 'tm' for sending a table_mod command to a + * switch. + * + * Stores a bitmap of the OpenFlow versions that are usable for 'tm' into + * '*usable_versions'. * * Returns NULL if successful, otherwise a malloc()'d string describing the * error. The caller is responsible for freeing the returned string. */ char * OVS_WARN_UNUSED_RESULT parse_ofp_table_mod(struct ofputil_table_mod *tm, const char *table_id, - const char *flow_miss_handling, - enum ofputil_protocol *usable_protocols) + const char *setting, uint32_t *usable_versions) { - /* Table mod requires at least OF 1.1. 
*/ - *usable_protocols = OFPUTIL_P_OF11_UP; - + *usable_versions = 0; if (!strcasecmp(table_id, "all")) { tm->table_id = OFPTT_ALL; } else { @@ -892,18 +892,38 @@ parse_ofp_table_mod(struct ofputil_table_mod *tm, const char *table_id, } } - if (strcmp(flow_miss_handling, "controller") == 0) { - tm->miss_config = OFPUTIL_TABLE_MISS_CONTROLLER; - } else if (strcmp(flow_miss_handling, "continue") == 0) { - tm->miss_config = OFPUTIL_TABLE_MISS_CONTINUE; - } else if (strcmp(flow_miss_handling, "drop") == 0) { - tm->miss_config = OFPUTIL_TABLE_MISS_DROP; + tm->miss = OFPUTIL_TABLE_MISS_DEFAULT; + tm->eviction = OFPUTIL_TABLE_EVICTION_DEFAULT; + tm->eviction_flags = UINT32_MAX; + + /* Only OpenFlow 1.1 and 1.2 can configure table-miss via table_mod. + * Only OpenFlow 1.4+ can configure eviction via table_mod. + * + * (OpenFlow 1.4+ can also configure vacancy events via table_mod, but OVS + * doesn't support those yet and they're also logically a per-OpenFlow + * session setting so it wouldn't make sense to support them here anyway.) 
+ */ + if (!strcmp(setting, "controller")) { + tm->miss = OFPUTIL_TABLE_MISS_CONTROLLER; + *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION); + } else if (!strcmp(setting, "continue")) { + tm->miss = OFPUTIL_TABLE_MISS_CONTINUE; + *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION); + } else if (!strcmp(setting, "drop")) { + tm->miss = OFPUTIL_TABLE_MISS_DROP; + *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION); + } else if (!strcmp(setting, "evict")) { + tm->eviction = OFPUTIL_TABLE_EVICTION_ON; + *usable_versions = (1 << OFP14_VERSION) | (1u << OFP15_VERSION); + } else if (!strcmp(setting, "noevict")) { + tm->eviction = OFPUTIL_TABLE_EVICTION_OFF; + *usable_versions = (1 << OFP14_VERSION) | (1u << OFP15_VERSION); } else { - return xasprintf("invalid flow_miss_handling %s", flow_miss_handling); + return xasprintf("invalid table_mod setting %s", setting); } if (tm->table_id == 0xfe - && tm->miss_config == OFPUTIL_TABLE_MISS_CONTINUE) { + && tm->miss == OFPUTIL_TABLE_MISS_CONTINUE) { return xstrdup("last table's flow miss handling can not be continue"); } diff --git a/lib/ofp-parse.h b/lib/ofp-parse.h index f112603..2eb9067 100644 --- a/lib/ofp-parse.h +++ b/lib/ofp-parse.h @@ -48,7 +48,7 @@ char *parse_ofp_flow_mod_str(struct ofputil_flow_mod *, const char *string, char *parse_ofp_table_mod(struct ofputil_table_mod *, const char *table_id, const char *flow_miss_handling, - enum ofputil_protocol *usable_protocols) + uint32_t *usable_versions) OVS_WARN_UNUSED_RESULT; char *parse_ofp_flow_mod_file(const char *file_name, int command, diff --git a/lib/ofp-print.c b/lib/ofp-print.c index 2ac11b1..1a6a9d8 100644 --- a/lib/ofp-print.c +++ b/lib/ofp-print.c @@ -942,23 +942,54 @@ ofp_print_port_mod(struct ds *string, const struct ofp_header *oh) } } -static void -ofp_print_table_miss_config(struct ds *string, enum ofputil_table_miss miss) +static const char * +ofputil_table_miss_to_string(enum ofputil_table_miss miss) { switch 
(miss) { - case OFPUTIL_TABLE_MISS_CONTROLLER: - ds_put_cstr(string, "controller\n"); - break; - case OFPUTIL_TABLE_MISS_CONTINUE: - ds_put_cstr(string, "continue\n"); - break; - case OFPUTIL_TABLE_MISS_DROP: - ds_put_cstr(string, "drop\n"); - break; - case OFPUTIL_TABLE_MISS_DEFAULT: - default: - ds_put_format(string, "Unknown (%d)\n", miss); - break; + case OFPUTIL_TABLE_MISS_DEFAULT: return "default"; + case OFPUTIL_TABLE_MISS_CONTROLLER: return "controller"; + case OFPUTIL_TABLE_MISS_CONTINUE: return "continue"; + case OFPUTIL_TABLE_MISS_DROP: return "drop"; + default: return "***error***"; + } +} + +static const char * +ofputil_table_eviction_to_string(enum ofputil_table_eviction eviction) +{ + switch (eviction) { + case OFPUTIL_TABLE_EVICTION_DEFAULT: return "default"; + case OFPUTIL_TABLE_EVICTION_ON: return "on"; + case OFPUTIL_TABLE_EVICTION_OFF: return "off"; + default: return "***error***"; + } + +} + +static const char * +ofputil_eviction_flag_to_string(uint32_t bit) +{ + enum ofp14_table_mod_prop_eviction_flag eviction_flag = bit; + + switch (eviction_flag) { + case OFPTMPEF14_OTHER: return "OTHER"; + case OFPTMPEF14_IMPORTANCE: return "IMPORTANCE"; + case OFPTMPEF14_LIFETIME: return "LIFETIME"; + } + + return NULL; +} + +/* Appends to 'string' a description of the bitmap of OFPTMPEF14_* values in + * 'eviction_flags'. 
*/ +static void +ofputil_put_eviction_flags(struct ds *string, uint32_t eviction_flags) +{ + if (eviction_flags != UINT32_MAX) { + ofp_print_bit_names(string, eviction_flags, + ofputil_eviction_flag_to_string, '|'); + } else { + ds_put_cstr(string, "(default)"); } } @@ -980,9 +1011,17 @@ ofp_print_table_mod(struct ds *string, const struct ofp_header *oh) ds_put_format(string, " table_id=%"PRIu8, pm.table_id); } - if (pm.miss_config != OFPUTIL_TABLE_MISS_DEFAULT) { - ds_put_cstr(string, ", flow_miss_config="); - ofp_print_table_miss_config(string, pm.miss_config); + if (pm.miss != OFPUTIL_TABLE_MISS_DEFAULT) { + ds_put_format(string, ", flow_miss_config=%s", + ofputil_table_miss_to_string(pm.miss)); + } + if (pm.eviction != OFPUTIL_TABLE_EVICTION_DEFAULT) { + ds_put_format(string, ", eviction=%s", + ofputil_table_eviction_to_string(pm.eviction)); + } + if (pm.eviction_flags != UINT32_MAX) { + ds_put_cstr(string, "eviction_flags="); + ofputil_put_eviction_flags(string, pm.eviction_flags); } } @@ -2500,8 +2539,19 @@ ofp_print_table_features(struct ds *s, } if (features->miss_config != OFPUTIL_TABLE_MISS_DEFAULT) { - ds_put_cstr(s, " config="); - ofp_print_table_miss_config(s, features->miss_config); + ds_put_format(s, " config=%s\n", + ofputil_table_miss_to_string(features->miss_config)); + } + + if (features->supports_eviction >= 0) { + ds_put_format(s, " eviction: %ssupported\n", + features->supports_eviction ? "" : "not "); + + } + if (features->supports_vacancy_events >= 0) { + ds_put_format(s, " vacancy events: %ssupported\n", + features->supports_vacancy_events ? "" : "not "); + } if (features->max_entries) { diff --git a/lib/ofp-util.c b/lib/ofp-util.c index c1b2394..8753a5a 100644 --- a/lib/ofp-util.c +++ b/lib/ofp-util.c @@ -52,8 +52,11 @@ VLOG_DEFINE_THIS_MODULE(ofp_util); * in the peer and so there's not much point in showing a lot of them. 
*/ static struct vlog_rate_limit bad_ofmsg_rl = VLOG_RATE_LIMIT_INIT(1, 5); -static enum ofputil_table_miss ofputil_table_miss_from_config( - ovs_be32 config_, enum ofp_version); +static enum ofputil_table_eviction ofputil_decode_table_eviction( + ovs_be32 config, enum ofp_version); +static ovs_be32 ofputil_encode_table_config(enum ofputil_table_miss, + enum ofputil_table_eviction, + enum ofp_version); struct ofp_prop_header { ovs_be16 type; @@ -4643,7 +4646,15 @@ ofputil_decode_table_features(struct ofpbuf *msg, ovs_strlcpy(tf->name, otf->name, OFP_MAX_TABLE_NAME_LEN); tf->metadata_match = otf->metadata_match; tf->metadata_write = otf->metadata_write; - tf->miss_config = ofputil_table_miss_from_config(otf->config, oh->version); + tf->miss_config = OFPUTIL_TABLE_MISS_DEFAULT; + if (oh->version >= OFP14_VERSION) { + uint32_t caps = ntohl(otf->capabilities); + tf->supports_eviction = (caps & OFPTC14_EVICTION) != 0; + tf->supports_vacancy_events = (caps & OFPTC14_VACANCY_EVENTS) != 0; + } else { + tf->supports_eviction = -1; + tf->supports_vacancy_events = -1; + } tf->max_entries = ntohl(otf->max_entries); while (properties.size > 0) { @@ -4851,7 +4862,14 @@ ofputil_append_table_features_reply(const struct ofputil_table_features *tf, ovs_strlcpy(otf->name, tf->name, sizeof otf->name); otf->metadata_match = tf->metadata_match; otf->metadata_write = tf->metadata_write; - otf->config = ofputil_table_miss_to_config(tf->miss_config, version); + if (version >= OFP14_VERSION) { + if (tf->supports_eviction) { + otf->capabilities |= htonl(OFPTC14_EVICTION); + } + if (tf->supports_vacancy_events) { + otf->capabilities |= htonl(OFPTC14_VACANCY_EVENTS); + } + } otf->max_entries = htonl(tf->max_entries); put_table_instruction_features(reply, &tf->nonmiss, 0, version); @@ -4867,17 +4885,97 @@ ofputil_append_table_features_reply(const struct ofputil_table_features *tf, ofpmp_postappend(replies, start_ofs); } -/* ofputil_table_mod */ +static enum ofperr 
+parse_table_mod_eviction_property(struct ofpbuf *property, + struct ofputil_table_mod *tm) +{ + struct ofp14_table_mod_prop_eviction *ote = property->data; + + if (property->size != sizeof *ote) { + return OFPERR_OFPBPC_BAD_LEN; + } + + tm->eviction_flags = ntohl(ote->flags); + return 0; +} + +/* Given 'config', taken from an OpenFlow 'version' message that specifies + * table configuration (a table mod, table stats, or table features message), + * returns the table eviction configuration that it specifies. + * + * Only OpenFlow 1.4 and later specify table eviction configuration this way, + * so for other 'version' values this function always returns + * OFPUTIL_TABLE_EVICTION_DEFAULT. */ +static enum ofputil_table_eviction +ofputil_decode_table_eviction(ovs_be32 config, enum ofp_version version) +{ + return (version < OFP14_VERSION ? OFPUTIL_TABLE_EVICTION_DEFAULT + : config & htonl(OFPTC14_EVICTION) ? OFPUTIL_TABLE_EVICTION_ON + : OFPUTIL_TABLE_EVICTION_OFF); +} + +/* Returns a bitmap of OFPTC* values suitable for 'config' fields in various + * OpenFlow messages of the given 'version', based on the provided 'miss' and + * 'eviction' values. */ +static ovs_be32 +ofputil_encode_table_config(enum ofputil_table_miss miss, + enum ofputil_table_eviction eviction, + enum ofp_version version) +{ + /* See the section "OFPTC_* Table Configuration" in DESIGN.md for more + * information on the crazy evolution of this field. */ + switch (version) { + case OFP10_VERSION: + /* OpenFlow 1.0 didn't have such a field, any value ought to do. */ + return htonl(0); + + case OFP11_VERSION: + case OFP12_VERSION: + /* OpenFlow 1.1 and 1.2 define only OFPTC11_TABLE_MISS_*. */ + switch (miss) { + case OFPUTIL_TABLE_MISS_DEFAULT: + /* Really this shouldn't be used for encoding (the caller should + * provide a specific value) but I can't imagine that defaulting to + * the fall-through case here will hurt. 
*/ + case OFPUTIL_TABLE_MISS_CONTROLLER: + default: + return htonl(OFPTC11_TABLE_MISS_CONTROLLER); + case OFPUTIL_TABLE_MISS_CONTINUE: + return htonl(OFPTC11_TABLE_MISS_CONTINUE); + case OFPUTIL_TABLE_MISS_DROP: + return htonl(OFPTC11_TABLE_MISS_DROP); + } + OVS_NOT_REACHED(); + + case OFP13_VERSION: + /* OpenFlow 1.3 removed OFPTC11_TABLE_MISS_* and didn't define any new + * flags, so this is correct. */ + return htonl(0); + + case OFP14_VERSION: + case OFP15_VERSION: + /* OpenFlow 1.4 introduced OFPTC14_EVICTION and OFPTC14_VACANCY_EVENTS + * and we don't support the latter yet. */ + return htonl(eviction == OFPUTIL_TABLE_EVICTION_ON + ? OFPTC14_EVICTION : 0); + } + + OVS_NOT_REACHED(); +} /* Given 'config', taken from an OpenFlow 'version' message that specifies * table configuration (a table mod, table stats, or table features message), - * returns the table miss configuration that it specifies. */ + * returns the table miss configuration that it specifies. + * + * Only OpenFlow 1.1 and 1.2 specify table miss configurations this way, so for + * other 'version' values this function always returns + * OFPUTIL_TABLE_MISS_DEFAULT. */ static enum ofputil_table_miss -ofputil_table_miss_from_config(ovs_be32 config_, enum ofp_version version) +ofputil_decode_table_miss(ovs_be32 config_, enum ofp_version version) { uint32_t config = ntohl(config_); - if (version < OFP13_VERSION) { + if (version == OFP11_VERSION || version == OFP12_VERSION) { switch (config & OFPTC11_TABLE_MISS_MASK) { case OFPTC11_TABLE_MISS_CONTROLLER: return OFPUTIL_TABLE_MISS_CONTROLLER; @@ -4897,32 +4995,6 @@ ofputil_table_miss_from_config(ovs_be32 config_, enum ofp_version version) } } -/* Given a table miss configuration, returns the corresponding OpenFlow table - * configuration for use in an OpenFlow message of the given 'version'. 
*/ -ovs_be32 -ofputil_table_miss_to_config(enum ofputil_table_miss miss, - enum ofp_version version) -{ - if (version < OFP13_VERSION) { - switch (miss) { - case OFPUTIL_TABLE_MISS_CONTROLLER: - case OFPUTIL_TABLE_MISS_DEFAULT: - return htonl(OFPTC11_TABLE_MISS_CONTROLLER); - - case OFPUTIL_TABLE_MISS_CONTINUE: - return htonl(OFPTC11_TABLE_MISS_CONTINUE); - - case OFPUTIL_TABLE_MISS_DROP: - return htonl(OFPTC11_TABLE_MISS_DROP); - - default: - OVS_NOT_REACHED(); - } - } else { - return htonl(0); - } -} - /* Decodes the OpenFlow "table mod" message in '*oh' into an abstract form in * '*pm'. Returns 0 if successful, otherwise an OFPERR_* value. */ enum ofperr @@ -4932,6 +5004,10 @@ ofputil_decode_table_mod(const struct ofp_header *oh, enum ofpraw raw; struct ofpbuf b; + memset(pm, 0, sizeof *pm); + pm->miss = OFPUTIL_TABLE_MISS_DEFAULT; + pm->eviction = OFPUTIL_TABLE_EVICTION_DEFAULT; + pm->eviction_flags = UINT32_MAX; ofpbuf_use_const(&b, oh, ntohs(oh->length)); raw = ofpraw_pull_assert(&b); @@ -4939,16 +5015,37 @@ ofputil_decode_table_mod(const struct ofp_header *oh, const struct ofp11_table_mod *otm = b.data; pm->table_id = otm->table_id; - pm->miss_config = ofputil_table_miss_from_config(otm->config, - oh->version); + pm->miss = ofputil_decode_table_miss(otm->config, oh->version); } else if (raw == OFPRAW_OFPT14_TABLE_MOD) { const struct ofp14_table_mod *otm = ofpbuf_pull(&b, sizeof *otm); pm->table_id = otm->table_id; - pm->miss_config = ofputil_table_miss_from_config(otm->config, - oh->version); - /* We do not understand any properties yet, so we do not bother - * parsing them. 
*/ + pm->miss = ofputil_decode_table_miss(otm->config, oh->version); + pm->eviction = ofputil_decode_table_eviction(otm->config, oh->version); + while (b.size > 0) { + struct ofpbuf property; + enum ofperr error; + uint16_t type; + + error = ofputil_pull_property(&b, &property, &type); + if (error) { + return error; + } + + switch (type) { + case OFPTMPT14_EVICTION: + error = parse_table_mod_eviction_property(&property, pm); + break; + + default: + error = OFPERR_OFPBRC_BAD_TYPE; + break; + } + + if (error) { + return error; + } + } } else { return OFPERR_OFPBRC_BAD_TYPE; } @@ -4956,11 +5053,11 @@ ofputil_decode_table_mod(const struct ofp_header *oh, return 0; } -/* Converts the abstract form of a "table mod" message in '*pm' into an OpenFlow - * message suitable for 'protocol', and returns that encoded form in a buffer - * owned by the caller. */ +/* Converts the abstract form of a "table mod" message in '*tm' into an + * OpenFlow message suitable for 'protocol', and returns that encoded form in a + * buffer owned by the caller. 
*/ struct ofpbuf * -ofputil_encode_table_mod(const struct ofputil_table_mod *pm, +ofputil_encode_table_mod(const struct ofputil_table_mod *tm, enum ofputil_protocol protocol) { enum ofp_version ofp_version = ofputil_protocol_to_ofp_version(protocol); @@ -4979,20 +5076,28 @@ ofputil_encode_table_mod(const struct ofputil_table_mod *pm, b = ofpraw_alloc(OFPRAW_OFPT11_TABLE_MOD, ofp_version, 0); otm = ofpbuf_put_zeros(b, sizeof *otm); - otm->table_id = pm->table_id; - otm->config = ofputil_table_miss_to_config(pm->miss_config, - ofp_version); + otm->table_id = tm->table_id; + otm->config = ofputil_encode_table_config(tm->miss, tm->eviction, + ofp_version); break; } case OFP14_VERSION: case OFP15_VERSION: { struct ofp14_table_mod *otm; + struct ofp14_table_mod_prop_eviction *ote; b = ofpraw_alloc(OFPRAW_OFPT14_TABLE_MOD, ofp_version, 0); otm = ofpbuf_put_zeros(b, sizeof *otm); - otm->table_id = pm->table_id; - otm->config = ofputil_table_miss_to_config(pm->miss_config, - ofp_version); + otm->table_id = tm->table_id; + otm->config = ofputil_encode_table_config(tm->miss, tm->eviction, + ofp_version); + + if (tm->eviction_flags != UINT32_MAX) { + ote = ofpbuf_put_zeros(b, sizeof *ote); + ote->type = htons(OFPTMPT14_EVICTION); + ote->length = htons(sizeof *ote); + ote->flags = htonl(tm->eviction_flags); + } break; } default: @@ -5338,8 +5443,9 @@ ofputil_put_ofp12_table_stats(const struct ofputil_table_stats *stats, out->metadata_write = features->metadata_write; out->instructions = ovsinst_bitmap_to_openflow( features->nonmiss.instructions, OFP12_VERSION); - out->config = ofputil_table_miss_to_config(features->miss_config, - OFP12_VERSION); + out->config = ofputil_encode_table_config(features->miss_config, + OFPUTIL_TABLE_EVICTION_DEFAULT, + OFP12_VERSION); out->max_entries = htonl(features->max_entries); out->active_count = htonl(stats->active_count); out->lookup_count = htonll(stats->lookup_count); @@ -5445,8 +5551,8 @@ ofputil_decode_ofp11_table_stats(struct ofpbuf 
*msg, features->nonmiss.apply.ofpacts = ofpact_bitmap_from_openflow( ots->write_actions, OFP11_VERSION); features->miss = features->nonmiss; - features->miss_config = ofputil_table_miss_from_config(ots->config, - OFP11_VERSION); + features->miss_config = ofputil_decode_table_miss(ots->config, + OFP11_VERSION); features->match = mf_bitmap_from_of11(ots->match); features->wildcard = mf_bitmap_from_of11(ots->wildcards); bitmap_or(features->match.bm, features->wildcard.bm, MFF_N_IDS); @@ -5475,8 +5581,8 @@ ofputil_decode_ofp12_table_stats(struct ofpbuf *msg, ovs_strlcpy(features->name, ots->name, sizeof features->name); features->metadata_match = ots->metadata_match; features->metadata_write = ots->metadata_write; - features->miss_config = ofputil_table_miss_from_config(ots->config, - OFP12_VERSION); + features->miss_config = ofputil_decode_table_miss(ots->config, + OFP12_VERSION); features->max_entries = ntohl(ots->max_entries); features->nonmiss.instructions = ovsinst_bitmap_from_openflow( @@ -5544,6 +5650,8 @@ ofputil_decode_table_stats_reply(struct ofpbuf *msg, memset(stats, 0, sizeof *stats); memset(features, 0, sizeof *features); + features->supports_eviction = -1; + features->supports_vacancy_events = -1; switch ((enum ofp_version) oh->version) { case OFP10_VERSION: diff --git a/lib/ofp-util.h b/lib/ofp-util.h index 596c2e2..f3f42d3 100644 --- a/lib/ofp-util.h +++ b/lib/ofp-util.h @@ -609,13 +609,33 @@ enum ofputil_table_miss { OFPUTIL_TABLE_MISS_DROP, /* Drop the packet. */ }; -ovs_be32 ofputil_table_miss_to_config(enum ofputil_table_miss, - enum ofp_version); +/* Abstract version of OFPTC14_EVICTION. + * + * OpenFlow 1.0 through 1.3 don't know anything about eviction, so decoding a + * message for one of these protocols always yields + * OFPUTIL_TABLE_EVICTION_DEFAULT. */ +enum ofputil_table_eviction { + OFPUTIL_TABLE_EVICTION_DEFAULT, /* No value. */ + OFPUTIL_TABLE_EVICTION_ON, /* Enable eviction. */ + OFPUTIL_TABLE_EVICTION_OFF /* Disable eviction. 
*/ +}; /* Abstract ofp_table_mod. */ struct ofputil_table_mod { uint8_t table_id; /* ID of the table, 0xff indicates all tables. */ - enum ofputil_table_miss miss_config; + + /* OpenFlow 1.1 and 1.2 only. For other versions, ignored on encoding, + * decoded to OFPUTIL_TABLE_MISS_DEFAULT. */ + enum ofputil_table_miss miss; + + /* OpenFlow 1.4+ only. For other versions, ignored on encoding, decoded to + * OFPUTIL_TABLE_EVICTION_DEFAULT. */ + enum ofputil_table_eviction eviction; + + /* OpenFlow 1.4+ only and optional even there; UINT32_MAX indicates + * absence. For other versions, ignored on encoding, decoded to + * UINT32_MAX.*/ + uint32_t eviction_flags; /* OFPTMPEF14_*. */ }; enum ofperr ofputil_decode_table_mod(const struct ofp_header *, @@ -623,16 +643,38 @@ enum ofperr ofputil_decode_table_mod(const struct ofp_header *, struct ofpbuf *ofputil_encode_table_mod(const struct ofputil_table_mod *, enum ofputil_protocol); -/* Abstract ofp_table_features. */ +/* Abstract ofp_table_features. + * + * This is used for all versions of OpenFlow, even though ofp_table_features + * was only introduced in OpenFlow 1.3, because earlier versions of OpenFlow + * include support for a subset of ofp_table_features through OFPST_TABLE (aka + * OFPMP_TABLE). */ struct ofputil_table_features { uint8_t table_id; /* Identifier of table. Lower numbered tables are consulted first. */ char name[OFP_MAX_TABLE_NAME_LEN]; ovs_be64 metadata_match; /* Bits of metadata table can match. */ ovs_be64 metadata_write; /* Bits of metadata table can write. */ - enum ofputil_table_miss miss_config; uint32_t max_entries; /* Max number of entries supported. */ + /* Flags. + * + * 'miss_config' is relevant for OpenFlow 1.1 and 1.2 only, because those + * versions include OFPTC_MISS_* flags in OFPST_TABLE. For other versions, + * it is decoded to OFPUTIL_TABLE_MISS_DEFAULT and ignored for encoding. 
+ * + * 'supports_eviction' and 'supports_vacancy_events' are relevant only for + * OpenFlow 1.4 and later. For OF1.4+, they are boolean: 1 if + * supported, otherwise 0. For other versions, they are decoded as -1 and + * ignored for encoding. + * + * See the section "OFPTC_* Table Configuration" in DESIGN.md for more + * details of how OpenFlow has changed in this area. + */ + enum ofputil_table_miss miss_config; /* OF1.1 and 1.2 only. */ + int supports_eviction; /* OF1.4+ only. */ + int supports_vacancy_events; /* OF1.4+ only. */ + /* Table features related to instructions. There are two instances: * * - 'miss' reports features available in the table miss flow. diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h index 495f787..b7bf734 100644 --- a/ofproto/ofproto-provider.h +++ b/ofproto/ofproto-provider.h @@ -246,9 +246,16 @@ struct oftable { struct hmap eviction_groups_by_id; struct heap eviction_groups_by_size; - /* Table configuration. */ + /* Flow table miss handling configuration. */ ATOMIC(enum ofputil_table_miss) miss_config; + /* Eviction is enabled if either the client (vswitchd) enables it or an + * OpenFlow controller enables it; thus, a nonzero value indicates that + * eviction is enabled. */ +#define EVICTION_CLIENT (1 << 0) /* Set to 1 if client enables eviction. */ +#define EVICTION_OPENFLOW (1 << 1) /* Set to 1 if OpenFlow enables eviction. 
*/ + unsigned int eviction; + atomic_ulong n_matched; atomic_ulong n_missed; }; diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index fd14030..66edc87 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -83,11 +83,10 @@ static void oftable_set_name(struct oftable *, const char *name); static enum ofperr evict_rules_from_table(struct oftable *) OVS_REQUIRES(ofproto_mutex); -static void oftable_disable_eviction(struct oftable *) - OVS_REQUIRES(ofproto_mutex); -static void oftable_enable_eviction(struct oftable *, - const struct mf_subfield *fields, - size_t n_fields) +static void oftable_configure_eviction(struct oftable *, + unsigned int eviction, + const struct mf_subfield *fields, + size_t n_fields) OVS_REQUIRES(ofproto_mutex); /* A set of rules within a single OpenFlow table (oftable) that have the same @@ -1421,18 +1420,16 @@ ofproto_configure_table(struct ofproto *ofproto, int table_id, return; } - if (classifier_set_prefix_fields(&table->cls, s->prefix_fields, s->n_prefix_fields)) { /* XXX: Trigger revalidation. */ } ovs_mutex_lock(&ofproto_mutex); - if (s->groups) { - oftable_enable_eviction(table, s->groups, s->n_groups); - } else { - oftable_disable_eviction(table); - } + unsigned int new_eviction = (s->enable_eviction + ? 
table->eviction | EVICTION_CLIENT + : table->eviction & ~EVICTION_CLIENT); + oftable_configure_eviction(table, new_eviction, s->groups, s->n_groups); table->max_flows = s->max_flows; evict_rules_from_table(table); ovs_mutex_unlock(&ofproto_mutex); @@ -1695,7 +1692,7 @@ ofproto_run(struct ofproto *p) struct eviction_group *evg; struct rule *rule; - if (!table->eviction_fields) { + if (!table->eviction) { continue; } @@ -6550,10 +6547,34 @@ handle_group_mod(struct ofconn *ofconn, const struct ofp_header *oh) enum ofputil_table_miss ofproto_table_get_miss_config(const struct ofproto *ofproto, uint8_t table_id) { - enum ofputil_table_miss value; + enum ofputil_table_miss miss; + + atomic_read_relaxed(&ofproto->tables[table_id].miss_config, &miss); + return miss; +} + +static void +table_mod__(struct oftable *oftable, + enum ofputil_table_miss miss, enum ofputil_table_eviction eviction) +{ + if (miss != OFPUTIL_TABLE_MISS_DEFAULT) { + atomic_store_relaxed(&oftable->miss_config, miss); + } + + unsigned int new_eviction = oftable->eviction; + if (eviction == OFPUTIL_TABLE_EVICTION_ON) { + new_eviction |= EVICTION_OPENFLOW; + } else if (eviction == OFPUTIL_TABLE_EVICTION_OFF) { + new_eviction &= ~EVICTION_OPENFLOW; + } - atomic_read_relaxed(&ofproto->tables[table_id].miss_config, &value); - return value; + if (new_eviction != oftable->eviction) { + ovs_mutex_lock(&ofproto_mutex); + oftable_configure_eviction(oftable, new_eviction, + oftable->eviction_fields, + oftable->n_eviction_fields); + ovs_mutex_unlock(&ofproto_mutex); + } } static enum ofperr @@ -6561,18 +6582,33 @@ table_mod(struct ofproto *ofproto, const struct ofputil_table_mod *tm) { if (!check_table_id(ofproto, tm->table_id)) { return OFPERR_OFPTMFC_BAD_TABLE; - } else if (tm->miss_config != OFPUTIL_TABLE_MISS_DEFAULT) { - if (tm->table_id == OFPTT_ALL) { - int i; - for (i = 0; i < ofproto->n_tables; i++) { - atomic_store_relaxed(&ofproto->tables[i].miss_config, - tm->miss_config); + } + + /* Don't allow the 
eviction flags to be changed. OF1.4 says this is + * normal: "The OFPTMPT_EVICTION property usually cannot be modified using + * a OFP_TABLE_MOD request, because the eviction mechanism is switch + * defined". */ + if (tm->eviction_flags != UINT32_MAX + && tm->eviction_flags != (OFPTMPEF14_OTHER | OFPTMPEF14_IMPORTANCE + | OFPTMPEF14_LIFETIME)) { + return OFPERR_OFPTMFC_BAD_CONFIG; + } + + if (tm->table_id == OFPTT_ALL) { + struct oftable *oftable; + OFPROTO_FOR_EACH_TABLE (oftable, ofproto) { + if (!(oftable->flags & (OFTABLE_HIDDEN | OFTABLE_READONLY))) { + table_mod__(oftable, tm->miss, tm->eviction); } - } else { - atomic_store_relaxed(&ofproto->tables[tm->table_id].miss_config, - tm->miss_config); } + } else { + struct oftable *oftable = &ofproto->tables[tm->table_id]; + if (oftable->flags & OFTABLE_READONLY) { + return OFPERR_OFPTMFC_EPERM; + } + table_mod__(oftable, tm->miss, tm->eviction); } + return 0; } @@ -7178,7 +7214,7 @@ choose_rule_to_evict(struct oftable *table, struct rule **rulep) struct eviction_group *evg; *rulep = NULL; - if (!table->eviction_fields) { + if (!table->eviction) { return false; } @@ -7399,7 +7435,7 @@ eviction_group_add_rule(struct rule *rule) * so no additional protection is needed. 
*/ has_timeout = rule->hard_timeout || rule->idle_timeout; - if (table->eviction_fields && has_timeout) { + if (table->eviction && has_timeout) { struct eviction_group *evg; evg = eviction_group_find(table, eviction_group_hash_rule(rule)); @@ -7421,6 +7457,8 @@ oftable_init(struct oftable *table) classifier_init(&table->cls, flow_segment_u64s); table->max_flows = UINT_MAX; table->n_flows = 0; + hmap_init(&table->eviction_groups_by_id); + heap_init(&table->eviction_groups_by_size); atomic_init(&table->miss_config, OFPUTIL_TABLE_MISS_DEFAULT); classifier_set_prefix_fields(&table->cls, default_prefix_fields, @@ -7437,9 +7475,13 @@ static void oftable_destroy(struct oftable *table) { ovs_assert(classifier_is_empty(&table->cls)); + ovs_mutex_lock(&ofproto_mutex); - oftable_disable_eviction(table); + oftable_configure_eviction(table, 0, NULL, 0); ovs_mutex_unlock(&ofproto_mutex); + + hmap_destroy(&table->eviction_groups_by_id); + heap_destroy(&table->eviction_groups_by_size); classifier_destroy(&table->cls); free(table->name); } @@ -7467,60 +7509,56 @@ oftable_set_name(struct oftable *table, const char *name) /* oftables support a choice of two policies when adding a rule would cause the * number of flows in the table to exceed the configured maximum number: either * they can refuse to add the new flow or they can evict some existing flow. - * This function configures the former policy on 'table'. 
*/ -static void -oftable_disable_eviction(struct oftable *table) - OVS_REQUIRES(ofproto_mutex) -{ - if (table->eviction_fields) { - struct eviction_group *evg, *next; - - HMAP_FOR_EACH_SAFE (evg, next, id_node, - &table->eviction_groups_by_id) { - eviction_group_destroy(table, evg); - } - hmap_destroy(&table->eviction_groups_by_id); - heap_destroy(&table->eviction_groups_by_size); - - free(table->eviction_fields); - table->eviction_fields = NULL; - table->n_eviction_fields = 0; - } -} - -/* oftables support a choice of two policies when adding a rule would cause the - * number of flows in the table to exceed the configured maximum number: either - * they can refuse to add the new flow or they can evict some existing flow. * This function configures the latter policy on 'table', with fairness based * on the values of the 'n_fields' fields specified in 'fields'. (Specifying * 'n_fields' as 0 disables fairness.) */ static void -oftable_enable_eviction(struct oftable *table, - const struct mf_subfield *fields, size_t n_fields) +oftable_configure_eviction(struct oftable *table, unsigned int eviction, + const struct mf_subfield *fields, size_t n_fields) OVS_REQUIRES(ofproto_mutex) { struct rule *rule; - if (table->eviction_fields + if ((table->eviction != 0) == (eviction != 0) && n_fields == table->n_eviction_fields && (!n_fields || !memcmp(fields, table->eviction_fields, n_fields * sizeof *fields))) { - /* No change. */ + /* The set of eviction fields did not change. If 'eviction' changed, + * it remains nonzero, so that we can just update table->eviction + * without fussing with the eviction groups. */ + table->eviction = eviction; return; } - oftable_disable_eviction(table); - - table->n_eviction_fields = n_fields; - table->eviction_fields = xmemdup(fields, n_fields * sizeof *fields); - - table->eviction_group_id_basis = random_uint32(); + /* Destroy existing eviction groups, then destroy and recreate data + * structures to recover memory. 
*/ + struct eviction_group *evg, *next; + HMAP_FOR_EACH_SAFE (evg, next, id_node, &table->eviction_groups_by_id) { + eviction_group_destroy(table, evg); + } + hmap_destroy(&table->eviction_groups_by_id); hmap_init(&table->eviction_groups_by_id); + heap_destroy(&table->eviction_groups_by_size); heap_init(&table->eviction_groups_by_size); - CLS_FOR_EACH (rule, cr, &table->cls) { - eviction_group_add_rule(rule); + /* Replace eviction groups by the new ones, if there is a change. Free the + * old fields only after allocating the new ones, because 'fields == + * table->eviction_fields' is possible. */ + struct mf_subfield *old_fields = table->eviction_fields; + table->n_eviction_fields = n_fields; + table->eviction_fields = (fields + ? xmemdup(fields, n_fields * sizeof *fields) + : NULL); + free(old_fields); + + /* Add the new eviction groups, if enabled. */ + table->eviction = eviction; + if (table->eviction) { + table->eviction_group_id_basis = random_uint32(); + CLS_FOR_EACH (rule, cr, &table->cls) { + eviction_group_add_rule(rule); + } } } diff --git a/ofproto/ofproto.h b/ofproto/ofproto.h index 7dc1874..7504027 100644 --- a/ofproto/ofproto.h +++ b/ofproto/ofproto.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc. + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -459,14 +459,18 @@ struct ofproto_table_settings { char *name; /* Name exported via OpenFlow or NULL. */ unsigned int max_flows; /* Maximum number of flows or UINT_MAX. */ - /* These members determine the handling of an attempt to add a flow that - * would cause the table to have more than 'max_flows' flows. 
+ /* These members, together with OpenFlow OFPT_TABLE_MOD, determine the + * handling of an attempt to add a flow that would cause the table to have + * more than 'max_flows' flows: * - * If 'groups' is NULL, overflows will be rejected with an error. + * - If 'enable_eviction' is false and OFPT_TABLE_MOD does not enable + * eviction, overflows will be rejected with an error. * - * If 'groups' is nonnull, an overflow will cause a flow to be removed. - * The flow to be removed is chosen to give fairness among groups - * distinguished by different values for the subfields within 'groups'. */ + * - If 'enable_eviction' is true or OFPT_TABLE_MOD enables eviction, an + * overflow will cause a flow to be removed. The flow to be removed + * is chosen to give fairness among groups distinguished by different + * values for the 'n_groups' subfields within 'groups'. */ + bool enable_eviction; struct mf_subfield *groups; size_t n_groups; diff --git a/tests/ofp-print.at b/tests/ofp-print.at index 39a5bbb..e08a201 100644 --- a/tests/ofp-print.at +++ b/tests/ofp-print.at @@ -1132,7 +1132,7 @@ AT_KEYWORDS([ofp-print]) AT_CHECK([ovs-ofctl ofp-print "\ 05 11 00 10 00 00 00 02 02 00 00 00 00 00 00 00 \ " 3], [0], [dnl -OFPT_TABLE_MOD (OF1.4) (xid=0x2): table_id=2 +OFPT_TABLE_MOD (OF1.4) (xid=0x2): table_id=2, eviction=off ]) AT_CLEANUP diff --git a/tests/ofproto.at b/tests/ofproto.at index 7467ca0..4d96634 100644 --- a/tests/ofproto.at +++ b/tests/ofproto.at @@ -1792,11 +1792,13 @@ OVS_VSWITCHD_START # Configure a maximum of 4 flows. AT_CHECK( [ovs-vsctl \ - -- --id=@t0 create Flow_Table name=evict flow-limit=4 overflow-policy=evict \ + -- --id=@t0 create Flow_Table name=evict flow-limit=4 \ -- set bridge br0 flow_tables:0=@t0 \ | ${PERL} $srcdir/uuidfilt.pl], [0], [<0> ]) +# Use mod-table to turn on eviction just to demonstrate that it works. +AT_CHECK([ovs-ofctl -O OpenFlow14 mod-table br0 0 evict]) # Add 4 flows. 
for in_port in 4 3 2 1; do ovs-ofctl -O Openflow14 add-flow br0 importance=$((in_port + 30)),priority=$((in_port + 5)),hard_timeout=$((in_port + 500)),actions=drop @@ -1818,7 +1820,7 @@ AT_CHECK([ovs-ofctl -O Openflow14 dump-flows br0 | ofctl_strip | sort], [0], [dn OFPST_FLOW reply (OF1.4): ]) # Disable the Eviction configuration. -AT_CHECK([ovs-vsctl set Flow_Table evict overflow-policy=refuse]) +AT_CHECK([ovs-ofctl -O OpenFlow14 mod-table br0 0 noevict]) # Adding another flow will cause the system to give error for FULL TABLE. AT_CHECK([ovs-ofctl -O Openflow14 add-flow br0 hard_timeout=506,importance=36,priority=11,actions=drop],[1], [], [stderr]) AT_CHECK([head -n 1 stderr | ofctl_strip], [0], @@ -1846,23 +1848,6 @@ AT_CHECK([ovs-ofctl mod-flows br0 in_port=5,actions=drop], [1], [], [stderr]) AT_CHECK([head -n 1 stderr | ofctl_strip], [0], [OFPT_ERROR: OFPFMFC_TABLE_FULL ]) -# Now set the eviction on timeout basis. -AT_CHECK( - [ovs-vsctl \ - -- --id=@t0 create Flow_Table flow-limit=4 overflow-policy=evict \ - -- set bridge br0 flow_tables:0=@t0 \ - | ${PERL} $srcdir/uuidfilt.pl], - [0], [<0> -]) -#Now add a new flow -AT_CHECK([ovs-ofctl -O Openflow14 add-flow br0 importance=37,hard_timeout=507,priority=11,in_port=6,actions=drop]) -AT_CHECK([ovs-ofctl -O Openflow14 dump-flows br0 | ofctl_strip | sort], [0], [dnl - hard_timeout=503, importance=33, priority=8 actions=drop - hard_timeout=504, importance=34, priority=9 actions=drop - hard_timeout=505, importance=35, priority=10,in_port=2 actions=NORMAL - hard_timeout=507, importance=37, priority=11,in_port=6 actions=drop -OFPST_FLOW reply (OF1.4): -]) OVS_VSWITCHD_STOP AT_CLEANUP diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in index 63c2ecc..92edd91 100644 --- a/utilities/ovs-ofctl.8.in +++ b/utilities/ovs-ofctl.8.in @@ -62,20 +62,6 @@ Prints to the console statistics for each of the flow tables used by \fBdump\-table\-features \fIswitch\fR Prints to the console features for each of the flow tables 
used by \fIswitch\fR. -. -.IP "\fBmod\-table \fIswitch\fR \fItable_id\fR \fIflow_miss_handling\fR" -An OpenFlow 1.0 switch looks up each packet that arrives at the switch -in table 0, then in table 1 if there is no match in table 0, then in -table 2, and so on until the packet finds a match in some table. -Finally, if no match was found, the switch sends the packet to the -controller -.IP -OpenFlow 1.1 and later offer more flexibility. This command -configures the flow table miss handling configuration for table -\fItable_id\fR in \fIswitch\fR. \fItable_id\fR may be an OpenFlow -table number between 0 and 254, inclusive, or the keyword \fBALL\fR to -modify all tables. \fIflow_miss_handling\fR may be any one of the -following: .RS .IP \fBdrop\fR Drop the packet. @@ -87,6 +73,21 @@ tables other than the last one.) Send to controller. (This is how an OpenFlow 1.0 switch always handles packets that do not match any flow in the last table.) .RE +.IP +In OpenFlow 1.4 and later (which must be enabled with the \fB\-O\fR +option) only, \fBmod\-table\fR configures the behavior when a +controller attempts to add a flow to a flow table that is full. The +following \fIsetting\fR values are available: +.RS +.IP \fBevict\fR +Delete some existing flow from the flow table, according to the +algorithm described for the \fBFlow_Table\fR table in +\fBovs-vswitchd.conf.db\fR(5). +.IP \fBnoevict\fR +Refuse to add the new flow. (Eviction might still be enabled through +the \fBoverflow_policy\fR column in the \fBFlow_Table\fR table +documented in \fBovs-vswitchd.conf.db\fR(5).) +.RE .
.TP \fBdump\-ports \fIswitch\fR [\fInetdev\fR] diff --git a/utilities/ovs-ofctl.c b/utilities/ovs-ofctl.c index 8df79b8..bcd43c2 100644 --- a/utilities/ovs-ofctl.c +++ b/utilities/ovs-ofctl.c @@ -340,8 +340,11 @@ usage(void) " dump-table-features SWITCH print table features\n" " mod-port SWITCH IFACE ACT modify port behavior\n" " mod-table SWITCH MOD modify flow table behavior\n" + " OF1.1/1.2 MOD: controller, continue, drop\n" + " OF1.4+ MOD: evict, noevict\n" " get-frags SWITCH print fragment handling behavior\n" " set-frags SWITCH FRAG_MODE set fragment handling behavior\n" + " FRAG_MODE: normal, drop, reassemble, nx-match\n" " dump-ports SWITCH [PORT] print port statistics\n" " dump-ports-desc SWITCH [PORT] print port descriptions\n" " dump-flows SWITCH print all flow entries\n" @@ -1845,35 +1848,28 @@ found: static void ofctl_mod_table(struct ovs_cmdl_context *ctx) { - enum ofputil_protocol protocol, usable_protocols; + uint32_t usable_versions; struct ofputil_table_mod tm; struct vconn *vconn; char *error; - int i; - error = parse_ofp_table_mod(&tm, ctx->argv[2], ctx->argv[3], &usable_protocols); + error = parse_ofp_table_mod(&tm, ctx->argv[2], ctx->argv[3], + &usable_versions); if (error) { ovs_fatal(0, "%s", error); } - protocol = open_vconn(ctx->argv[1], &vconn); - if (!(protocol & usable_protocols)) { - for (i = 0; i < sizeof(enum ofputil_protocol) * CHAR_BIT; i++) { - enum ofputil_protocol f = 1 << i; - if (f != protocol - && f & usable_protocols - && try_set_protocol(vconn, f, &protocol)) { - protocol = f; - break; - } - } - } - - if (!(protocol & usable_protocols)) { - char *usable_s = ofputil_protocols_to_string(usable_protocols); - ovs_fatal(0, "Switch does not support table mod message(%s)", usable_s); + uint32_t allowed_versions = get_allowed_ofp_versions(); + if (!(allowed_versions & usable_versions)) { + struct ds versions = DS_EMPTY_INITIALIZER; + ofputil_format_version_bitmap_names(&versions, usable_versions); + ovs_fatal(0, "table_mod '%s' 
requires one of the OpenFlow " + "versions %s but none is enabled (use -O)", + ctx->argv[3], ds_cstr(&versions)); } + mask_allowed_ofp_versions(usable_versions); + enum ofputil_protocol protocol = open_vconn(ctx->argv[1], &vconn); transact_noreply(vconn, ofputil_encode_table_mod(&tm, protocol)); vconn_close(vconn); } diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index d48cf7f..dfb581e 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -3603,6 +3603,7 @@ bridge_configure_tables(struct bridge *br) s.name = NULL; s.max_flows = UINT_MAX; s.groups = NULL; + s.enable_eviction = false; s.n_groups = 0; s.n_prefix_fields = 0; memset(s.prefix_fields, ~0, sizeof(s.prefix_fields)); @@ -3614,9 +3615,10 @@ bridge_configure_tables(struct bridge *br) if (cfg->n_flow_limit && *cfg->flow_limit < UINT_MAX) { s.max_flows = *cfg->flow_limit; } - if (cfg->overflow_policy - && !strcmp(cfg->overflow_policy, "evict")) { + s.enable_eviction = (cfg->overflow_policy + && !strcmp(cfg->overflow_policy, "evict")); + if (cfg->n_groups) { s.groups = xmalloc(cfg->n_groups * sizeof *s.groups); for (k = 0; k < cfg->n_groups; k++) { const char *string = cfg->groups[k]; @@ -3636,6 +3638,7 @@ bridge_configure_tables(struct bridge *br) } } } + /* Prefix lookup fields. */ s.n_prefix_fields = 0; for (k = 0; k < cfg->n_prefixes; k++) { diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 5a917aa..f1e5705 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -3039,46 +3039,27 @@ dump-tables</code>. The name does not affect switch behavior. </column> - <column name="flow_limit"> - If set, limits the number of flows that may be added to the table. Open - vSwitch may limit the number of flows in a table for other reasons, - e.g. due to hardware limitations or for resource availability or - performance reasons. 
- </column> - - <column name="overflow_policy"> + <group title="Eviction Policy"> <p> - Controls the switch's behavior when an OpenFlow flow table modification - request would add flows in excess of <ref column="flow_limit"/>. The - supported values are: + Open vSwitch supports limiting the number of flows that may be + installed in a flow table, via the <ref column="flow_limit"/> column. + When adding a flow would exceed this limit, by default Open vSwitch + reports an error, but there are two ways to configure Open vSwitch to + instead delete (``evict'') a flow to make room for the new one: </p> - <dl> - <dt><code>refuse</code></dt> - <dd> - Refuse to add the flow or flows. This is also the default policy - when <ref column="overflow_policy"/> is unset. - </dd> - - <dt><code>evict</code></dt> - <dd> - Delete the flow that will expire soonest. See <ref column="groups"/> - for details. - </dd> - </dl> - </column> + <ul> + <li> + Set the <ref column="overflow_policy"/> column to <code>evict</code>. + </li> - <column name="groups"> - <p> - When <ref column="overflow_policy"/> is <code>evict</code>, this - controls how flows are chosen for eviction when the flow table would - otherwise exceed <ref column="flow_limit"/> flows. Its value is a set - of NXM fields or sub-fields, each of which takes one of the forms - <code><var>field</var>[]</code> or - <code><var>field</var>[<var>start</var>..<var>end</var>]</code>, - e.g. <code>NXM_OF_IN_PORT[]</code>. Please see - <code>nicira-ext.h</code> for a complete list of NXM field names. - </p> + <li> + Send an OpenFlow 1.4+ ``table mod request'' to enable eviction for + the flow table (e.g. <code>ovs-ofctl -O OpenFlow14 mod-table br0 0 + evict</code> to enable eviction on flow table 0 of bridge + <code>br0</code>). + </li> + </ul> <p> When a flow must be evicted due to overflow, the flow to evict is @@ -3118,95 +3099,138 @@ </ol> <p> - The eviction process only considers flows that have an idle timeout or - a hard timeout. 
That is, eviction never deletes permanent flows. + The eviction process only considers flows that have an idle timeout + or a hard timeout. That is, eviction never deletes permanent flows. (Permanent flows do count against <ref column="flow_limit"/>.) </p> - <p> - Open vSwitch ignores any invalid or unknown field specifications. - </p> + <column name="flow_limit"> + If set, limits the number of flows that may be added to the table. + Open vSwitch may limit the number of flows in a table for other + reasons, e.g. due to hardware limitations or for resource availability + or performance reasons. + </column> - <p> - When <ref column="overflow_policy"/> is not <code>evict</code>, this - column has no effect. - </p> - </column> + <column name="overflow_policy"> + <p> + Controls the switch's behavior when an OpenFlow flow table + modification request would add flows in excess of <ref + column="flow_limit"/>. The supported values are: + </p> - <column name="prefixes"> - <p> - This string set specifies which fields should be used for - address prefix tracking. Prefix tracking allows the - classifier to skip rules with longer than necessary prefixes, - resulting in better wildcarding for datapath flows. - </p> - <p> - Prefix tracking may be beneficial when a flow table contains - matches on IP address fields with different prefix lengths. - For example, when a flow table contains IP address matches on - both full addresses and proper prefixes, the full address - matches will typically cause the datapath flow to un-wildcard - the whole address field (depending on flow entry priorities). - In this case each packet with a different address gets handed - to the userspace for flow processing and generates its own - datapath flow. 
With prefix tracking enabled for the address - field in question packets with addresses matching shorter - prefixes would generate datapath flows where the irrelevant - address bits are wildcarded, allowing the same datapath flow - to handle all the packets within the prefix in question. In - this case many userspace upcalls can be avoided and the - overall performance can be better. - </p> - <p> - This is a performance optimization only, so packets will - receive the same treatment with or without prefix tracking. - </p> - <p> - The supported fields are: <code>tun_id</code>, - <code>tun_src</code>, <code>tun_dst</code>, - <code>nw_src</code>, <code>nw_dst</code> (or aliases - <code>ip_src</code> and <code>ip_dst</code>), - <code>ipv6_src</code>, and <code>ipv6_dst</code>. (Using this - feature for <code>tun_id</code> would only make sense if the - tunnel IDs have prefix structure similar to IP addresses.) - </p> + <dl> + <dt><code>refuse</code></dt> + <dd> + Refuse to add the flow or flows. This is also the default policy + when <ref column="overflow_policy"/> is unset. + </dd> - <p> - By default, the <code>prefixes=ip_dst,ip_src</code> are used - on each flow table. This instructs the flow classifier to - track the IP destination and source addresses used by the - rules in this specific flow table. - </p> + <dt><code>evict</code></dt> + <dd> + Delete a flow chosen according to the algorithm described above. + </dd> + </dl> + </column> - <p> - The keyword <code>none</code> is recognized as an explicit - override of the default values, causing no prefix fields to be - tracked. - </p> + <column name="groups"> + <p> + When <ref column="overflow_policy"/> is <code>evict</code>, this + controls how flows are chosen for eviction when the flow table would + otherwise exceed <ref column="flow_limit"/> flows. 
Its value is a + set of NXM fields or sub-fields, each of which takes one of the forms + <code><var>field</var>[]</code> or + <code><var>field</var>[<var>start</var>..<var>end</var>]</code>, + e.g. <code>NXM_OF_IN_PORT[]</code>. Please see + <code>nicira-ext.h</code> for a complete list of NXM field names. + </p> - <p> - To set the prefix fields, the flow table record needs to - exist: - </p> + <p> + Open vSwitch ignores any invalid or unknown field specifications. + </p> - <dl> - <dt><code>ovs-vsctl set Bridge br0 flow_tables:0=@N1 -- --id=@N1 create Flow_Table name=table0</code></dt> - <dd> - Creates a flow table record for the OpenFlow table number 0. - </dd> + <p> + When eviction is not enabled, via <ref column="overflow_policy"/> or + an OpenFlow 1.4+ ``table mod,'' this column has no effect. + </p> + </column> + </group> - <dt><code>ovs-vsctl set Flow_Table table0 prefixes=ip_dst,ip_src</code></dt> - <dd> - Enables prefix tracking for IP source and destination - address fields. - </dd> - </dl> + <group title="Classifier Optimization"> + <column name="prefixes"> + <p> + This string set specifies which fields should be used for + address prefix tracking. Prefix tracking allows the + classifier to skip rules with longer than necessary prefixes, + resulting in better wildcarding for datapath flows. + </p> + <p> + Prefix tracking may be beneficial when a flow table contains + matches on IP address fields with different prefix lengths. + For example, when a flow table contains IP address matches on + both full addresses and proper prefixes, the full address + matches will typically cause the datapath flow to un-wildcard + the whole address field (depending on flow entry priorities). + In this case each packet with a different address gets handed + to the userspace for flow processing and generates its own + datapath flow. 
With prefix tracking enabled for the address + field in question packets with addresses matching shorter + prefixes would generate datapath flows where the irrelevant + address bits are wildcarded, allowing the same datapath flow + to handle all the packets within the prefix in question. In + this case many userspace upcalls can be avoided and the + overall performance can be better. + </p> + <p> + This is a performance optimization only, so packets will + receive the same treatment with or without prefix tracking. + </p> + <p> + The supported fields are: <code>tun_id</code>, + <code>tun_src</code>, <code>tun_dst</code>, + <code>nw_src</code>, <code>nw_dst</code> (or aliases + <code>ip_src</code> and <code>ip_dst</code>), + <code>ipv6_src</code>, and <code>ipv6_dst</code>. (Using this + feature for <code>tun_id</code> would only make sense if the + tunnel IDs have prefix structure similar to IP addresses.) + </p> - <p> - There is a maximum number of fields that can be enabled for any - one flow table. Currently this limit is 3. - </p> - </column> + <p> + By default, the <code>prefixes=ip_dst,ip_src</code> are used + on each flow table. This instructs the flow classifier to + track the IP destination and source addresses used by the + rules in this specific flow table. + </p> + + <p> + The keyword <code>none</code> is recognized as an explicit + override of the default values, causing no prefix fields to be + tracked. + </p> + + <p> + To set the prefix fields, the flow table record needs to + exist: + </p> + + <dl> + <dt><code>ovs-vsctl set Bridge br0 flow_tables:0=@N1 -- --id=@N1 create Flow_Table name=table0</code></dt> + <dd> + Creates a flow table record for the OpenFlow table number 0. + </dd> + + <dt><code>ovs-vsctl set Flow_Table table0 prefixes=ip_dst,ip_src</code></dt> + <dd> + Enables prefix tracking for IP source and destination + address fields. + </dd> + </dl> + + <p> + There is a maximum number of fields that can be enabled for any + one flow table. 
Currently this limit is 3. + </p> + </column> + </group> <group title="Common Columns"> The overall purpose of these columns is described under <code>Common -- 2.1.3
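For readers following the ofproto.c hunks above: the patch tracks eviction as a small bitmap with one bit per enabling mechanism (EVICTION_CLIENT for the database setting, EVICTION_OPENFLOW for an OpenFlow 1.4+ table mod), and treats eviction as active whenever the bitmap is nonzero. The standalone sketch below illustrates that combination rule only. It makes several assumptions: the numeric bit values are invented for illustration (the real definitions live in ofproto-provider.h and are not shown in this excerpt), and EVICTION_ALL, enum table_eviction, and the three helper functions are hypothetical simplifications, not names from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit values, for illustration only.  The patch uses the names
 * EVICTION_CLIENT and EVICTION_OPENFLOW, but their actual values are not
 * visible in this excerpt. */
#define EVICTION_CLIENT   (1u << 0)  /* Enabled via overflow_policy=evict. */
#define EVICTION_OPENFLOW (1u << 1)  /* Enabled via OpenFlow 1.4+ table mod. */
#define EVICTION_ALL      (EVICTION_CLIENT | EVICTION_OPENFLOW) /* Hypothetical. */

/* Hypothetical, simplified stand-in for enum ofputil_table_eviction. */
enum table_eviction {
    TABLE_EVICTION_DEFAULT,     /* Leave the OpenFlow bit unchanged. */
    TABLE_EVICTION_ON,
    TABLE_EVICTION_OFF
};

/* Follows the expression used in ofproto_configure_table(): the database
 * setting only ever touches the EVICTION_CLIENT bit. */
static unsigned int
apply_client_setting(unsigned int eviction, bool enable_eviction)
{
    assert(!(eviction & ~EVICTION_ALL));
    return enable_eviction ? eviction | EVICTION_CLIENT
                           : eviction & ~EVICTION_CLIENT;
}

/* Follows the logic in table_mod__(): a table mod only ever touches the
 * EVICTION_OPENFLOW bit, and a "default" request changes nothing. */
static unsigned int
apply_openflow_setting(unsigned int eviction, enum table_eviction setting)
{
    assert(!(eviction & ~EVICTION_ALL));
    if (setting == TABLE_EVICTION_ON) {
        eviction |= EVICTION_OPENFLOW;
    } else if (setting == TABLE_EVICTION_OFF) {
        eviction &= ~EVICTION_OPENFLOW;
    }
    return eviction;
}

/* Eviction is active if either mechanism has enabled it. */
static bool
eviction_enabled(unsigned int eviction)
{
    return eviction != 0;
}
```

Under these assumptions, turning eviction off through one mechanism leaves it active if the other mechanism still has its bit set, which matches the ovs-ofctl(8) note that "noevict" does not override overflow_policy=evict.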