This is still more or less untested. --- include/openflow/nicira-ext.h | 193 ++++++++++++++- lib/automake.mk | 2 + lib/learn.c | 585 +++++++++++++++++++++++++++++++++++++++++ lib/learn.h | 40 +++ lib/ofp-parse.c | 5 + lib/ofp-print.c | 5 + lib/ofp-util.c | 6 + lib/ofp-util.def | 1 + ofproto/ofproto-dpif.c | 26 ++ ofproto/ofproto-provider.h | 14 + ofproto/ofproto.c | 62 +++-- utilities/ovs-ofctl.8.in | 57 ++++ 12 files changed, 971 insertions(+), 25 deletions(-) create mode 100644 lib/learn.c create mode 100644 lib/learn.h
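Reviewer note, not part of the patch: a minimal sketch of how the 16-bit flow_mod_spec header documented in the nicira-ext.h hunk packs 'src', 'dst', and 'n_bits', and how many bytes each spec occupies. The NX_LEARN_* values are copied from this patch; the helper names `spec_header` and `spec_len` are hypothetical and exist only for illustration. The byte counts assume the layout described in the NXAST_LEARN comment (a 6-byte field+ofs pair for a field source or a match/load destination, and (n_bits+15)/16*2 bytes for an immediate).

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout copied from the nicira-ext.h hunk in this patch. */
#define NX_LEARN_N_BITS_MASK    0x3ff
#define NX_LEARN_SRC_FIELD      (0 << 13)
#define NX_LEARN_SRC_IMMEDIATE  (1 << 13)
#define NX_LEARN_SRC_MASK       (1 << 13)
#define NX_LEARN_DST_MATCH      (0 << 11)
#define NX_LEARN_DST_LOAD       (1 << 11)
#define NX_LEARN_DST_OUTPUT     (2 << 11)
#define NX_LEARN_DST_MASK       (3 << 11)

/* Hypothetical helper: builds a flow_mod_spec header in host byte order.
 * The caller would still convert it with htons() before putting it on the
 * wire. */
static uint16_t
spec_header(int src, int dst, int n_bits)
{
    assert(!(src & ~NX_LEARN_SRC_MASK));
    assert(!(dst & ~NX_LEARN_DST_MASK));
    assert(n_bits > 0 && n_bits <= NX_LEARN_N_BITS_MASK);
    return (uint16_t) (src | dst | n_bits);
}

/* Hypothetical helper: total length in bytes of one flow_mod_spec,
 * including its 2-byte header. */
static int
spec_len(uint16_t header)
{
    int n_bits = header & NX_LEARN_N_BITS_MASK;
    int len = 2;                            /* The header itself. */

    if ((header & NX_LEARN_SRC_MASK) == NX_LEARN_SRC_FIELD) {
        len += 4 + 2;                       /* ovs_be32 field, ovs_be16 ofs. */
    } else {
        len += 2 * ((n_bits + 15) / 16);    /* Immediate, 16-bit units. */
    }
    if ((header & NX_LEARN_DST_MASK) != NX_LEARN_DST_OUTPUT) {
        len += 4 + 2;                       /* ovs_be32 field, ovs_be16 ofs. */
    }
    return len;
}
```

With these helpers, `spec_header(NX_LEARN_SRC_IMMEDIATE, NX_LEARN_DST_MATCH, 16)` reproduces the `20 10` header of the "Match on in_port=99" example, and `spec_len()` of that header is 10, matching the ten bytes shown in its hex dump.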
diff --git a/include/openflow/nicira-ext.h b/include/openflow/nicira-ext.h index 4fab6f1..08da126 100644 --- a/include/openflow/nicira-ext.h +++ b/include/openflow/nicira-ext.h @@ -281,7 +281,8 @@ enum nx_action_subtype { NXAST_BUNDLE, /* struct nx_action_bundle */ NXAST_BUNDLE_LOAD, /* struct nx_action_bundle */ NXAST_RESUBMIT_TABLE, /* struct nx_action_resubmit */ - NXAST_OUTPUT_REG /* struct nx_action_output_reg */ + NXAST_OUTPUT_REG, /* struct nx_action_output_reg */ + NXAST_LEARN /* struct nx_action_learn */ }; /* Header for Nicira-defined actions. */ @@ -656,6 +657,196 @@ enum nx_mp_algorithm { NX_MP_ALG_ITER_HASH /* Iterative Hash. */ }; +/* Action structure for NXAST_LEARN. + * + * This action adds or modifies a flow in an OpenFlow table, similar to + * OFPT_FLOW_MOD with OFPFC_MODIFY_STRICT as 'command'. The new flow has the + * specified idle timeout, hard timeout, priority, and flags. The new flow's + * match criteria and actions are built by applying each of the series of + * flow_mod_spec elements included as part of the action. + * + * A flow_mod_spec starts with a 16-bit header. A header that is all-bits-0 is + * a no-op used for padding the action as a whole to a multiple of 8 bytes in + * length. Otherwise, the flow_mod_spec can be thought of as copying 'n_bits' + * bits from a source to a destination. In this case, the header contains + * multiple fields: + * + * 15 14 13 12 11 10 0 + * +------+---+------+---------------------------------+ + * | 0 |src| dst | n_bits | + * +------+---+------+---------------------------------+ + * + * The meaning and format of a flow_mod_spec depends on 'src' and 'dst'. The + * following table summarizes the meaning of each possible combination. + * Details follow the table: + * + * src dst meaning + * --- --- ---------------------------------------------------------- + * 0 0 Add match criteria based on value in a field. + * 1 0 Add match criteria based on an immediate value. 
+ * 0 1 Add NXAST_REG_LOAD action to copy field into a different field. + * 1 1 Add NXAST_REG_LOAD action to load immediate value into a field. + * 0 2 Add OFPAT_OUTPUT action to output to port from specified field. + * All other combinations are undefined and not allowed. + * + * The flow_mod_spec header is followed by a source specification and a + * destination specification. The format and meaning of the source + * specification depends on 'src': + * + * - If 'src' is 0, the source bits are taken from a field in the flow to + * which this action is attached. (This should be a wildcarded field. If + * its value is fully specified then the source bits being copied have + * constant values.) + * + * The source specification is an ovs_be32 'field' and an ovs_be16 'ofs'. + * 'field' is an nxm_header with nxm_hasmask=0, and 'ofs' the starting bit + * offset within that field. The source bits are field[ofs:ofs+n_bits-1]. + * 'field' and 'ofs' are subject to the same restrictions as the source + * field in NXAST_REG_MOVE. + * + * - If 'src' is 1, the source bits are a constant value. The source + * specification is (n_bits+15)/16*2 bytes long. Taking those bytes as a + * number in network order, the source bits are the 'n_bits' + * least-significant bits. The switch will report an error if other bits + * in the constant are nonzero. + * + * The flow_mod_spec destination specification, for 'dst' of 0 or 1, is an + * ovs_be32 'field' and an ovs_be16 'ofs'. 'field' is an nxm_header with + * nxm_hasmask=0 and 'ofs' is a starting bit offset within that field. The + * meaning of the flow_mod_spec depends on 'dst': + * + * - If 'dst' is 0, the flow_mod_spec specifies match criteria for the new + * flow. The new flow matches only if bits field[ofs:ofs+n_bits-1] in a + * packet equal the source bits. 'field' may be any nxm_header with + * nxm_hasmask=0 that is allowed in NXT_FLOW_MOD. + * + * Order is significant. 
Earlier flow_mod_specs must satisfy any + * prerequisites for matching fields specified later, by copying constant + * values into prerequisite fields. + * + * The switch will reject flow_mod_specs that do not satisfy NXM masking + * restrictions. + * + * - If 'dst' is 1, the flow_mod_spec specifies an NXAST_REG_LOAD action for + * the new flow. The new flow copies the source bits into + * field[ofs:ofs+n_bits-1]. Actions are executed in the same order as the + * flow_mod_specs. + * + * The flow_mod_spec destination spec for 'dst' of 2 (when 'src' is 0) is + * empty. It has the following meaning: + * + * - The flow_mod_spec specifies an OFPAT_OUTPUT action for the new flow. + * The new flow outputs to the OpenFlow port specified by the source field. + * Of the special output ports with value OFPP_MAX or larger, OFPP_IN_PORT, + * OFPP_FLOOD, OFPP_LOCAL, and OFPP_ALL are supported. Other special ports + * may not be used. + * + * Resource Management + * ------------------- + * + * A switch has a finite amount of flow table space available for learning. + * When this space is exhausted, no new learning table entries will be learned + * until some existing flow table entries expire. The controller should be + * prepared to handle this by flooding (which can be implemented as a + * low-priority flow). + * + * Examples + * -------- + * + * The following examples give a prose description of the flow_mod_specs along + * with informal notation for how those would be represented and a hex dump of + * the bytes that would be required. + * + * These examples could work with various nx_action_learn parameters. Typical + * values would be idle_timeout=60, hard_timeout=0, + * priority=OFP_DEFAULT_PRIORITY, flags=0, table_id=10. + * + * 1. 
Learn input port based on the source MAC, with lookup into + * NXM_NX_REG1[16:31] by resubmit to in_port=99: + * + * Match on in_port=99: + * ovs_be16(src=1, dst=0, n_bits=16), 20 10 + * ovs_be16(99), 00 63 + * ovs_be32(NXM_OF_IN_PORT), ovs_be16(0) 00 00 00 02 00 00 + * + * Match Ethernet destination on Ethernet source from packet: + * ovs_be16(src=0, dst=0, n_bits=48), 00 30 + * ovs_be32(NXM_OF_ETH_SRC), ovs_be16(0) 00 00 04 06 00 00 + * ovs_be32(NXM_OF_ETH_DST), ovs_be16(0) 00 00 02 06 00 00 + * + * Set NXM_NX_REG1[16:31] to the packet's input port: + * ovs_be16(src=0, dst=1, n_bits=16), 08 10 + * ovs_be32(NXM_OF_IN_PORT), ovs_be16(0) 00 00 00 02 00 00 + * ovs_be32(NXM_NX_REG1), ovs_be16(16) 00 01 02 04 00 10 + * + * Given a packet that arrived on port A with Ethernet source address B, + * this would set up the flow "in_port=99, dl_dst=B, + * actions=load:A->NXM_NX_REG1[16..31]". + * + * In syntax accepted by ovs-ofctl, this action is: learn(in_port=99, + * NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[], load:NXM_OF_IN_PORT[]->NXM_NX_REG1[16..31]) + * + * 2. Output to input port based on the source MAC and VLAN VID, with lookup + * into NXM_NX_REG1[16:31]: + * + * Match on same VLAN ID as packet: + * ovs_be16(src=0, dst=0, n_bits=12), 00 0c + * ovs_be32(NXM_OF_VLAN_TCI), ovs_be16(0) 00 00 08 02 00 00 + * ovs_be32(NXM_OF_VLAN_TCI), ovs_be16(0) 00 00 08 02 00 00 + * + * Match Ethernet destination on Ethernet source from packet: + * ovs_be16(src=0, dst=0, n_bits=48), 00 30 + * ovs_be32(NXM_OF_ETH_SRC), ovs_be16(0) 00 00 04 06 00 00 + * ovs_be32(NXM_OF_ETH_DST), ovs_be16(0) 00 00 02 06 00 00 + * + * Output to the packet's input port: + * ovs_be16(src=0, dst=2, n_bits=16), 10 10 + * ovs_be32(NXM_OF_IN_PORT), ovs_be16(0) 00 00 00 02 00 00 + * + * Given a packet that arrived on port A with Ethernet source address B in + * VLAN C, this would set up the flow "dl_dst=B, vlan_vid=C, + * actions=output:A".
+ * + * In syntax accepted by ovs-ofctl, this action is: + * learn(NXM_OF_VLAN_TCI[0..11], NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[], + * output:NXM_OF_IN_PORT[]) + * + * Advice to Controller Implementors + * --------------------------------- + * + * For best performance, segregate learned flows into a table that is not used + * for any other flows except possibly for a lowest-priority "catch-all" flow + * (a flow with no match criteria). If different learning actions specify + * different match criteria, use different tables for the learned flows. + */ +struct nx_action_learn { + ovs_be16 type; /* OFPAT_VENDOR. */ + ovs_be16 len; /* At least 24. */ + ovs_be32 vendor; /* NX_VENDOR_ID. */ + ovs_be16 subtype; /* NXAST_LEARN. */ + ovs_be16 idle_timeout; /* Idle time before discarding (seconds). */ + ovs_be16 hard_timeout; /* Max time before discarding (seconds). */ + ovs_be16 priority; /* Priority level of flow entry. */ + ovs_be16 flags; /* Either 0 or OFPFF_SEND_FLOW_REM. */ + uint8_t table_id; /* Table to insert flow entry. */ + uint8_t pad[5]; /* Must be zero. */ + /* Followed by a sequence of flow_mod_spec elements, as described above, + * until the end of the action is reached. */ +}; +OFP_ASSERT(sizeof(struct nx_action_learn) == 24); + +#define NX_LEARN_N_BITS_MASK 0x3ff + +#define NX_LEARN_SRC_FIELD (0 << 13) /* Copy from field. */ +#define NX_LEARN_SRC_IMMEDIATE (1 << 13) /* Copy from immediate value. */ +#define NX_LEARN_SRC_MASK (1 << 13) + +#define NX_LEARN_DST_MATCH (0 << 11) /* Add match criterion. */ +#define NX_LEARN_DST_LOAD (1 << 11) /* Add NXAST_REG_LOAD action. */ +#define NX_LEARN_DST_OUTPUT (2 << 11) /* Add OFPAT_OUTPUT action. */ +#define NX_LEARN_DST_RESERVED (3 << 11) +#define NX_LEARN_DST_MASK (3 << 11) + /* Action structure for NXAST_AUTOPATH. 
* * This action performs the following steps in sequence: diff --git a/lib/automake.mk b/lib/automake.mk index bd7e095..b925991 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -67,6 +67,8 @@ lib_libopenvswitch_a_SOURCES = \ lib/lacp.h \ lib/leak-checker.c \ lib/leak-checker.h \ + lib/learn.c \ + lib/learn.h \ lib/learning-switch.c \ lib/learning-switch.h \ lib/list.c \ diff --git a/lib/learn.c b/lib/learn.c new file mode 100644 index 0000000..e49e2c2 --- /dev/null +++ b/lib/learn.c @@ -0,0 +1,585 @@ +/* + * Copyright (c) 2011 Nicira Networks. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +#include <config.h> + +#include "learn.h" + +#include "byte-order.h" +#include "dynamic-string.h" +#include "meta-flow.h" +#include "nx-match.h" +#include "ofp-util.h" +#include "ofpbuf.h" +#include "openflow/openflow.h" +#include "unaligned.h" + +static ovs_be16 +get_be16(const void **pp) +{ + const ovs_be16 *p = *pp; + ovs_be16 value = *p; + *pp = p + 1; + return value; +} + +static ovs_be32 +get_be32(const void **pp) +{ + const ovs_be32 *p = *pp; + ovs_be32 value = get_unaligned_be32(p); + *pp = p + 1; + return value; +} + +static uint64_t +get_bits(int n_bits, const void **p) +{ + int n_segs = DIV_ROUND_UP(n_bits, 16); + uint64_t value; + + value = 0; + while (n_segs-- > 0) { + value = (value << 16) | ntohs(get_be16(p)); + } + return value; +} + +static bool +is_all_zeros(const uint8_t *field, size_t length) +{ + size_t i; + + for (i = 0; i < length; i++) { + if (field[i] != 0x00) { + return false; + } + } + return true; +} + +int +learn_check(const struct nx_action_learn *learn, const struct flow *flow) +{ + struct cls_rule rule; + const void *p, *end; + + cls_rule_init_catchall(&rule, 0); + + if (learn->flags & ~htons(OFPFF_SEND_FLOW_REM) + || !is_all_zeros(learn->pad, sizeof learn->pad) + || learn->table_id == 0xff) { + return ofp_mkerr(OFPET_BAD_ACTION, OFPBAC_BAD_ARGUMENT); + } + + end = (char *) learn + ntohs(learn->len); + for (p = learn + 1; p != end; ) { + uint16_t header = ntohs(get_be16(&p)); + int n_bits = header & NX_LEARN_N_BITS_MASK; + int src_type = header & NX_LEARN_SRC_MASK; + int dst_type = header & NX_LEARN_DST_MASK; + + uint64_t value; + int min_len; + int error; + + if (!header) { + break; + } + + /* Check for valid src and dst type combination. */ + if (dst_type == NX_LEARN_DST_MATCH || + dst_type == NX_LEARN_DST_LOAD || + (dst_type == NX_LEARN_DST_OUTPUT && + src_type == NX_LEARN_SRC_FIELD)) { + /* OK. 
*/ + } else { + return ofp_mkerr(OFPET_BAD_ACTION, OFPBAC_BAD_ARGUMENT); + } + + /* Check that the arguments don't overrun the end of the action. */ + min_len = 0; + if (src_type == NX_LEARN_SRC_FIELD) { + min_len += sizeof(ovs_be32); /* src_field */ + min_len += sizeof(ovs_be16); /* src_ofs */ + } else { + min_len += 2 * DIV_ROUND_UP(n_bits, 16); + } + if (dst_type == NX_LEARN_DST_MATCH || + dst_type == NX_LEARN_DST_LOAD) { + min_len += sizeof(ovs_be32); /* dst_field */ + min_len += sizeof(ovs_be16); /* dst_ofs */ + } + if ((char *) end - (char *) p < min_len) { + return ofp_mkerr(OFPET_BAD_ACTION, OFPBAC_BAD_LEN); + } + + /* Check the source. */ + if (src_type == NX_LEARN_SRC_FIELD) { + ovs_be32 src_field = get_be32(&p); + int src_ofs = ntohs(get_be16(&p)); + + error = nxm_src_check(src_field, src_ofs, n_bits, flow); + if (error) { + return error; + } + value = 0; + } else { + value = get_bits(n_bits, &p); + } + + /* Check the destination. */ + if (dst_type == NX_LEARN_DST_MATCH || dst_type == NX_LEARN_DST_LOAD) { + ovs_be32 dst_field = get_be32(&p); + int dst_ofs = ntohs(get_be16(&p)); + int error; + + error = nxm_dst_check(dst_field, dst_ofs, n_bits, flow); + if (error) { + return error; + } + + if (dst_type == NX_LEARN_DST_MATCH + && src_type == NX_LEARN_SRC_IMMEDIATE) { + mf_set_subfield(nxm_field_to_mf_field(ntohl(dst_field)), value, + dst_ofs, n_bits, &rule); + } + } + } + if (!is_all_zeros(p, (char *) end - (char *) p)) { + return ofp_mkerr(OFPET_BAD_ACTION, OFPBAC_BAD_ARGUMENT); + } + + return 0; +} + +void +learn_execute(const struct nx_action_learn *learn, const struct flow *flow, + struct ofputil_flow_mod *fm) +{ + const void *p, *end; + struct ofpbuf actions; + + cls_rule_init_catchall(&fm->cr, ntohs(learn->priority)); + fm->cookie = htonll(0); /* XXX */ + fm->table_id = learn->table_id; + fm->command = OFPFC_MODIFY_STRICT; + fm->idle_timeout = ntohs(learn->idle_timeout); + fm->hard_timeout = ntohs(learn->hard_timeout); + fm->buffer_id = UINT32_MAX; +
fm->out_port = OFPP_NONE; + fm->flags = ntohs(learn->flags) & OFPFF_SEND_FLOW_REM; + fm->actions = NULL; + fm->n_actions = 0; + + ofpbuf_init(&actions, 64); + + for (p = learn + 1, end = (char *) learn + ntohs(learn->len); p != end; ) { + uint16_t header = ntohs(get_be16(&p)); + int n_bits = header & NX_LEARN_N_BITS_MASK; + int src_type = header & NX_LEARN_SRC_MASK; + int dst_type = header & NX_LEARN_DST_MASK; + uint64_t value; + + struct nx_action_reg_load *load; + ovs_be32 dst_field; + int dst_ofs; + + if (!header) { + break; + } + + if (src_type == NX_LEARN_SRC_FIELD) { + ovs_be32 src_field = get_be32(&p); + int src_ofs = ntohs(get_be16(&p)); + + value = nxm_read_field_bits(src_field, + nxm_encode_ofs_nbits(src_ofs, n_bits), + flow); + } else { + value = get_bits(n_bits, &p); + } + + switch (dst_type) { + case NX_LEARN_DST_MATCH: + dst_field = get_be32(&p); + dst_ofs = ntohs(get_be16(&p)); + mf_set_subfield(nxm_field_to_mf_field(ntohl(dst_field)), value, + dst_ofs, n_bits, &fm->cr); + break; + + case NX_LEARN_DST_LOAD: + dst_field = get_be32(&p); + dst_ofs = ntohs(get_be16(&p)); + load = ofputil_put_NXAST_REG_LOAD(&actions); + load->ofs_nbits = nxm_encode_ofs_nbits(dst_ofs, n_bits); + load->dst = dst_field; + load->value = htonll(value); + break; + + case NX_LEARN_DST_OUTPUT: + ofputil_put_OFPAT_OUTPUT(&actions)->port = htons(value); + break; + } + } + + fm->actions = ofpbuf_steal_data(&actions); + fm->n_actions = actions.size / sizeof(struct ofp_action_header); +} + +static void +put_be16(struct ofpbuf *b, ovs_be16 x) +{ + ofpbuf_put(b, &x, sizeof x); +} + +static void +put_be32(struct ofpbuf *b, ovs_be32 x) +{ + ofpbuf_put(b, &x, sizeof x); +} + +static void +put_u16(struct ofpbuf *b, uint16_t x) +{ + put_be16(b, htons(x)); +} + +static void +put_u32(struct ofpbuf *b, uint32_t x) +{ + put_be32(b, htonl(x)); +} + +void +learn_parse(struct ofpbuf *b, char *arg) +{ + char *orig = xstrdup(arg); + char *name, *value; + size_t learn_ofs; + size_t len; + + struct 
nx_action_learn *learn; + struct cls_rule rule; + + learn_ofs = b->size; + learn = ofputil_put_NXAST_LEARN(b); + learn->idle_timeout = htons(OFP_FLOW_PERMANENT); + learn->hard_timeout = htons(OFP_FLOW_PERMANENT); + learn->priority = htons(OFP_DEFAULT_PRIORITY); + learn->flags = htons(0); + learn->table_id = 1; + + cls_rule_init_catchall(&rule, 0); + while (ofputil_parse_key_value(&arg, &name, &value)) { + learn = ofpbuf_at_assert(b, learn_ofs, sizeof *learn); + if (!strcmp(name, "table")) { + learn->table_id = atoi(value); + if (learn->table_id == 255) { + ovs_fatal(0, "%s: table id 255 not valid for `learn' action", + orig); + } + } else if (!strcmp(name, "priority")) { + learn->priority = htons(atoi(value)); + } else if (!strcmp(name, "idle_timeout")) { + learn->idle_timeout = htons(atoi(value)); + } else if (!strcmp(name, "hard_timeout")) { + learn->hard_timeout = htons(atoi(value)); + } else if (mf_from_name(name)) { + const struct mf_field *mf = mf_from_name(name); + uint8_t value_bits[MF_MAXSIZE]; + uint8_t mask[MF_MAXSIZE]; + char *error; + + /* Parse it, check prerequisites, and set it in 'rule'. */ + if (!mf_are_prereqs_ok(mf, &rule.flow)) { + ovs_fatal(0, "%s: cannot specify field %s because " + "prerequisites are not satisfied", orig, name); + } + error = mf_parse(mf, value, value_bits, mask); + if (error) { + ovs_fatal(0, "%s", error); + } + mf_set(mf, value_bits, mask, &rule); + + /* Add it to the action. 
*/ + put_u16(b, (NX_LEARN_SRC_IMMEDIATE | NX_LEARN_DST_MATCH | + mf->n_bits)); + if (mf->n_bytes % 2) { + ofpbuf_put_zeros(b, 1); + } + ofpbuf_put(b, value_bits, mf->n_bytes); + put_u32(b, mf->nxm_header); + put_u16(b, 0); /* offset */ + } else if (strchr(name, '[')) { + uint32_t src_header, dst_header; + int src_ofs, dst_ofs; + int n_bits; + + if (nxm_parse_field_bits(name, &dst_header, &dst_ofs, + &n_bits)[0] != '\0') { + ovs_fatal(0, "%s: syntax error after NXM field name `%s'", + orig, name); + } + if (value[0] != '\0') { + int src_nbits; + + if (nxm_parse_field_bits(value, &src_header, &src_ofs, + &src_nbits)[0] != '\0') { + ovs_fatal(0, "%s: syntax error after NXM field name `%s'", + orig, value); + } + if (src_nbits != n_bits) { + ovs_fatal(0, "%s: bit widths of %s (%d) and %s (%d) " + "differ", + orig, name, n_bits, value, src_nbits); + } + } else { + src_header = dst_header; + src_ofs = dst_ofs; + } + + put_u16(b, NX_LEARN_SRC_FIELD | NX_LEARN_DST_MATCH | n_bits); + put_u32(b, src_header); + put_u16(b, src_ofs); + put_u32(b, dst_header); + put_u16(b, dst_ofs); + } else if (!strcmp(name, "load")) { + /* XXX check output prereqs?
*/ + if (value[strcspn(value, "[-")] == '-') { + struct nx_action_reg_load load; + int nbits; + int ofs; + int i; + + nxm_parse_reg_load(&load, value); + nbits = nxm_decode_n_bits(load.ofs_nbits); + ofs = nxm_decode_ofs(load.ofs_nbits); + put_u16(b, NX_LEARN_SRC_IMMEDIATE | NX_LEARN_DST_LOAD | nbits); + for (i = DIV_ROUND_UP(nbits, 16); i-- > 0; ) { + put_u16(b, ntohll(load.value) >> (i * 16)); + } + put_be32(b, load.dst); + put_u16(b, ofs); + } else { + struct nx_action_reg_move move; + + nxm_parse_reg_move(&move, value); + put_u16(b, (NX_LEARN_SRC_FIELD | NX_LEARN_DST_LOAD + | ntohs(move.n_bits))); + put_be32(b, move.src); + put_be16(b, move.src_ofs); + put_be32(b, move.dst); + put_be16(b, move.dst_ofs); + } + } else if (!strcmp(name, "output")) { + uint32_t header; + int ofs, n_bits; + + if (nxm_parse_field_bits(value, &header, &ofs, &n_bits)[0] + != '\0') { + ovs_fatal(0, "%s: syntax error after NXM field name `%s'", + orig, value); + } + + put_u16(b, NX_LEARN_SRC_FIELD | NX_LEARN_DST_OUTPUT | n_bits); + put_u32(b, header); + put_u16(b, ofs); + } else { + ovs_fatal(0, "%s: unknown keyword %s", orig, name); + } + } + free(orig); + + put_u16(b, 0); + + len = b->size - learn_ofs; + if (len % 8) { + ofpbuf_put_zeros(b, 8 - len % 8); + } + + learn = ofpbuf_at_assert(b, learn_ofs, sizeof *learn); + learn->len = htons(b->size - learn_ofs); +} + +void +learn_format(const struct nx_action_learn *learn, struct ds *s) +{ + struct cls_rule rule; + const void *p, *end; + + cls_rule_init_catchall(&rule, 0); + + ds_put_format(s, "learn(table=%"PRIu8, learn->table_id); + if (learn->idle_timeout != htons(OFP_FLOW_PERMANENT)) { + ds_put_format(s, ",idle_timeout=%"PRIu16, ntohs(learn->idle_timeout)); + } + if (learn->hard_timeout != htons(OFP_FLOW_PERMANENT)) { + ds_put_format(s, ",hard_timeout=%"PRIu16, ntohs(learn->hard_timeout)); + } + if (learn->priority != htons(OFP_DEFAULT_PRIORITY)) { + ds_put_format(s, ",priority=%"PRIu16, ntohs(learn->priority)); + } + if (learn->flags &
htons(OFPFF_SEND_FLOW_REM)) { + ds_put_cstr(s, ",OFPFF_SEND_FLOW_REM"); + } + if (learn->flags & htons(~OFPFF_SEND_FLOW_REM)) { + ds_put_format(s, ",***flags=%"PRIu16"***", + ntohs(learn->flags) & ~OFPFF_SEND_FLOW_REM); + } + if (!is_all_zeros(learn->pad, sizeof learn->pad)) { + ds_put_cstr(s, ",***nonzero pad***"); + } + + end = (char *) learn + ntohs(learn->len); + for (p = learn + 1; p != end; ) { + uint16_t header = ntohs(get_be16(&p)); + int n_bits = header & NX_LEARN_N_BITS_MASK; + + int src_type = header & NX_LEARN_SRC_MASK; + uint32_t src_header; + int src_ofs; + const uint8_t *src_value; + int src_value_bytes; + const struct mf_field *src_field; + + int dst_type = header & NX_LEARN_DST_MASK; + uint32_t dst_header; + int dst_ofs; + const struct mf_field *dst_field; + + int min_len; + int i; + + if (!header) { + break; + } + + /* Check for valid src and dst type combination. */ + if (dst_type == NX_LEARN_DST_MATCH || + dst_type == NX_LEARN_DST_LOAD || + (dst_type == NX_LEARN_DST_OUTPUT && + src_type == NX_LEARN_SRC_FIELD)) { + /* OK. */ + } else { + ds_put_format(s, ",***bad flow_mod_spec header %"PRIx16"***)", + header); + return; + } + + /* Check that the arguments don't overrun the end of the action. */ + min_len = 0; + if (src_type == NX_LEARN_SRC_FIELD) { + min_len += sizeof(ovs_be32); /* src_header */ + min_len += sizeof(ovs_be16); /* src_ofs */ + } else { + min_len += 2 * DIV_ROUND_UP(n_bits, 16); + } + if (dst_type == NX_LEARN_DST_MATCH || + dst_type == NX_LEARN_DST_LOAD) { + min_len += sizeof(ovs_be32); /* dst_header */ + min_len += sizeof(ovs_be16); /* dst_ofs */ + } + if ((char *) end - (char *) p < min_len) { + ds_put_format(s, ",***flow_mod_spec at offset %td is %d bytes " + "long but only %td bytes are left***)", + (char *) p - (char *) (learn + 1) - 2, min_len + 2, + (char *) end - (char *) p + 2); + return; + } + + /* Get the source. 
*/ + if (src_type == NX_LEARN_SRC_FIELD) { + src_header = ntohl(get_be32(&p)); + src_field = nxm_field_to_mf_field(src_header); + src_ofs = ntohs(get_be16(&p)); + src_value_bytes = 0; + src_value = NULL; + } else { + src_header = 0; + src_field = NULL; + src_ofs = 0; + src_value_bytes = 2 * DIV_ROUND_UP(n_bits, 16); + src_value = p; + p = (const void *) ((const uint8_t *) p + src_value_bytes); + } + + /* Get the destination. */ + if (dst_type == NX_LEARN_DST_MATCH || dst_type == NX_LEARN_DST_LOAD) { + dst_header = ntohl(get_be32(&p)); + dst_field = nxm_field_to_mf_field(dst_header); + dst_ofs = ntohs(get_be16(&p)); + } else { + dst_header = 0; + dst_field = NULL; + dst_ofs = 0; + } + + ds_put_char(s, ','); + + switch (src_type | dst_type) { + case NX_LEARN_SRC_IMMEDIATE | NX_LEARN_DST_MATCH: + if (dst_field && dst_ofs == 0 && n_bits == dst_field->n_bits) { + uint8_t value[MF_MAXSIZE]; + + memset(value, 0, sizeof value); + memcpy(value, &src_value[src_value_bytes - dst_field->n_bytes], + dst_field->n_bytes); + ds_put_format(s, "%s=", dst_field->name); + mf_format(dst_field, value, NULL, s); + } else { + nxm_format_field_bits(s, dst_header, dst_ofs, n_bits); + ds_put_cstr(s, "=0x"); + for (i = 0; i < src_value_bytes; i++) { + ds_put_format(s, "%02"PRIx8, src_value[i]); + } + } + break; + + case NX_LEARN_SRC_FIELD | NX_LEARN_DST_MATCH: + nxm_format_field_bits(s, dst_header, dst_ofs, n_bits); + if (src_header != dst_header || src_ofs != dst_ofs) { + ds_put_char(s, '='); + nxm_format_field_bits(s, src_header, src_ofs, n_bits); + } + break; + + case NX_LEARN_SRC_IMMEDIATE | NX_LEARN_DST_LOAD: + ds_put_cstr(s, "load:0x"); + for (i = 0; i < src_value_bytes; i++) { + ds_put_format(s, "%02"PRIx8, src_value[i]); + } + ds_put_cstr(s, "->"); + nxm_format_field_bits(s, dst_header, dst_ofs, n_bits); + break; + + case NX_LEARN_SRC_FIELD | NX_LEARN_DST_LOAD: + nxm_format_field_bits(s, src_header, src_ofs, n_bits); + ds_put_cstr(s, "->"); + nxm_format_field_bits(s, dst_header, dst_ofs,
n_bits); + break; + + case NX_LEARN_SRC_FIELD | NX_LEARN_DST_OUTPUT: + ds_put_cstr(s, "output:"); + nxm_format_field_bits(s, src_header, src_ofs, n_bits); + break; + } + } + if (!is_all_zeros(p, (char *) end - (char *) p)) { + ds_put_cstr(s, ",***nonzero trailer***"); + } + ds_put_char(s, ')'); +} diff --git a/lib/learn.h b/lib/learn.h new file mode 100644 index 0000000..85f275f --- /dev/null +++ b/lib/learn.h @@ -0,0 +1,40 @@ +/* + * Copyright (c) 2011 Nicira Networks. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef LEARN_H +#define LEARN_H 1 + +#include <stdint.h> + +struct ds; +struct flow; +struct ofpbuf; +struct ofputil_flow_mod; +struct nx_action_learn; + +/* NXAST_LEARN helper functions. + * + * See include/openflow/nicira-ext.h for NXAST_LEARN specification. 
+ */ + +int learn_check(const struct nx_action_learn *, const struct flow *); +void learn_execute(const struct nx_action_learn *, const struct flow *, + struct ofputil_flow_mod *); + +void learn_parse(struct ofpbuf *, char *); +void learn_format(const struct nx_action_learn *, struct ds *); + +#endif /* learn.h */ diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c index 115fd48..964a387 100644 --- a/lib/ofp-parse.c +++ b/lib/ofp-parse.c @@ -26,6 +26,7 @@ #include "bundle.h" #include "byte-order.h" #include "dynamic-string.h" +#include "learn.h" #include "meta-flow.h" #include "netdev.h" #include "multipath.h" @@ -332,6 +333,10 @@ parse_named_action(enum ofputil_action_code code, struct ofpbuf *b, char *arg) case OFPUTIL_NXAST_RESUBMIT_TABLE: case OFPUTIL_NXAST_OUTPUT_REG: NOT_REACHED(); + + case OFPUTIL_NXAST_LEARN: + learn_parse(b, arg); + break; } } diff --git a/lib/ofp-print.c b/lib/ofp-print.c index 3b9c582..2311092 100644 --- a/lib/ofp-print.c +++ b/lib/ofp-print.c @@ -31,6 +31,7 @@ #include "compiler.h" #include "dynamic-string.h" #include "flow.h" +#include "learn.h" #include "multipath.h" #include "nx-match.h" #include "ofp-util.h" @@ -333,6 +334,10 @@ ofp_print_action(struct ds *s, const union ofp_action *a, nxm_decode_n_bits(naor->ofs_nbits)); break; + case OFPUTIL_NXAST_LEARN: + learn_format((const struct nx_action_learn *) a, s); + break; + default: break; } diff --git a/lib/ofp-util.c b/lib/ofp-util.c index a97a0e3..6705ae4 100644 --- a/lib/ofp-util.c +++ b/lib/ofp-util.c @@ -16,6 +16,7 @@ #include <config.h> #include "ofp-print.h" +#include <ctype.h> #include <errno.h> #include <inttypes.h> #include <netinet/icmp6.h> @@ -25,6 +26,7 @@ #include "byte-order.h" #include "classifier.h" #include "dynamic-string.h" +#include "learn.h" #include "multipath.h" #include "nx-match.h" #include "ofp-errors.h" @@ -2132,6 +2134,10 @@ validate_actions(const union ofp_action *actions, size_t n_actions, (const struct nx_action_resubmit *) a); break; + case 
OFPUTIL_NXAST_LEARN: + error = learn_check((const struct nx_action_learn *) a, flow); + break; + case OFPUTIL_OFPAT_STRIP_VLAN: case OFPUTIL_OFPAT_SET_NW_SRC: case OFPUTIL_OFPAT_SET_NW_DST: diff --git a/lib/ofp-util.def b/lib/ofp-util.def index c5d883d..7868faa 100644 --- a/lib/ofp-util.def +++ b/lib/ofp-util.def @@ -34,4 +34,5 @@ NXAST_ACTION(NXAST_BUNDLE, nx_action_bundle, 1, "bundle") NXAST_ACTION(NXAST_BUNDLE_LOAD, nx_action_bundle, 1, "bundle_load") NXAST_ACTION(NXAST_RESUBMIT_TABLE, nx_action_resubmit, 0, NULL) NXAST_ACTION(NXAST_OUTPUT_REG, nx_action_output_reg, 0, NULL) +NXAST_ACTION(NXAST_LEARN, nx_action_learn, 1, "learn") #undef NXAST_ACTION diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index 3fd95ea..16d3ad2 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -32,6 +32,7 @@ #include "fail-open.h" #include "hmapx.h" #include "lacp.h" +#include "learn.h" #include "mac-learning.h" #include "multipath.h" #include "netdev.h" @@ -3146,6 +3147,25 @@ slave_enabled_cb(uint16_t ofp_port, void *ofproto_) } static void +xlate_learn_action(struct action_xlate_ctx *ctx, + const struct nx_action_learn *learn) +{ + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); + struct ofputil_flow_mod fm; + int error; + + learn_execute(learn, &ctx->flow, &fm); + error = ofproto_flow_mod(&ctx->ofproto->up, &fm); + if (error && !VLOG_DROP_WARN(&rl)) { + char *msg = ofputil_error_to_string(error); + VLOG_WARN("learning action failed to modify flow table (%s)", msg); + free(msg); + } + + free(fm.actions); +} + +static void do_xlate_actions(const union ofp_action *in, size_t n_in, struct action_xlate_ctx *ctx) { @@ -3302,6 +3322,12 @@ do_xlate_actions(const union ofp_action *in, size_t n_in, naor = (const struct nx_action_output_reg *) ia; xlate_output_reg_action(ctx, naor); break; + + case OFPUTIL_NXAST_LEARN: + if (ctx->packet) { + xlate_learn_action(ctx, (const struct nx_action_learn *) ia); + } + break; } } } diff --git 
a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h index 9c53657..0b68d34 100644 --- a/ofproto/ofproto-provider.h +++ b/ofproto/ofproto-provider.h @@ -25,6 +25,8 @@ #include "shash.h" #include "timeval.h" +struct ofputil_flow_mod; + /* An OpenFlow switch. * * With few exceptions, ofproto implementations may look at these fields but @@ -940,6 +942,18 @@ extern const struct ofproto_class ofproto_dpif_class; int ofproto_class_register(const struct ofproto_class *); int ofproto_class_unregister(const struct ofproto_class *); +/* ofproto_flow_mod() returns this value if the flow_mod could not be processed + * because it overlaps with an ongoing flow table operation that has not yet + * completed. The caller should retry the operation later. + * + * ofproto.c also uses this value internally for additional (similar) purposes. + * + * This particular value is a good choice because it is negative (so it won't + * collide with any errno value or any value returned by ofp_mkerr()) and large + * (so it won't accidentally collide with EOF or a negative errno value). */ +enum { OFPROTO_POSTPONE = -100000 }; + +int ofproto_flow_mod(struct ofproto *, const struct ofputil_flow_mod *); void ofproto_add_flow(struct ofproto *, const struct cls_rule *, const union ofp_action *, size_t n_actions); bool ofproto_delete_flow(struct ofproto *, const struct cls_rule *); diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c index d35e1d8..b491b91 100644 --- a/ofproto/ofproto.c +++ b/ofproto/ofproto.c @@ -139,16 +139,10 @@ static int add_flow(struct ofproto *, struct ofconn *, const struct ofputil_flow_mod *, const struct ofp_header *); -/* This return value tells handle_openflow() that processing of the current - * OpenFlow message must be postponed until some ongoing operations have - * completed. 
- * - * This particular value is a good choice because it is negative (so it won't - * collide with any errno value or any value returned by ofp_mkerr()) and large - * (so it won't accidentally collide with EOF or a negative errno value). */ -enum { OFPROTO_POSTPONE = -100000 }; - static bool handle_openflow(struct ofconn *, struct ofpbuf *); +static int handle_flow_mod__(struct ofproto *, struct ofconn *, + const struct ofputil_flow_mod *, + const struct ofp_header *); static void update_port(struct ofproto *, const char *devname); static int init_ports(struct ofproto *); @@ -1062,6 +1056,18 @@ ofproto_add_flow(struct ofproto *ofproto, const struct cls_rule *cls_rule, } } +/* Executes the flow modification specified in 'fm'. Returns 0 on success, an + * OpenFlow error code as encoded by ofp_mkerr() on failure, or + * OFPROTO_POSTPONE if the operation cannot be initiated now but may be retried + * later. + * + * This is a helper function for in-band control and fail-open. */ +int +ofproto_flow_mod(struct ofproto *ofproto, const struct ofputil_flow_mod *fm) +{ + return handle_flow_mod__(ofproto, NULL, fm, NULL); +} + /* Searches for a rule with matching criteria exactly equal to 'target' in * ofproto's table 0 and, if it finds one, deletes it. * @@ -2171,8 +2177,9 @@ is_flow_deletion_pending(const struct ofproto *ofproto, * in which no matching flow already exists in the flow table. * * Adds the flow specified by 'ofm', which is followed by 'n_actions' - * ofp_actions, to the ofproto's flow table. Returns 0 on success or an - * OpenFlow error code as encoded by ofp_mkerr() on failure. + * ofp_actions, to the ofproto's flow table. Returns 0 on success, an OpenFlow + * error code as encoded by ofp_mkerr() on failure, or OFPROTO_POSTPONE if the + * operation cannot be initiated now but may be retried later. * * 'ofconn' is used to retrieve the packet buffer specified in ofm->buffer_id, * if any. 
*/ @@ -2451,7 +2458,6 @@ ofproto_rule_expire(struct rule *rule, uint8_t reason) static int handle_flow_mod(struct ofconn *ofconn, const struct ofp_header *oh) { - struct ofproto *ofproto = ofconn_get_ofproto(ofconn); struct ofputil_flow_mod fm; int error; @@ -2460,11 +2466,6 @@ handle_flow_mod(struct ofconn *ofconn, const struct ofp_header *oh) return error; } - if (ofproto->n_pending >= 50) { - assert(!list_is_empty(&ofproto->pending)); - return OFPROTO_POSTPONE; - } - error = ofputil_decode_flow_mod(&fm, oh, ofconn_get_flow_mod_table_id(ofconn)); if (error) { @@ -2479,24 +2480,37 @@ handle_flow_mod(struct ofconn *ofconn, const struct ofp_header *oh) return ofp_mkerr(OFPET_FLOW_MOD_FAILED, OFPFMFC_ALL_TABLES_FULL); } - switch (fm.command) { + return handle_flow_mod__(ofconn_get_ofproto(ofconn), ofconn, &fm, oh); +} + +static int +handle_flow_mod__(struct ofproto *ofproto, struct ofconn *ofconn, + const struct ofputil_flow_mod *fm, + const struct ofp_header *oh) +{ + if (ofproto->n_pending >= 50) { + assert(!list_is_empty(&ofproto->pending)); + return OFPROTO_POSTPONE; + } + + switch (fm->command) { case OFPFC_ADD: - return add_flow(ofproto, ofconn, &fm, oh); + return add_flow(ofproto, ofconn, fm, oh); case OFPFC_MODIFY: - return modify_flows_loose(ofproto, ofconn, &fm, oh); + return modify_flows_loose(ofproto, ofconn, fm, oh); case OFPFC_MODIFY_STRICT: - return modify_flow_strict(ofproto, ofconn, &fm, oh); + return modify_flow_strict(ofproto, ofconn, fm, oh); case OFPFC_DELETE: - return delete_flows_loose(ofproto, ofconn, &fm, oh); + return delete_flows_loose(ofproto, ofconn, fm, oh); case OFPFC_DELETE_STRICT: - return delete_flow_strict(ofproto, ofconn, &fm, oh); + return delete_flow_strict(ofproto, ofconn, fm, oh); default: - if (fm.command > 0xff) { + if (fm->command > 0xff) { VLOG_WARN_RL(&rl, "flow_mod has explicit table_id but " "flow_mod_table_id extension is not enabled"); } diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in index 
7ce1ea9..6f74e10 100644 --- a/utilities/ovs-ofctl.8.in +++ b/utilities/ovs-ofctl.8.in @@ -775,6 +775,63 @@ using the Highest Random Weight algorithm, and writes the selection to \fBNXM_NX_REG0[]\fR. .IP Refer to \fBnicira\-ext.h\fR for more details. +. +.IP "\fBlearn(\fIargument\fR[\fB,\fIargument\fR]...\fB)\fR" +This action adds or modifies a flow in an OpenFlow table, similar to +\fBovs\-ofctl \-\-strict mod\-flows\fR. The arguments specify the +flow's match fields, actions, and other properties, as follows. At +least one match criterion and one action argument should ordinarily be +specified. +.RS +.IP \fBidle_timeout=\fIseconds\fR +.IQ \fBhard_timeout=\fIseconds\fR +.IQ \fBpriority=\fIvalue\fR +.IQ \fBtable=\fInumber\fR +These key-value pairs have the same meaning as in the usual +\fBovs\-ofctl\fR flow syntax. +. +.IP \fIfield\fB=\fIvalue\fR +.IQ \fIfield\fB[\fIstart\fB..\fIend\fB]=\fIsrc\fB[\fIstart\fB..\fIend\fB]\fR +.IQ \fIfield\fB[\fIstart\fB..\fIend\fB]\fR +Adds a match criterion to the new flow. +.IP +The first form specifies that \fIfield\fR must match the literal +\fIvalue\fR, e.g. \fBdl_type=0x0800\fR. All of the fields and values +for \fBovs\-ofctl\fR flow syntax are available with their usual +meanings. +.IP +The second form specifies that \fIfield\fB[\fIstart\fB..\fIend\fB]\fR +in the new flow must match \fIsrc\fB[\fIstart\fB..\fIend\fB]\fR taken +from the flow currently being processed. +.IP +The third form is a shorthand for the second form. It specifies that +\fIfield\fB[\fIstart\fB..\fIend\fB]\fR in the new flow must match +\fIfield\fB[\fIstart\fB..\fIend\fB]\fR taken from the flow currently +being processed. +. +.IP \fBload:\fIvalue\fB\->\fIdst\fB[\fIstart\fB..\fIend\fB] +.IQ \fBload:\fIsrc\fB[\fIstart\fB..\fIend\fB]\->\fIdst\fB[\fIstart\fB..\fIend\fB] +. +Adds a \fBload\fR action to the new flow. +.IP +The first form loads the literal \fIvalue\fR into bits \fIstart\fR +through \fIend\fR, inclusive, in field \fIdst\fR. 
Its syntax is the +same as the \fBload\fR action described earlier in this section. +.IP +The second form loads \fIsrc\fB[\fIstart\fB..\fIend\fB]\fR, a value +from the flow currently being processed, into bits \fIstart\fR +through \fIend\fR, inclusive, in field \fIdst\fR. +. +.IP \fBoutput:\fIfield\fB[\fIstart\fB..\fIend\fB]\fR +Add an \fBoutput\fR action to the new flow's actions, that outputs to +the OpenFlow port taken from \fIfield\fB[\fIstart\fB..\fIend\fB]\fR, +which must be an NXM field as described above. +.RE +.IP +For best performance, segregate learned flows into a table (using +\fBtable=\fInumber\fR) that is not used for any other flows except +possibly for a lowest-priority ``catch-all'' flow (a flow with no +match criteria). .RE . .PP -- 1.7.4.4
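Note on the return-value convention: the patch documents ofproto_flow_mod() as returning 0 on success, an OpenFlow error code on failure, or OFPROTO_POSTPONE when the operation may be retried later. The sketch below shows one way a caller might tell those three cases apart. Only OFPROTO_POSTPONE and the n_pending >= 50 threshold come from the patch; the enum flow_mod_outcome, classify_flow_mod_result(), and stub_flow_mod() are hypothetical names invented for illustration and are not part of Open vSwitch.

```c
#include <assert.h>

/* Same value as in the patch: negative and large in magnitude, so it cannot
 * be confused with 0, an errno value, or an ofp_mkerr() code. */
enum { OFPROTO_POSTPONE = -100000 };

/* Hypothetical classification of a flow_mod result (not in the patch). */
enum flow_mod_outcome {
    FLOW_MOD_OK,            /* Flow table was updated. */
    FLOW_MOD_RETRY_LATER,   /* Too many pending operations; try again. */
    FLOW_MOD_FAILED         /* Any other nonzero value is an error. */
};

/* Maps a return value that follows the documented convention onto one of
 * the three outcomes above. */
static enum flow_mod_outcome
classify_flow_mod_result(int error)
{
    if (!error) {
        return FLOW_MOD_OK;
    } else if (error == OFPROTO_POSTPONE) {
        return FLOW_MOD_RETRY_LATER;
    } else {
        return FLOW_MOD_FAILED;
    }
}

/* Hypothetical stand-in for the pending-operation check at the top of
 * handle_flow_mod__(): postpone once 50 operations are outstanding. */
static int
stub_flow_mod(int n_pending)
{
    assert(n_pending >= 0);
    return n_pending >= 50 ? OFPROTO_POSTPONE : 0;
}
```

As posted, xlate_learn_action() logs every nonzero result as a failure, so under this convention a postponed flow_mod is reported the same way as a genuine error. A caller that wanted to retry would need to test for OFPROTO_POSTPONE before logging.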