On Sun, Dec 15, 2013 at 05:57:57PM -0800, Ethan Jackson wrote: > This patch moves flow installation and eviction from ofproto-dpif and > the main thread, into ofproto-dpif-upcall. This performs > significantly better (approximately 2x TCP_CRR improvement), and > allows ovs-vswitchd to maintain significantly larger datapath flow > tables. On top of that, it significantly simplifies the code, > retiring "strut facet" and friends.
"struct" > Signed-off-by: Ethan Jackson <et...@nicira.com> This is good work! I especially like how this is a net removal of code. I would mention in NEWS that flow-eviction-threshold has gone away. You might want to mention that there is a new flow-limit parameter (but if you do please also recommend reading the documentation since the meaning is different). ofproto-dpif-upcall.h now has two prototypes for udpif_get_memory_usage(). The definition of FLOW_MISS_MAX_BATCH could be moved into ofproto-dpif-upcall.c. I think that the remaining comment in ofproto-dpif-upcall.h is wrong: I don't think it hands anything up to the main module anymore. I wasn't sure what this comment on rule_dpif and group_dpif meant, maybe you could rephrase it: * - Do include packets and bytes from datapath flows which have not been * pushed by ofproto-dpif-upcall. */ Eww: + /* We know stats are relatively fresh since a dump just happened, so Why does udpif_set_threads() destroy each revalidator thread's ukeys? Can't the thread itself do that before it exits?xs This might be premature optimization, but did you consider putting 'key_buf' at the end of struct udpif_key so that all of the frequently used data is in the first cache line? Ditto for 'key_buf' and 'mask_buf' in struct udpif_flow_dump. I think that the 'op' member of struct udpif_flow_dump is only used inside revalidate_udumps(). It adds over 40 bytes to each instance. Maybe it doesn't matter because there's a limited number of instances in flight at a given time? But otherwise we could easily allocate them on the stack in revalidate_udumps(). 'key' in udpif_key, and 'key' and 'mask' in udpif_flow_dump, seem to be unneeded: they always point to the beginning of the corresponding _buf. Maybe they are convenient enough to keep, though. udpif_get_n_flows() has a mysteriously 'static' local variable. In handle_upcalls(), I like to do this sort of comment, where we're actually saying something about the alternative case: if (miss->upcall_type == DPIF_UC_MISS) { /* For non-miss upcalls, there's a flow in the datapath which this * packet was accounted to. Presumably the revalidators will deal * with pushing its stats eventually. */ xin.resubmit_stats = &miss->stats; } as: if (miss->upcall_type == DPIF_UC_MISS) { xin.resubmit_stats = &miss->stats; } else { /* There's a flow in the datapath which this packet was accounted * to. The revalidators will push its stats eventually. */ } but I'll understand if you think that's weird. A struct odputil_keybuf is kind of big because of the odputil_keybuf member, but revalidate_udumps() uses xzalloc() to allocate one. Can we use xmalloc() instead and then just zero the members we must? Could you document n-revalidator-threads in vswitch.xml? I have a few more stylistic and other nonessential comments. I shall present them in the form of an interpretive dance expressing my joy at the deletion of code. I mean, I shall present them as an incremental patch. diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index a74015f..a7f6dc3 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -31,6 +31,7 @@ #include "ofpbuf.h" #include "ofproto-dpif-ipfix.h" #include "ofproto-dpif-sflow.h" +#include "ofproto-dpif-xlate.h" #include "packets.h" #include "poll-loop.h" #include "seq.h" @@ -38,6 +39,7 @@ #include "vlog.h" #define MAX_QUEUE_LENGTH 512 +#define FLOW_MISS_MAX_BATCH 50 #define REVALIDATE_MAX_BATCH 50 VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall); @@ -45,8 +47,8 @@ VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall); COVERAGE_DEFINE(upcall_queue_overflow); /* A thread that processes each upcall handed to it by the dispatcher thread, - * forwards the upcall's packet, and then queues it to the main ofproto_dpif - * to possibly set up a kernel flow as a cache. */ + * forwards the upcall's packet, and possibly sets up a kernel flow as a + * cache. */ struct handler { struct udpif *udpif; /* Parent udpif. */ pthread_t thread; /* Thread ID. */ @@ -64,6 +66,9 @@ struct handler { 'mutex'. */ }; +/* A thread that processes each kernel flow handed to it by the flow_dumper + * thread, updates OpenFlow statistics, and updates or removes the kernel flow + * as necessary. */ struct revalidator { struct udpif *udpif; /* Parent udpif. */ char *name; /* Thread name. */ @@ -81,11 +86,14 @@ struct revalidator { /* An upcall handler for ofproto_dpif. * - * udpif is implemented as a "dispatcher" thread that reads upcalls from the - * kernel. It processes each upcall just enough to figure out its next - * destination. For a "miss" upcall (MISS_UPCALL), this is one of several - * "handler" threads (see struct handler). Other upcalls are queued to the - * main ofproto_dpif. */ + * udpif has two logically separate pieces: + * + * - A "dispatcher" thread that reads upcalls from the kernel and dispatches + * them to one of several "handler" threads (see struct handler). + * + * - A "flow_dumper" thread that reads the kernel flow table and dispatches + * flows to one of several "revalidator" threads (see struct + * revalidator). */ struct udpif { struct list list_node; /* In all_udpifs list. */ @@ -432,7 +440,7 @@ udpif_get_memory_usage(struct udpif *udpif, struct simap *usage) ovs_mutex_lock(&revalidator->mutex); simap_increase(usage, "revalidator dumps", revalidator->n_udumps); - /* XXX: This isn't technically thread safe becuase the revalidator + /* XXX: This isn't technically thread safe because the revalidator * ukeys maps isn't protected by a mutex since it's per thread. */ simap_increase(usage, "revalidator keys", hmap_count(&revalidator->ukeys)); @@ -891,7 +899,7 @@ handle_upcalls(struct handler *handler, struct list *upcalls) atomic_read(&udpif->flow_limit, &flow_limit); may_put = udpif_get_n_flows(udpif) < flow_limit; - /* Extract the flow from each upcall. Construct in misses a hash table + /* Extract the flow from each upcall. Construct in 'misses' a hash table * that maps each unique flow to a 'struct flow_miss'. * * Most commonly there is a single packet per flow_miss, but there are @@ -1042,9 +1050,9 @@ handle_upcalls(struct handler *handler, struct list *upcalls) xin.may_learn = true; if (miss->upcall_type == DPIF_UC_MISS) { - /* For non miss upcalls, there's a flow in the datapath which this + /* For non-miss upcalls, there's a flow in the datapath which this * packet was accounted to. Presumably the revalidators will deal - * with pushing it's stats eventually. */ + * with pushing its stats eventually. */ xin.resubmit_stats = &miss->stats; } @@ -1284,8 +1292,8 @@ revalidate_ukey(struct udpif *udpif, struct udpif_flow_dump *udump, /* Since the kernel is free to ignore wildcarded bits in the mask, we can't * directly check that the masks are the same. Instead we check that the - * mask in the kernel is more specific i.e. less wildcarded, then what - * we've calculated here. This guarnatees we don't catch any packets we + * mask in the kernel is more specific i.e. less wildcarded, than what + * we've calculated here. This guarantees we don't catch any packets we * shouldn't with the megaflow. */ udump32 = (uint32_t *) &udump_mask; xout32 = (uint32_t *) &xout.wc.masks; @@ -1297,12 +1305,8 @@ revalidate_ukey(struct udpif *udpif, struct udpif_flow_dump *udump, ok = true; exit: - if (actions) { - ofpbuf_delete(actions); - } - if (xoutp) { - xlate_out_uninit(xoutp); - } + ofpbuf_delete(actions); + xlate_out_uninit(xoutp); return ok; } diff --git a/ofproto/ofproto-dpif-upcall.h b/ofproto/ofproto-dpif-upcall.h index 9f445c2..85d4a28 100644 --- a/ofproto/ofproto-dpif-upcall.h +++ b/ofproto/ofproto-dpif-upcall.h @@ -15,23 +15,14 @@ #ifndef OFPROTO_DPIF_UPCALL_H #define OFPROTO_DPIF_UPCALL_H -#define FLOW_MISS_MAX_BATCH 50 +#include <stddef.h> -#include "dpif.h" -#include "flow.h" -#include "hmap.h" -#include "list.h" -#include "odp-util.h" -#include "ofpbuf.h" -#include "ofproto-dpif-xlate.h" - -struct seq; struct dpif; struct dpif_backer; +struct seq; +struct simap; -/* udif is responsible for retrieving upcalls from the kernel, processing miss - * upcalls, and handing more complex ones up to the main ofproto-dpif - * module. */ +/* udif retrieves upcalls from the kernel and processing them. */ struct udpif *udpif_create(struct dpif_backer *, struct dpif *); void udpif_set_threads(struct udpif *, size_t n_handlers, @@ -44,8 +35,6 @@ void udpif_get_memory_usage(struct udpif *, struct simap *usage); struct seq *udpif_dump_seq(struct udpif *); -void udpif_get_memory_usage(struct udpif *, struct simap *usage); - void udpif_flush(void); #endif /* ofproto-dpif-upcall.h */ diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index bf2d45d..f5b7428 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -213,7 +213,7 @@ struct dpif_completion { struct ofoperation *op; }; -/* Reasons that we might need to revalidate every datapath flow , and +/* Reasons that we might need to revalidate every datapath flow, and * corresponding coverage counters. * * A value of 0 means that there is no need to revalidate. diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 28a6c2b..949bbfd 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -126,7 +126,7 @@ <column name="other_config" key="flow-limit" type='{"type": "integer", "minInteger": 0}'> <p> - A number of flows as a nonnegative integer. This sets the maxmimum + The maximum number of flows allowed in the datapath flow table. Internally OVS will choose a flow limit which will likely be lower than this number, based on real time network conditions. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev