On Jun 6, 2013, at 2:02 PM, Ben Pfaff <b...@nicira.com> wrote: > Every caller of lookup_xc() creates a new xc entry on failure. Can > any of this be factored into a helper? If not, then all the callers > of create_xc() use very similar 3 lines of code, so can that be > integrated into create_xc()?
I think the interface would be too weird for lookup_xc(), since you have to provide all this creation arguments, which don't seem like they belong in a lookup interface. My rewrite of handle_flow_miss_without_facet() in v2 of this patch makes those three lines a little less consistent. However, let me know in your review of v2 if you think it would still be useful to factor out. > A struct xout_cache is pretty big and the initial xzalloc() is > followed pretty quickly by copying into its '->xout->wc' and then (in > insert_xc()) into '->flow'. Can we save time by initializing only > what we need? (Most notably we never need to zero the 256 bytes in > xc->xout.odp_actions_stub.) Fair enough. Done. > handle_flow_miss_without_facet() previously translated the actions for > each packet. Now, it translates the actions once (or not at all, if > there's an xc already). Won't this prevent slow protocols (e.g. CFM, > LACP, BFD) from working properly, because packets after the first > won't go through translation? I guess the unit tests pass; maybe I am > missing something (or maybe the unit tests pass because they never > trigger the governor). You are correct. (And, no, the unit tests must not trigger the governor.) My next patch should address that. > Both forks of the main "if" in handle_flow_miss_without_facet() > contain the same xlate_out_copy() call. I think it can be factored > out. Actually I think it can be moved into the body of "if > (op->xout.odp_actions.size) {", and then we don't need the > xlate_out_uninit() in the "else" case. > > In handle_flow_miss_without_facet(), is it necessary to do the > lookup_xc() inside the LIST_FOR_EACH loop? That is, can we move it > above the loop and then just use 'xc' for the whole body of the loop? Both of these should be addressed in my refactoring of handle_flow_miss_without_facet(). > All the callers of rule_dpif_lookup() that provide a nonnull 'wc' call > flow_wildcards_init_catchall() on that 'wc' immediately beforehand. > Should rule_dpif_lookup() do it internally? I think this would make things more confusing because we couldn't initialize 'wc' in rule_dpif_lookup__(), since it is called recursively. > I see opportunity for optimization in facet_create(). I don't know > whether it would have any effect on real performance. Anyway: the > 'wc' is only used if a new 'xc' must be created, and we can figure > that out based on just miss->flow. So we could theoretically save > time by passing NULL to rule_dpif_lookup() if there's an existing > 'xc'. I think that makes the code harder to read, and I don't think keeping track of 'wc' is that expensive. As we discussed off-line, we can clean this up later if it shows up in profiling. > The commit changes facet_create()'s comment to mention a 'wc' > parameter, but facet_create() doesn't have a 'wc' parameter. Fixed. > facet_lookup_valid() now appears to always iterate over every > xout_cache entry (!). Partly I think that's just a mistake (it should > not call invalidate_xc_tags() if the revalidate_set is empty, or > invalidate_xc_tags() should do nothing if its revalidate_set parameter > is empty). But, why does it need to do that at all? That is, why > can't it just see whether 'facet->xc->xout.tags' intersects the > revalidate_set, at least as an initial check? You're right. I put a guard in invalidate_xc_tags() and fixed facet_lookup_valid(). > New code in facet_revalidate() looks like this: > > xc = lookup_xc(ofproto, &facet->flow); > if (!xc) { > ovs_assert(!facet->xc); > xlate_in_init(&xin, ofproto, &facet->flow, &facet->initial_vals, > new_rule, 0, NULL); > xc = create_xc(&wc, &facet->initial_vals, facet); > xlate_actions(&xin, &xc->xout); > insert_xc(ofproto, xc, &facet->flow); > refresh_subfacets = true; > } else if (!facet->xc) { > add_facet_to_xc(xc, facet); > refresh_subfacets = true; > } > > I am not sure about the "fall through" case at the end. I have a > suspicion that, in this case, we always have xc == facet->xc, because > I think that if the xc might have changed, then the caller would have > destroyed that cache entry. If that is correct, then we can delete a > lot of code from facet_revalidate() that only fires if xc != > facet->xc. But I am not entirely convinced that is correct. As we discussed off-line, I don't think this was true of the original code, since rule_destruct() didn't clear anything, but we agree that was a bug. > I am not sure that revalidation does the right thing if only the slow > path of a translation changes. > > This new code will do a lot of unneeded kernel datapath flow > reinstallations. For example, if need_revalidate becomes true, then > it will reinstall every flow in the kernel datapath, even if their ODP > actions do not change. This becomes extra-expensive because it does > the reinstallations one-by-one instead of batching them up. (The > current code also does reinstallations one by one, but it has not been > a problem because it only reinstalls flows whose actions change, which > is rare.) I want to look at facet_revalidate() fresher eyes in the morning. I'm going to send out a v2 of this patch knowing that there will be a v3 with at least the facet_revalidate refactoring. > In xout_cache_push_stats(), in the call to memset(), can we just write > 0 instead of '\0'? The latter looks like a string null terminator, > which seems weird here. Done. > I don't understand this comment in facet_push_stats__(): > /* Use local stats instead of 'facet->xc->stats' because this > * function may be called multiple times before > * xout_cache_push_stats(), which leads to high counts. */ Okay. I rewrote it. Let me know if it's still not clear. > The split between the OFPACT_LEARN case in do_xlate_actions() and the > implementation in xlate_learn_action() seems more awkward than before. > Can we move the assignment to has_learn into xlate_learn_action()? Done. > I think that OFPACT_SET_IPV4_SRC, OFPACT_SET_IPV4_DST, > OFPACT_SET_IPV4_DSCP, OFPACT_SET_L4_SRC_PORT, OFPACT_SET_L4_DST_PORT, > and OFPACT_DEC_TTL need to add dependencies to 'wc', because their > behavior differs based on the protocol. For example, > OFPACT_SET_IPV4_SRC is a no-op if the Ethertype is anything other than > IPv4, so I would think it would need to mark the Ethertype as a > dependency during translation. I think I got them all, but please take another look. > I think that OFPACT_OUTPUT_REG needs to depend on the field it reads. Done. > I think that OFPACT_POP_QUEUE needs to mark the skb_priority as a > dependency. Done. > Did you consider dependencies for the MPLS actions? Whoops. Fixed. > destroy_xc() contains a doubled semicolon ";;" at the end of a > statement. Fixed. Thanks for the review. I'll send out v2 in a moment. --Justin _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev