On Nov 19, 2012, at 12:16 PM, Jesse Gross <je...@nicira.com> wrote:
> On Fri, Nov 16, 2012 at 2:30 PM, Kyle Mestery (kmestery)
> <kmest...@cisco.com> wrote:
>> On Nov 16, 2012, at 3:31 PM, Kyle Mestery (kmestery)
>> <kmest...@cisco.com> wrote:
>>> On Nov 16, 2012, at 3:18 PM, Jesse Gross <je...@nicira.com> wrote:
>>>> On Fri, Nov 16, 2012 at 1:00 PM, Kyle Mestery (kmestery)
>>>> <kmest...@cisco.com> wrote:
>>>>> On Nov 15, 2012, at 3:32 PM, Kyle Mestery (kmestery)
>>>>> <kmest...@cisco.com> wrote:
>>>>>> On Nov 15, 2012, at 3:13 PM, Kyle Mestery (kmestery)
>>>>>> <kmest...@cisco.com> wrote:
>>>>>>> On Nov 15, 2012, at 1:03 PM, Kyle Mestery (kmestery)
>>>>>>> <kmest...@cisco.com> wrote:
>>>>>>>> Jesse:
>>>>>>>>
>>>>>>>> I modified the source port hashing for the VXLAN patch I
>>>>>>>> submitted a few days ago, but I've noticed when using the
>>>>>>>> upstream source port hashing routine, performance drops off by
>>>>>>>> 3.5 times when using iperf between two VMs. From what I can
>>>>>>>> tell, it has to be that all skbuffs coming into the VXLAN
>>>>>>>> tunnel have not already had their rxhash set, and this
>>>>>>>> function is what's killing performance. Let me share the
>>>>>>>> details:
>>>>>>>>
>>>>>>> I think I figured this out. The upstream source port selection
>>>>>>> algorithm is exploding flows in the fast path. Here are iperf
>>>>>>> runs with both and subsequent "ovs-dpctl dump-flows" commands
>>>>>>> for comparison. The first one is with the upstream version, the
>>>>>>> second is with the one in my patch. Note that I just piped
>>>>>>> "ovs-dpctl dump-flows" into wc to summarize the flow count.
>>>>>>>
>>>>>>> Upstream version:
>>>>>>> [root@linux-br ~]# iperf -c 10.1.2.14 && ovs-dpctl dump-flows | wc
>>>>>>
>>>>>>
>>>>>> Figured this out, fixing it now, will repost the patch with only
>>>>>> this change soon.
>>>>>>
>>>>>
>>>>> So after looking at this, the upstream source port selection
>>>>> function will cause an explosion of fast path flows due to an
>>>>> ever changing skbuff->rxhash. By using the flow->hash, we don't
>>>>> see this problem. Jesse, any comments on this particular issue? I
>>>>> think using the upstream function will allow for greater
>>>>> spreading across links depending on the hashing algorithm used on
>>>>> upstream switches, but will cause this flow explosion on the host
>>>>> itself.
>>>>
>>>> Generally speaking, the OVS flow extraction pulls out more fields
>>>> than are usually used by skb_get_rxhash() so I'm surprised that it
>>>> would have this effect (of course, neither is supposed to be so
>>>> fine grained that it breaks down a single 'real' flow). It sounds
>>>> to me like there is a bug that is pulling in random data that's
>>>> not supposed to be part of the flow if it changes on a per packet
>>>> basis.
>>>>
>>> Right, I agree. I'll keep digging to see what I can find.
>>>
>> I figured this one out. What was throwing the hash off was the UDP
>> source port for the VXLAN packets was apparently random garbage.
>> Clearing it to zero before calling the routine to acquire one (which
>> calls skb_get_rxhash() itself) fixed the problem. I'll submit the
>> latest patch (which has this as the only change) soon.
>
> Good catch. The problem is that this means that we're hashing the
> outer tunnel headers (which in retrospect makes sense given that we've
> already started building those headers). As a result, the VXLAN
> source port will always be constant for a given tunnel endpoint and
> therefore not add any additional flow entropy.
>
> I do think that using the same set of fields as the upstream code is
> generally a good idea for consistency but at this point fixing it is
> probably more trouble than it is worth. It will get easier once we
> can drop all of the compatibility code from OVS (soon but not quite
> yet since userspace hasn't switched over yet) so it probably makes
> sense to use the OVS flow hash for the time being.

OK, that makes sense. I'll resubmit the patch with the change to use the
OVS flow hash for the time being then.

Thanks,
Kyle
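
For anyone following the thread later, here is a rough userspace sketch of the
bug discussed above. It is not the kernel or OVS code. The struct flow_key
layout, the FNV-1a hash, and the function names are all assumptions made up for
illustration; only the general idea (hash the headers, scale the hash into a
port range, and clear the stale source port field first) comes from the
discussion. It shows why a garbage source port in the headers being hashed
changes the hash per packet, and why zeroing the field first makes the chosen
port stable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the header fields a hash might cover. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  proto;
};

/* One FNV-1a pass over a byte range (illustrative hash, not jhash). */
static uint32_t fnv1a_step(uint32_t h, const void *data, size_t len)
{
    const uint8_t *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash field by field so struct padding never enters the hash. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;
    h = fnv1a_step(h, &k->src_ip, sizeof k->src_ip);
    h = fnv1a_step(h, &k->dst_ip, sizeof k->dst_ip);
    h = fnv1a_step(h, &k->src_port, sizeof k->src_port);
    h = fnv1a_step(h, &k->dst_port, sizeof k->dst_port);
    h = fnv1a_step(h, &k->proto, sizeof k->proto);
    return h;
}

/* Scale a 32-bit hash into the inclusive port range [low, high]. */
static uint16_t src_port_from_hash(uint32_t hash, uint16_t low, uint16_t high)
{
    assert(high >= low);
    uint32_t range = (uint32_t)(high - low) + 1;
    return (uint16_t)((((uint64_t)hash * range) >> 32) + low);
}

/*
 * Pick a source port from partially built outer headers. The src_port
 * field may still hold stale data, so clear it before hashing; otherwise
 * the garbage feeds into the hash and the result changes per packet.
 */
static uint16_t outer_src_port(struct flow_key outer, uint16_t low,
                               uint16_t high)
{
    outer.src_port = 0;
    return src_port_from_hash(flow_hash(&outer), low, high);
}
```

With the field cleared, two packets whose headers differ only in the stale
source port get the same port. As Jesse notes, hashing the outer headers this
way is stable but adds no per-flow entropy, which is the argument for using the
OVS flow hash instead.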