On Thu, Oct 1, 2015 at 4:19 PM, <dwil...@us.ibm.com> wrote:
>
> Quoting Jesse Gross <je...@nicira.com>:
>
>> On Tue, Sep 29, 2015 at 10:50 PM, <dwil...@us.ibm.com> wrote:
>>>
>>> Hi-
>>>
>>> I have been conducting scaling tests with OVS and docker. My tests
>>> revealed that the latency of ARP packets can become very large
>>> resulting in many ARP re-transmissions and time-outs. I found the
>>> source of the poor latency to be with the handling of arp packets in
>>> ovs_vport_find_upcall_portid(). Each packet is hashed in
>>> ovs_vport_find_upcall_portid() by calling skb_get_hash(). This hash
>>> is used to select a netlink socket in which to send the packet to
>>> userspace. However, skb_get_hash() is not supporting ARP packets
>>> returning a 0 (invalid hash) for every ARP. This results in a single
>>> ovs-vswitchd handler thread processing every arp packet thus
>>> severely impacting the average latency of ARPs. I am purposing a
>>> change to ovs_vport_find_upcall_portid() that spreads the ARP
>>> packets evenly between all the handler threads (patch to follow).
>>> Please let me know if you have suggestions/comments.
>>
>>
>> This is definitely an interesting analysis but I'm a little surprised
>> at the basic scenario. First, I guess it seems to me that the L2
>> domain is too large if there are this many ARPs.
>
>
> I can imagine running a couple of thousand docker containers, so I think
> this is a reasonable size test.

Having thousands of nodes (regardless of whether they are containers
or VMs) on a single L2 segment is really not a good idea. I would
expect them to be segmented into smaller groups with L3 boundaries in
the middle.

> On a related issue, I am looking into the memory consumed by the netlink
> sockets, OVS on linux can create many of these sockets. Do you have any
> thought as to why the current model was picked?

Independent queues are the easiest way to provide lockless access to
incoming packets on different cores and, in some cases, to give higher
priority to certain types of packets.

>> The speed also
>> generally seems slower than I would expect but in any case I don't
>> disagree that it is better to spread the load among all the cores.
>>
>> On the patch itself, can't we just make skb_get_hash() be able to
>> decode ARP? It seems like that is cleaner and more generic.
>
>
> My first thought was to make a change in skb_get_hash(). However, the
> comment in __skb_get_hash() state that the hash is generated from the
> 4-tuple (address and ports). ARPs have no ports so a return value of 0
> looked correct.
>
> /*
>  * __skb_get_hash: calculate a flow hash based on src/dst addresses
>  * and src/dst port numbers. Sets hash in skb to non-zero hash value
>  * on success, zero indicates no valid hash. Also, sets l4_hash in skb
>  * if hash is a canonical 4-tuple hash over transport ports.
>  */
>
> What do you think?

I don't think that this is really a strict definition. In particular,
IP packets that aren't TCP or UDP will still return a hash based on
the IP addresses. However, I believe that you are looking at an old
version of this function. Any changes would need to be made to the
upstream Linux tree, not purely in OVS.
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
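
For readers following the thread, here is a rough userspace sketch of the effect being discussed: a hash that is 0 for every ARP sends all of them to the same handler, while a fallback hash over the ARP addresses can spread them. This is not the OVS or kernel code and not the proposed patch. Every name in it (flow_hash_without_arp, arp_fallback_hash, select_handler, count_per_handler) is hypothetical, the modulo selection is a simplification, and the mixing function is a generic integer finalizer chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a flow hash that cannot decode ARP:
 * every ARP packet yields 0, i.e. "no valid hash". */
static uint32_t flow_hash_without_arp(uint32_t sender_ip, uint32_t target_ip)
{
    (void)sender_ip;
    (void)target_ip;
    return 0;
}

/* Illustrative fallback: mix the ARP sender and target protocol
 * addresses with a generic 32-bit integer finalizer. This is an
 * assumption for the sketch, not the kernel's hash function. */
static uint32_t arp_fallback_hash(uint32_t sender_ip, uint32_t target_ip)
{
    uint32_t h = sender_ip ^ (target_ip * 0x9e3779b1u);

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Map a hash onto one of n_handlers sockets (simplified to a modulo). */
static unsigned select_handler(uint32_t hash, unsigned n_handlers)
{
    assert(n_handlers > 0);
    return hash % n_handlers;
}

/* Simulate n_packets ARP requests from distinct senders to one target
 * and count how many land on each handler. counts must hold
 * n_handlers entries and be zeroed by the caller. */
static void count_per_handler(int use_fallback, uint32_t n_packets,
                              unsigned n_handlers, unsigned *counts)
{
    const uint32_t target_ip = 0x0a000001u; /* 10.0.0.1 */

    for (uint32_t i = 0; i < n_packets; i++) {
        uint32_t sender_ip = 0x0a000002u + i; /* 10.0.0.2 onwards */
        uint32_t hash = flow_hash_without_arp(sender_ip, target_ip);

        /* Only consult the fallback when the flow hash is invalid. */
        if (hash == 0 && use_fallback)
            hash = arp_fallback_hash(sender_ip, target_ip);

        counts[select_handler(hash, n_handlers)]++;
    }
}
```

With use_fallback set to 0, every simulated ARP is counted against handler 0, which mirrors the single busy handler thread described in the original report. With use_fallback set to 1, the same packets are assigned according to the mixed hash instead. Whether such a fallback belongs in OVS or in skb_get_hash() upstream is exactly the open question in the thread.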