> On Oct 9, 2015, at 3:11 PM, Jesse Gross <je...@nicira.com> wrote:
>
> On Fri, Oct 9, 2015 at 8:54 AM, Jarno Rajahalme <jrajaha...@nicira.com> wrote:
>>
>> On Oct 8, 2015, at 4:03 PM, Jesse Gross <je...@nicira.com> wrote:
>>
>> On Wed, Oct 7, 2015 at 10:47 AM, Jarno Rajahalme <jrajaha...@nicira.com>
>> wrote:
>>
>> On Oct 6, 2015, at 6:01 PM, Jesse Gross <je...@nicira.com> wrote:
>>
>> On Mon, Oct 5, 2015 at 1:25 PM, Alexander Duyck
>> <alexander.du...@gmail.com> wrote:
>>
>> On 10/05/2015 06:59 AM, Vlastimil Babka wrote:
>>
>> On 10/02/2015 12:18 PM, Konstantin Khlebnikov wrote:
>>
>> When openvswitch tries allocate memory from offline numa node 0:
>> stats = kmem_cache_alloc_node(flow_stats_cache, GFP_KERNEL | __GFP_ZERO,
>> 0)
>> It catches VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
>> [ replaced with VM_WARN_ON(!node_online(nid)) recently ] in linux/gfp.h
>> This patch disables numa affinity in this case.
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
>>
>> ...
>>
>> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
>> index f2ea83ba4763..c7f74aab34b9 100644
>> --- a/net/openvswitch/flow_table.c
>> +++ b/net/openvswitch/flow_table.c
>> @@ -93,7 +93,8 @@ struct sw_flow *ovs_flow_alloc(void)
>>
>> /* Initialize the default stat node. */
>> stats = kmem_cache_alloc_node(flow_stats_cache,
>> - GFP_KERNEL | __GFP_ZERO, 0);
>> + GFP_KERNEL | __GFP_ZERO,
>> + node_online(0) ? 0 : NUMA_NO_NODE);
>>
>> Stupid question: can node 0 become offline between this check, and the
>> VM_WARN_ON? :) BTW what kind of system has node 0 offline?
>>
>> Another question to ask would be is it possible for node 0 to be online, but
>> be a memoryless node?
>>
>> I would say you are better off just making this call kmem_cache_alloc.
>> I don't see anything that indicates the memory has to come from node 0, so
>> adding the extra overhead doesn't provide any value.
>>
>> I agree that this at least makes me wonder, though I actually have
>> concerns in the opposite direction - I see assumptions about this
>> being on node 0 in net/openvswitch/flow.c.
>>
>> Jarno, since you originally wrote this code, can you take a look to see
>> if everything still makes sense?
>>
>> We keep the pre-allocated stats node at array index 0, which is initially
>> used by all CPUs, but if CPUs from multiple numa nodes start updating the
>> stats, we allocate additional stats nodes (up to one per numa node), and the
>> CPUs on node 0 keep using the preallocated entry. If stats cannot be
>> allocated from a CPU's local node, then those CPUs keep using the entry at
>> index 0. Currently the code in net/openvswitch/flow.c will try to allocate
>> the local memory repeatedly, which may not be optimal when there is no
>> memory at the local node.
>>
>> Allocating the memory for index 0 from a node other than node 0, as discussed
>> here, just means that the CPUs on node 0 will keep on using non-local memory
>> for stats. In a scenario where there are CPUs on two nodes (0, 1), but only
>> node 1 has memory, a shared flow entry will still end up having separate
>> memory allocated for both nodes, but both allocations would come from node 1.
>> However, there is still a high likelihood that the memory allocations would
>> not share a cache line, which should prevent the nodes from invalidating
>> each other’s caches. Based on this I do not see a problem relaxing the
>> memory allocation for the default stats node. If node 0 has memory, however,
>> it would be better to allocate the memory from node 0.
>>
>> Thanks for going through all of that.
>>
>> It seems like the question that is being raised is whether it actually
>> makes sense to try to get the initial memory on node 0, especially
>> since it seems to introduce some corner cases? Is there any reason why
>> the flow is more likely to hit node 0 than a randomly chosen one?
>> (Assuming that this is a multinode system, otherwise it's kind of a
>> moot point.) We could have a separate pointer to the default allocated
>> memory, so it wouldn't conflict with memory that was intentionally
>> allocated for node 0.
>>
>> It would still be preferable to know from which node the default stats node
>> was allocated, and store it in the appropriate pointer in the array. We
>> could then add a new “default stats node index” that would be used to locate
>> the node in the array of pointers we already have. That way we would avoid
>> extra allocation and processing of the default stats node.
>
> I agree, that sounds reasonable to me. Will you make that change?
>
> Besides eliminating corner cases, it might help performance in some
> cases too by avoiding stressing memory bandwidth on node 0.
I’ll do this,

  Jarno
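
For illustration only, the “default stats node index” idea discussed above could be sketched in userspace C roughly as follows. This is not the kernel patch and not the actual openvswitch code: MAX_NODES, struct flow, flow_alloc(), flow_stats_for() and flow_free() are hypothetical names, calloc() stands in for the kernel allocator, and the caller simply tells flow_alloc() which node the default memory is assumed to have come from.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_NODES 4 /* hypothetical stand-in for the number of NUMA nodes */

/* Simplified, hypothetical per-node statistics record. */
struct flow_stats {
    unsigned long packets;
    unsigned long bytes;
};

/* Hypothetical flow with one stats slot per node. */
struct flow {
    int default_node;                    /* index of the preallocated entry */
    struct flow_stats *stats[MAX_NODES]; /* NULL until allocated for a node */
};

/*
 * Allocate a flow and its default stats entry. alloc_node is the node the
 * default entry is assumed to live on; the entry is stored in that node's
 * slot and remembered as the default, instead of always using slot 0.
 */
struct flow *flow_alloc(int alloc_node)
{
    struct flow *f;

    assert(alloc_node >= 0 && alloc_node < MAX_NODES);

    f = calloc(1, sizeof(*f));
    if (!f)
        return NULL;

    f->stats[alloc_node] = calloc(1, sizeof(struct flow_stats));
    if (!f->stats[alloc_node]) {
        free(f);
        return NULL;
    }
    f->default_node = alloc_node;
    return f;
}

/*
 * Pick the stats entry for a CPU on cpu_node: its own node's entry if one
 * has been allocated, otherwise the shared default entry.
 */
struct flow_stats *flow_stats_for(struct flow *f, int cpu_node)
{
    assert(cpu_node >= 0 && cpu_node < MAX_NODES);
    return f->stats[cpu_node] ? f->stats[cpu_node]
                              : f->stats[f->default_node];
}

/* Release every allocated stats entry, then the flow itself. */
void flow_free(struct flow *f)
{
    int i;

    for (i = 0; i < MAX_NODES; i++)
        free(f->stats[i]);
    free(f);
}
```

With this layout, a flow whose default entry happened to land on node 1 would serve CPUs on every node from slot 1 until a node gets its own entry, and no second allocation is needed for node 1 itself.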