On Fri, 6 Mar 2015, Michael Ellerman wrote:

> > > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> > > index 0257a7d659ef..24de29b3651b 100644
> > > --- a/arch/powerpc/mm/numa.c
> > > +++ b/arch/powerpc/mm/numa.c
> > > @@ -958,9 +958,17 @@ void __init initmem_init(void)
> > >  
> > >   memblock_dump_all();
> > >  
> > > + /*
> > > +  * zero out the possible nodes after we parse the device-tree,
> > > +  * so that we lower the maximum NUMA node ID to what is actually
> > > +  * present.
> > > +  */
> > > + nodes_clear(node_possible_map);
> > > +
> > >   for_each_online_node(nid) {
> > >           unsigned long start_pfn, end_pfn;
> > >  
> > > +         node_set(nid, node_possible_map);
> > >           get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
> > >           setup_node_data(nid, start_pfn, end_pfn);
> > >           sparse_memory_present_with_active_regions(nid);
> > 
> > This seems a bit strange, node_possible_map is supposed to be a superset 
> > of node_online_map and this loop is iterating over node_online_map to set 
> > nodes in node_possible_map.
>  
> Yeah. Though at this point in boot I don't think it matters that the two maps
> are out-of-sync temporarily.
> 
> But it would be simpler to just set the possible map to be the online map. That
> would also maintain the invariant that the possible map is always a superset
> of the online map.
> 
> Or did I miss a detail there (sleep deprived parent mode).
> 
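For illustration, the simpler alternative suggested above can be modelled in
userspace roughly as follows. This is only a sketch, not kernel code: the
model_nodemask_t type (a plain 64-bit mask) and all model_* helpers are
hypothetical names invented here, and intersecting the two maps rather than
assigning one to the other is an assumption that gives the same result as long
as the possible map starts out as a superset of the online map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's nodemask_t: bit N set means node N. */
typedef uint64_t model_nodemask_t;

#define MODEL_MAX_NODES 64

/* Return a mask with only node 'nid' set. */
static model_nodemask_t model_node_bit(int nid)
{
	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return UINT64_C(1) << nid;
}

/* True if every node set in 'sub' is also set in 'super'. */
static bool model_is_subset(model_nodemask_t sub, model_nodemask_t super)
{
	return (sub & ~super) == 0;
}

/*
 * Shrink the possible map to the online map in a single step, so there is
 * no window in which an online node is missing from the possible map.
 */
static model_nodemask_t model_restrict_possible(model_nodemask_t possible,
						model_nodemask_t online)
{
	return possible & online;
}

/* Highest node ID present in the mask, or -1 for an empty mask. */
static int model_highest_node(model_nodemask_t mask)
{
	int highest = -1;

	for (int nid = 0; nid < MODEL_MAX_NODES; nid++)
		if (mask & model_node_bit(nid))
			highest = nid;
	return highest;
}
```

In this model the clear-then-set sequence from the patch passes through a state
where the possible map is empty while nodes are online, whereas the single-step
version never breaks the superset invariant.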

I think reset_numa_cpu_lookup_table(), which iterates over the possible 
map and thus now over only a subset of nodes, may be a concern.

I'm not sure why this is being proposed as a powerpc patch and not a patch 
for mem_cgroup_css_alloc().  In other words, why do we have to allocate 
for all possible nodes?  We should only be allocating for online nodes in 
N_MEMORY with mem hotplug disabled initially and then have a mem hotplug 
callback implemented to alloc_mem_cgroup_per_zone_info() for nodes that 
transition from memoryless -> memory.  The extra bonus is that 
alloc_mem_cgroup_per_zone_info() need never allocate remote memory and the 
TODO in that function can be removed.
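
A rough userspace model of that scheme, to make the idea concrete. Everything
here is an assumption for illustration: struct memcg_model, struct
per_node_info and the model_* functions are hypothetical and are not the memcg
code; a real implementation would have to walk every existing memcg from the
hotplug callback and would allocate node-local memory, which this sketch does
not attempt.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_NODES 8

/* Hypothetical per-node state, standing in for the memcg per-zone info. */
struct per_node_info {
	int nid;
};

/* Hypothetical single-cgroup model: one slot per possible node ID. */
struct memcg_model {
	struct per_node_info *info[MODEL_MAX_NODES];
};

/* Release everything that was allocated and reset the slots. */
static void model_free(struct memcg_model *memcg)
{
	for (int nid = 0; nid < MODEL_MAX_NODES; nid++) {
		free(memcg->info[nid]);
		memcg->info[nid] = NULL;
	}
}

/* Allocate the info for one node; 0 on success, -1 on allocation failure. */
static int model_alloc_node_info(struct memcg_model *memcg, int nid)
{
	struct per_node_info *pn;

	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	if (memcg->info[nid])
		return 0;	/* already allocated, nothing to do */

	pn = calloc(1, sizeof(*pn));
	if (!pn)
		return -1;
	pn->nid = nid;
	memcg->info[nid] = pn;
	return 0;
}

/* At creation, allocate only for nodes whose bit is set in 'has_memory'. */
static int model_css_alloc(struct memcg_model *memcg, unsigned int has_memory)
{
	for (int nid = 0; nid < MODEL_MAX_NODES; nid++)
		memcg->info[nid] = NULL;

	for (int nid = 0; nid < MODEL_MAX_NODES; nid++) {
		if (!(has_memory & (1u << nid)))
			continue;
		if (model_alloc_node_info(memcg, nid)) {
			model_free(memcg);
			return -1;
		}
	}
	return 0;
}

/* Hotplug callback: a node went from memoryless to having memory. */
static int model_node_gained_memory(struct memcg_model *memcg, int nid)
{
	return model_alloc_node_info(memcg, nid);
}

/* Number of nodes that currently have info allocated. */
static int model_count_allocated(const struct memcg_model *memcg)
{
	int count = 0;

	for (int nid = 0; nid < MODEL_MAX_NODES; nid++)
		if (memcg->info[nid])
			count++;
	return count;
}
```

With this shape, nodes that never gain memory never cost an allocation, and the
callback is the only place that has to deal with a late-arriving node.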
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
