On Thu, Nov 24, 2016 at 12:05:12AM +1100, Balbir Singh wrote:
> On my desktop NODES_SHIFT is 6, many distro kernels have it a 9. I've known
> of solutions that use fake NUMA for partitioning and need as many nodes as
> possible.

It was a crude kludge that people used before memcg. If people still
use it, that's fine but we don't want to optimize / make code
complicated for it, so let's please put away this part of the
justification.

It's understandable that some kernels want to have a large NODES_SHIFT
to support a wide range of configurations, but if that makes wastage
too high, the simpler solution is updating the users to use the
runtime-detected possible number / mask instead of the compile-time
NODES_SHIFT. Note that we do exactly the same thing for per-cpu
things - we configure a high max but do all operations on what's
possible on the system.

NUMA code already has possible detection. Why not simply make memcg
use that instead of MAX_NUMNODES, like how we use nr_cpu_ids instead
of NR_CPUS?

Thanks.

-- 
tejun
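
To illustrate the suggestion, here is a minimal userspace sketch, not actual memcg code. It assumes a hypothetical `struct node_info` and `struct group`, and takes the detected node count as a plain parameter standing in for the kernel's nr_node_ids. The table is sized by that runtime value, while MAX_NUMNODES only serves as an upper bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative compile-time ceiling, mirroring a distro-style NODES_SHIFT of 9. */
#define NODES_SHIFT  9
#define MAX_NUMNODES (1u << NODES_SHIFT)

/* Hypothetical per-node record; real per-node data would be much larger. */
struct node_info {
	unsigned long usage;
};

/* Hypothetical container whose per-node table is sized at runtime. */
struct group {
	unsigned int nr_nodes;
	struct node_info *nodes;
};

/* Bytes needed for a table with one entry per node. */
static size_t node_table_bytes(unsigned int nr_nodes)
{
	return (size_t)nr_nodes * sizeof(struct node_info);
}

/* Allocate for the detected node count rather than for MAX_NUMNODES. */
static int group_init(struct group *g, unsigned int nr_node_ids)
{
	/* The compile-time maximum is kept only as a sanity bound. */
	assert(nr_node_ids > 0 && nr_node_ids <= MAX_NUMNODES);

	g->nodes = calloc(nr_node_ids, sizeof(*g->nodes));
	if (!g->nodes)
		return -1;
	g->nr_nodes = nr_node_ids;
	return 0;
}

/* Release the table and reset the container. */
static void group_destroy(struct group *g)
{
	free(g->nodes);
	g->nodes = NULL;
	g->nr_nodes = 0;
}
```

With NODES_SHIFT at 9 and two nodes detected, the runtime-sized table in this sketch is 256 times smaller than one sized by MAX_NUMNODES.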