Quoting Lee Schermerhorn ([EMAIL PROTECTED]): > On Wed, 2007-08-15 at 09:31 -0500, Serge E. Hallyn wrote: > > Quoting Lee Schermerhorn ([EMAIL PROTECTED]): > > > On Tue, 2007-08-14 at 14:56 -0700, Christoph Lameter wrote: > > > > On Tue, 14 Aug 2007, Lee Schermerhorn wrote: > > > > > > > > > > Ok then you did not have a NUMA system configured. So its okay for > > > > > > the > > > > > > dummies to ignore the stuff. CONFIG_NODES_SHIFT is a constant and > > > > > > does not > > > > > > change. The first bit is always set. > > > > > > > > > > The first bit [node 0] is only set for the N_ONLINE [and N_POSSIBLE] > > > > > mask. We could add the static init for the other masks, but since > > > > > non-numa platforms are going through the __build_all_zonelists, they > > > > > might as well set the MEMORY bits explicitly. Or, maybe you'll > > > > > disagree ;-). > > > > > > > > The bitmaps can be completely ignored if !NUMA. > > > > > > > > In the non NUMA case we define > > > > > > > > static inline int node_state(int node, enum node_states state) > > > > { > > > > return node == 0; > > > > } > > > > > > > > So its always true for node 0. The "bit" is set. > > > > > > The issue is with the N_*_MEMORY masks. They don't get initialized > > > properly because node_set_state() is a no-op if !NUMA. So, where we > > > look for intersections with or where we AND with the N_*_MEMORY masks we > > > get the empty set. > > > > > > > > > > > We are trying to get cpusets to work with !NUMA? > > Sounds reasonable. > > > > > > > > > > Well, yes. In Serge's case, he's trying to use cpusets with !NUMA. > > > He'll have to comment on the reasons for that. Looking at all of the > > > > So I can lock a container to a cpu on a non-numa machine. > > > > > #ifdefs and init/Kconfig, CPUSET does not depend on NUMA--only SMP and > > > CONTAINERS [altho' methinks CPUSET should select CONTAINERS rather than > > > depend on it...]. So, you can use cpusets to partition of cpus in > > > non-NUMA configs. > > > > > > In the more general case, tho', I'm looking at all uses of the > > > node_online_map and for_each_online_node, for instances where they > > > should be replaced with one of the *_MEMORY masks. IMO, generic code > > > that is compiled independent of any CONFIG option, like NUMA, should > > > just work, independent of the config. Currently, as Serge has shown, > > > this is not the case. So, I think we should fix the *_MEMORY maps to be > > > correctly populated in both the NUMA and !NUMA cases. A couple of > > > options: > > > > > > 1) just use node_set() when populating the masks, > > > > > > 2) initialize all masks to include at least cpu/node 0 in the !NUMA > > > case. > > > > > > Serge chose #1 to fix his problem. I followed his lead to fix the other > > > 2 places where node_set_state() was being used to initialize the NORMAL > > > memory node mask and the CPU node mask. This will add a few unnecessary > > > instructions to !NUMA configs, so we could change to #2. > > > > > > Thoughts? > > > > Paul, is the mems stuff in cpusets only really useful for NUMA cases? > > (I think it is... but am not sure) If so I suppose one alternative > > could be to just disable that when !NUMA. But disabling cpusets when > > !NUMA is completely wrong. > > > > I personally would think that 1) is still the best option. Otherwise > > the action > > > > echo $SOME_CPU > /cpusets/set1/cpu > > echo $SOME_CPU > /cpusets/set1/mems > > > > works on a numa machine, and is wrong on a non-numa machine. With > > option 1, the second part doesn't actually restrict the memory, but > > at least /cpusets/set1/mems exists and $SOME_CPU doesn't have to be 0 to > > be valid. > > Well, you really shouldn't be writing cpu ids to the cpuset mems file. > Rather, it takes node ids. And on !NUMA configs, only node 0 exists. > > Can you actually write a !0 cpuid to the mems file with the current > option #1 patch [that uses node_set() to populate the node_states[]]? > It should allow something like: > > echo 0,1<and maybe others> >/cpusets/set1/mems > > As long as one of the specified node ids has memory, it will silently > ignore any that don't. > > If you're up for it, you could try the following patch to statically > initialize the node state masks, in place of the "option 1" patch. Be > aware, tho', that I've only tested on my ia64 NUMA platform. I did > compile it [page_alloc.c] without error under allnoconfig. > > Lee > > ----------------- > PATCH Initialize N_*_MEMORY and N_CPU masks for non-NUMA config > > Against: 2.6.23-rc2-mm2 > > Statically initialize the N_*_MEMORY and N_CPU node state masks > for !NUMA configurations. This static initialization is required > because the node_set_state() function becomes a no-op for !NUMA. > Other generic code assumes that these masks are set correctly. > > Note that in NUMA configurations, these masks will be populated > correctly, so don't bother with static initialization. No sense > in making assumptions that could be broken down the road, resulting > in extra work for someone to debug. Unlikely, perhaps, but who > needs the aggravation... > > Signed-off-by: Lee Schermerhorn <[EMAIL PROTECTED]> > > mm/page_alloc.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > Index: Linux/mm/page_alloc.c > =================================================================== > --- Linux.orig/mm/page_alloc.c 2007-08-15 10:01:23.000000000 -0400 > +++ Linux/mm/page_alloc.c 2007-08-15 10:05:41.000000000 -0400 > @@ -52,7 +52,14 @@ > */ > nodemask_t node_states[NR_NODE_STATES] __read_mostly = { > [N_POSSIBLE] = NODE_MASK_ALL, > - [N_ONLINE] = { { [0] = 1UL } } > + [N_ONLINE] = { { [0] = 1UL } }, > +#ifndef CONFIG_NUMA > + [N_NORMAL_MEMORY] = { { [0] = 1UL } }, > +#ifdef CONFIG_HIGHMEM > + [N_HIGH_MEMORY] = { { [0] = 1UL } }, > +#endif > + [N_CPU] = { { [0] = 1UL } }, > +#endif /* NUMA */ > }; > EXPORT_SYMBOL(node_states);
Ok, I'll try this soon as I have a chance. To be honest when I first looked at this and saw that node_states[] was only being initialized for the first two entries this was my first instinct, except I had no idea what the right way to initilaize it was :) Cause I am completely numa-challenged. Will let you know how it works. thanks, -serge - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/