On Fri, 2007-10-26 at 13:45 -0700, David Rientjes wrote: > On Fri, 26 Oct 2007, Paul Jackson wrote: > > > Without at least this sort of change to MPOL_INTERLEAVE nodemasks, > > allowing either empty nodemasks (Lee's proposal) or extending them > > outside the current cpuset (what I'm cooking up now), there is no way > > for a task that is currently confined to a single node cpuset to say > > anything about how it wants be interleaved in the event that it is > > subsequently moved to a larger cpuset. Currently, such a task is only > > allowed to pass exactly one particular nodemask to set_mempolicy > > MPOL_INTERLEAVE calls, with exactly the one bit corresponding to its > > current node. No useful information can be passed via an API that only > > allows a single legal value. > > > > Well, passing a single node to set_mempolicy() for MPOL_INTERLEAVE doesn't > make a whole lot of sense in the first place. I prefer your solution of > allowing set_mempolicy(MPOL_INTERLEAVE, NODE_MASK_ALL) to mean "interleave > me over everything I'm allowed to access." NODE_MASK_ALL would be stored > in the struct mempolicy and used later on mpol_rebind_policy().
You don't need to save the entire mask--just note that NODE_MASK_ALL was passed--like with my internal MPOL_CONTEXT flag. This would involve special casing NODE_MASK_ALL in the error checking, as currently set_mempolicy() complains loudly if you pass non-allowed nodes--see "contextualize_policy()". [mbind() on the other hand, appears to allow any nodemask, even outside the cpuset. guess we catch this during allocation.] This is pretty much the spirit of my patch w/o the API change/extension [/improvement :)] For some systems [not mine], the nodemasks can get quite large. I have a patch, that I've tested atop Mel Gorman's "onezonelist" patches that replaces the nodemasks embedded in struct mempolicy with pointers to dynamically allocated ones. However, it's probably not much of a win, memorywise, if most of the uses are for interleave and bind policies--both of which would always need the nodemasks in addition to the pointers. Now, if we could replace the 'cpuset_mems_allowed' nodemask with a pointer to something stable, it might be a win. Lee - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/