On Wed, 17 Aug 2016 00:19:53 +0200, Heiko Carstens <heiko.carst...@de.ibm.com> wrote:

> On Tue, Aug 16, 2016 at 11:42:05AM -0400, Tejun Heo wrote:
> > Hello, Peter.
> >
> > On Tue, Aug 16, 2016 at 05:29:49PM +0200, Peter Zijlstra wrote:
> > > On Tue, Aug 16, 2016 at 11:20:27AM -0400, Tejun Heo wrote:
> > > > As long as the mapping doesn't change after the first onlining
> > > > of the CPU, the workqueue side shouldn't be too difficult to
> > > > fix up. I'll look into it. For memory allocations, as long as
> > > > the cpu <-> node mapping is established before any memory
> > > > allocation for the cpu takes place, it should be fine too, I
> > > > think.
> > >
> > > Don't we allocate per-cpu memory for 'cpu_possible_map' on boot?
> > > There's a whole bunch of per-cpu memory users that does things
> > > like:
> > >
> > >
> > > 	for_each_possible_cpu(cpu) {
> > > 		struct foo *foo = per_cpu_ptr(&per_cpu_var, cpu);
> > >
> > > 		/* muck with foo */
> > > 	}
> > >
> > >
> > > Which requires a cpu->node map for all possible cpus at boot time.
> >
> > Ah, right. If cpu -> node mapping is dynamic, there isn't much that
> > we can do about allocating per-cpu memory on the wrong node. And it
> > is problematic that percpu allocations can race against an onlining
> > CPU switching its node association.
> >
> > One way to keep the mapping stable would be reserving per-node
> > possible CPU slots so that the CPU number assigned to a new CPU is
> > on the right node. It'd be a simple solution but would get really
> > expensive with increasing number of nodes.
> >
> > Heiko, do you have any ideas?
>
> I think the easiest solution would be to simply assign all cpus, for
> which we do not have any topology information, to an arbitrary node;
> e.g. round robin.
>
> After all the case that cpus are added later is rare and the s390
> fake numa implementation does not know about the memory topology. All
> it is doing is distributing the memory to several nodes in order to
> avoid a single huge node. So that should be sort of ok.
>
> Unless somebody has a better idea?
>
> Michael, Martin?

If it is really required that cpu_to_node() can be called for all
possible cpus this sounds like a reasonable workaround to me.

Michael
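
For illustration only, the round-robin fallback suggested in the quoted text could look roughly like the following userspace sketch. This is not the s390 code: NODE_UNKNOWN, the plain int array standing in for the cpu -> node map, and the function name are all hypothetical, and the sketch assumes the assignment runs once at boot, before any per-cpu allocation, so the mapping stays stable afterwards.

```c
#include <assert.h>

/* Hypothetical marker for a CPU without topology information. */
#define NODE_UNKNOWN (-1)

/*
 * Assign every CPU whose node is still unknown to a node, round robin.
 * CPUs that already have a node keep it. 'map' is a hypothetical
 * stand-in for the cpu -> node table, indexed by CPU number.
 */
static void assign_unknown_cpus_round_robin(int *map, int nr_cpus, int nr_nodes)
{
	int next = 0;	/* next node to hand out */
	int cpu;

	assert(nr_nodes > 0);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (map[cpu] != NODE_UNKNOWN)
			continue;	/* topology known, leave as is */
		map[cpu] = next;
		next = (next + 1) % nr_nodes;
	}
}
```

With something like this in place, a lookup for any possible cpu returns a valid node, even for cpus that are added later, which is the property the for_each_possible_cpu() users rely on.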