* Ingo Molnar <[EMAIL PROTECTED]> wrote: > > The problem is that for each cache, you have an "per-node alien > > queues" for each node (see struct kmem_cache nodelists -> struct > > kmem_list3 alien). Moving slab metadata to struct page solves this > > but now you can only have one "queue" thats part of the same struct. > > yes, it's what i referred to as "distributed, per node cache". It has > no "quadratic overhead". It has SLAB memory spread out amongst nodes. > I.e. 1 million pages are distributed amongst 1k nodes with 1000 pages > per node with each node having 1 page. > > But that memory is not lost and it's disingenous to call it 'overhead' > and it very much comes handy for performance _IF_ there's global > workload that involves cross-node allocations. It's simply a cache > that is mis-sized and mis-constructed on large node count systems but > i bet it makes quite a performance difference on low-node-count > systems. > > On high node-count systems it might make sense to reduce the amount of > cross-node caching and to _structure_ the distributed NUMA SLAB cache > in an intelligent way (perhaps along cpuset boundaries) [...]
i think much of this could be achieved by delaying the creation of alien caches up until the point a { node X } -> { node Y } alloc/free relationship gets established. So if a system is partitioned along cpusets, vast areas of the NxN matrix never gets populated with alien caches. Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/