On Wed, 2007-05-16 at 13:27 -0700, Christoph Lameter wrote: > On Wed, 16 May 2007, Peter Zijlstra wrote: > > > > So its no use on NUMA? > > > > It is, its just that we're swapping very heavily at that point, a > > bouncing cache-line will not significantly slow down the box compared to > > waiting for block IO, will it? > > How does all of this interact with > > 1. cpusets > > 2. dma allocations and highmem? > > 3. Containers?
Much like the normal kmem_cache would do; I'm not changing any of the page allocation semantics. For containers it could be that the machine is not actually swapping but the container will be in dire straights. > > > The problem here is that you may spinlock and take out the slab for one > > > cpu but then (AFAICT) other cpus can still not get their high priority > > > allocs satisfied. Some comments follow. > > > > All cpus are redirected to ->reserve_slab when the regular allocations > > start to fail. > > And the reserve slab is refilled from page allocator reserves if needed? Yes, using new_slab(), exacly as it would normally be. > > > But this is only working if we are using the slab after > > > explicitly flushing the cpuslabs. Otherwise the slab may be full and we > > > get to alloc_slab. > > > > /me fails to parse. > > s->cpu[cpu] is only NULL if the cpu slab was flushed. This is a pretty > rare case likely not worth checking. Ah, right: - !page || !page->freelist - and no available partial slabs. then we try the reserve (if we're entiteld). > > > Remove the above two lines (they are wrong regardless) and simply make > > > this the cpu slab. > > > > It need not be the same node; the reserve_slab is node agnostic. > > So here the free page watermarks are good again, and we can forget all > > about the ->reserve_slab. We just push it on the free/partial lists and > > forget about it. > > > > But like you said above: unfreeze_slab() should be good, since I don't > > use the lockless_freelist. > > You could completely bypass the regular allocation functions and do > > object = s->reserve_slab->freelist; > s->reserve_slab->freelist = object[s->reserve_slab->offset]; That is basically what happens at the end; if an object is returned from the reserve slab. But its wanted to try the normal cpu_slab path first to detect that the situation has subsided and we can resume normal operation. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/