On Tue, 2007-08-21 at 14:29 -0700, Christoph Lameter wrote: > On Tue, 21 Aug 2007, Peter Zijlstra wrote: > > > It quickly ends up with all of memory in the laundry list and then > > recursing into __alloc_pages which will fail to make progress and OOMs. > > Hmmmm... Okay that needs to be addressed. Reserves need to be used and we > only should enter reclaim if that runs out (like the first patch that I > did). > > > But aside from the numerous issues with the patch set as presented, I'm > > not seeing the seeing the big picture, why are you doing this. > > I want general improvements to reclaim to address the issues that you see > and other issues related to reclaim instead of the strange code that makes > PF_MEMALLOC allocs compete for allocations from a single slab and putting > logic into the kernel to decide which allocs to fail. We can reclaim after > all. Its just a matter of finding the right way to do this.
The latest patch I posted got rid of that global slab. Also, all I want is for slab to honour gfp flags like page allocation does, nothing more, nothing less. (well, actually slightly less, since I'm only really interrested in the ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER -> ALLOC_NO_WATERMARKS transition and not all higher ones) I want slab to fail when a similar page alloc would fail, no magic. Strictly speaking: if: page = alloc_page(gfp); fails but: obj = kmem_cache_alloc(s, gfp); succeeds then its a bug. But I'm not actually needing it that strict, just the ALLOC_NO_WATERMARK part needs to be done, ALLOC_HARDER, ALLOC_HIGH those may fudge a bit. > > Anonymous pages are a there to stay, and we cannot tell people how to > > use them. So we need some free or freeable pages in order to avoid the > > vm deadlock that arises from all memory dirty. > > No one is trying to abolish Anonymous pages. Free memory is readily > available on demand if one calls reclaim. Your scheme introduces complex > negotiations over a few scraps of memory when large amounts of memory > would still be readily available if one would do the right thing and call > into reclaim. This is the thing I contend, there need not be large amounts of memory around. In my test prog the hot code path fits into a single page, the rest can be anonymous. > > 'Optimizing' this by switching to freeable pages has mainly > > disadvantages IMHO, finding them scrambles LRU order and complexifies > > relcaim and all that for a relatively small gain in space for clean > > pagecache pages. > > Sounds like you would like to change the way we handle memory in general > in the VM? Reclaim (and thus finding freeable pages) is basic to Linux > memory management. Not quite, currently we have free pages in the reserves, if you want to replace some (or all) of that by freeable pages then that is a change. I'm just using the reserves. > > Please, stop writing patches and write down a solid proposal of how you > > envision the VM working in the various scenarios and why its better than > > the current approach. > > Sorry I just got into this a short time ago and I may need a few cycles > to get this all straight. An approach that uses memory instead of > ignoring available memory is certainly better. Sure if and when possible. There will always be need to fall back to the reserves. A bit off-topic, re that reclaim from atomic context: Currently we try to hold spinlocks only for short periods of time so that reclaim can be preempted, if you run all of reclaim from a non-preemptible context you get very large preemption latencies and if done from int context it'd also generate large int latencies. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/