On Thu, 13 Sep 2007, Peter Zijlstra wrote:

> > > Every user of memory relies on the VM, and we only get into trouble if
> > > the VM in turn relies on one of these users. Traditionally that has only
> > > been the block layer, and we special cased that using mempools and
> > > PF_MEMALLOC.
> > >
> > > Why do you object to me doing a similar thing for networking?
> >
> > I have not seen you using mempools for the networking layer. I would not
> > object to such a solution. It already exists for other subsystems.
>
> Dude, listen, how often do I have to say this: I cannot use mempools for
> the network subsystem because it's built on kmalloc! What I've done is
> build a replacement for mempools - a reserve system - that works
> similarly to mempools but also provides the flexibility of kmalloc.
>
> That is all, no more, no less.
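For reference, the mempool guarantee the block layer has traditionally relied
on looks roughly like the sketch below (illustrative only, not Peter's reserve
patches; the cache name, element size, and pool depth are made up). It only
works because every element comes from one slab cache of a known, fixed
size -- which is exactly what the network stack's kmalloc()-based
allocations do not give you.

/* Illustrative sketch only -- not the reserve patches.  This is the
 * traditional mempool guarantee: a pre-allocated pool of fixed-size
 * elements that mempool_alloc() falls back to when the page allocator
 * cannot make progress. */
#include <linux/mempool.h>
#include <linux/slab.h>

static struct kmem_cache *req_cache;	/* made-up names */
static mempool_t *req_pool;

static int example_setup(void)
{
	req_cache = kmem_cache_create("example_req", 256, 0, 0, NULL);
	if (!req_cache)
		return -ENOMEM;

	/* Keep 16 elements in reserve at all times. */
	req_pool = mempool_create_slab_pool(16, req_cache);
	if (!req_pool) {
		kmem_cache_destroy(req_cache);
		return -ENOMEM;
	}
	return 0;
}

static void *example_alloc(void)
{
	/* Guaranteed forward progress, but only because every element has
	 * the same known size -- unlike the variable-sized kmalloc()
	 * allocations the network stack needs. */
	return mempool_alloc(req_pool, GFP_NOIO);
}

Elements go back with mempool_free(elem, req_pool), which tops the reserve
back up as memory becomes available again.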
It's different because it becomes a privileged player that can suck all the
available memory out of the page allocator.

> I'm confused by this; I've never claimed any such thing. All
> I'm saying is that because of the circular dependency between the VM and
> the IO subsystem used for swap (not file-backed paging [*], just swap)
> you have to do something special to avoid deadlocks.

How are dirty file-backed pages different? They may also be written out by
the VM during reclaim.

> > Replacing the mempools for the block layer sounds pretty good. But how do
> > these various subsystems that may live in different portions of the system
> > for various devices avoid global serialization and livelock through your
> > system?
>
> The reserves are spread over all kernel-mapped zones, the slab allocator
> is still per cpu, the page allocator tries to get pages from the nearest
> node.

But it seems that you now have unbounded allocations with PF_MEMALLOC for
the networking case? So networking can exhaust all reserves?

> > And how is fairness addressed? I may want to run a fileserver on
> > some nodes and an HPC application that relies on a Fibre Channel connection
> > on other nodes. How do we guarantee that the HPC application is not
> > impacted if the network services of the fileserver flood the system with
> > messages and exhaust memory?
>
> The network system reserves A pages, the block layer reserves B pages;
> once they start getting pages from the reserves they go bean counting,
> and once they reach their respective limit they stop.

That sounds good.
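To make the bean-counting idea concrete, here is a hypothetical sketch (the
struct, function names, limits, and gfp flags are mine, not the actual API
from the patch set): each subsystem declares how many pages it may pull from
the emergency reserves, pages are charged only once normal allocation fails,
and the subsystem backs off when it reaches its limit.

#include <linux/atomic.h>
#include <linux/gfp.h>
#include <linux/mm.h>

struct mem_reserve {
	atomic_long_t	used;	/* pages currently charged to this subsystem */
	long		limit;	/* pages this subsystem may take from reserve */
};

static struct mem_reserve net_reserve   = { .limit = 1024 };	/* the "A" pages */
static struct mem_reserve block_reserve = { .limit = 256 };	/* the "B" pages */

/* Charge one page against the subsystem's share; refuse once the limit
 * is reached so one user cannot drain the whole emergency pool. */
static bool reserve_charge(struct mem_reserve *res)
{
	if (atomic_long_add_return(1, &res->used) > res->limit) {
		atomic_long_dec(&res->used);
		return false;
	}
	return true;
}

static void reserve_uncharge(struct mem_reserve *res)
{
	atomic_long_dec(&res->used);
}

static struct page *reserve_alloc_page(struct mem_reserve *res, gfp_t gfp)
{
	struct page *page = alloc_page(gfp);

	if (page)
		return page;		/* normal path, nothing charged */

	if (!reserve_charge(res))
		return NULL;		/* this subsystem used up its share */

	/* Emergency attempt; the real patches use PF_MEMALLOC-style access
	 * to the reserves rather than just __GFP_HIGH. */
	page = alloc_page(gfp | __GFP_HIGH);
	if (!page)
		reserve_uncharge(res);
	return page;
}

Freeing would do the reverse and uncharge any page that came out of the
reserve, so the count only tracks outstanding emergency pages.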