Re: the new VM

2000-10-09 Thread Christoph Rohland
Rik van Riel <[EMAIL PROTECTED]> writes: > Hmmm, could you help me with drawing up a selection algorithm > on how to choose which SHM segment to destroy when we run OOM? > > The criteria would be about the same as with normal programs: > > 1) minimise the amount of work lost > 2) try to protect

Re: the new VM

2000-10-06 Thread Rik van Riel
[replying to a really old email now that I've started work on integrating the OOM handler] On 25 Sep 2000, Christoph Rohland wrote: > Rik van Riel <[EMAIL PROTECTED]> writes: > > > > Because as you said the machine can lockup when you run out of memory. > > > > The fix for this is to kill a us

Re: the new VM

2000-09-27 Thread Andrea Arcangeli
On Wed, Sep 27, 2000 at 09:42:45AM +0200, Ingo Molnar wrote: > such screwups by checking for NULL and trying to handle it. I suggest to > rather fix those screwups. How do you know which is the minimal amount of RAM that allows you not to be in the screwedup state? We for sure need a kind of cou

Re: the new VM

2000-09-27 Thread yodaiken
On Wed, Sep 27, 2000 at 09:42:45AM +0200, Ingo Molnar wrote: > > On Tue, 26 Sep 2000, Pavel Machek wrote: > of the VM allocation issues. Returning NULL in kmalloc() is just a way to > say: 'oops, we screwed up somewhere'. And i'd suggest to not work around That is not at all how it is currently

Re: the new VM

2000-09-27 Thread Ingo Molnar
On Tue, 26 Sep 2000, Pavel Machek wrote: > Okay, I'm user on small machine and I'm doing stupid thing: I've got > 6MB ram, and I keep inserting modules. I insert module_1mb.o. Then I > insert module_1mb.o. Repeat. How does it end? I think that > kmalloc(GFP_KERNEL) *has* to return NULL at some p
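Pavel's repeated-insmod scenario can be modelled in user space. What follows is a minimal sketch, not kernel code: `struct mem_budget`, `budget_alloc()` and `load_module_model()` are hypothetical names, and the model assumes all 6MB are available to modules, which no real machine would allow. It only illustrates a caller that checks the allocator's return value and fails with -ENOMEM rather than assuming the allocation succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for "how much RAM is left". */
struct mem_budget {
    size_t remaining;
};

/* Returns NULL once the budget is exhausted, like a kmalloc() that may fail. */
void *budget_alloc(struct mem_budget *b, size_t size)
{
    void *p;

    assert(b != NULL);
    if (size > b->remaining)
        return NULL;
    p = malloc(size);
    if (p != NULL)
        b->remaining -= size;
    return p;
}

/* Models a module load: check the return value, report -ENOMEM on failure. */
int load_module_model(struct mem_budget *b, size_t size, void **out)
{
    void *image;

    assert(out != NULL);
    image = budget_alloc(b, size);
    if (image == NULL)
        return -ENOMEM;
    *out = image;
    return 0;
}
```

With a 6MB budget and 1MB "modules", the first six loads succeed and the seventh returns -ENOMEM, which is the NULL-return case Pavel describes.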

Re: the new VM

2000-09-26 Thread Andrea Arcangeli
On Tue, Sep 26, 2000 at 09:10:16PM +0200, Pavel Machek wrote: > Hi! > > > i talked about GFP_KERNEL, not GFP_USER. Even in the case of GFP_USER i > > > > My bad, you're right I was talking about GFP_USER indeed. > > > > But even GFP_KERNEL allocations like the init of a module or any other thing

Re: the new VM

2000-09-26 Thread Pavel Machek
Hi! > > i talked about GFP_KERNEL, not GFP_USER. Even in the case of GFP_USER i > > My bad, you're right I was talking about GFP_USER indeed. > > But even GFP_KERNEL allocations like the init of a module or any other thing > that is static sized during production just checking the retval > looks

Re: the new VM

2000-09-25 Thread Christoph Rohland
Hi Rik, Rik van Riel <[EMAIL PROTECTED]> writes: > > Because as you said the machine can lockup when you run out of memory. > > The fix for this is to kill a user process when you're OOM > (you need to do this anyway). > > The last few allocations of the "condemned" process can come > from th

Re: the new VM

2000-09-25 Thread Rik van Riel
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > On Mon, Sep 25, 2000 at 04:27:24PM +0200, Ingo Molnar wrote: > > i think an application should not fail due to other applications > > allocating too much RAM. OOM behavior should be a central thing and based > > At least Linus's point is that doing p

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 04:40:44PM +0100, Stephen C. Tweedie wrote: > Allowing GFP_ATOMIC to eat PF_MEMALLOC's last-chance pages is the > wrong thing to do if we want to guarantee swapper progress under > extreme load. You're definitely right. We at least need the guarantee of the memory to alloca

Re: the new VM

2000-09-25 Thread Stephen C. Tweedie
Hi, On Mon, Sep 25, 2000 at 03:47:03PM +0100, Alan Cox wrote: > > GFP_KERNEL has to be able to fail for 2.4. Otherwise you can get everything > jammed in kernel space waiting on GFP_KERNEL and if the swapper cannot make > space you die. We already have PF_MEMALLOC to provide a last-chance alloc
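The idea of a last-chance reserve that only the reclaim path may touch can be sketched with a toy page pool. This is an illustrative model under stated assumptions, not the 2.4 allocator: `struct page_pool`, `ALLOC_MEMALLOC` and `pool_alloc_page()` are hypothetical names, and the flag merely stands in for a task marked PF_MEMALLOC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags: ALLOC_MEMALLOC models a caller with PF_MEMALLOC set. */
enum alloc_flags { ALLOC_NORMAL = 0, ALLOC_MEMALLOC = 1 };

struct page_pool {
    unsigned long free_pages;    /* pages currently free */
    unsigned long reserve_pages; /* pages held back for the reclaim path */
};

/* Returns 1 if a page was handed out, 0 if the request must fail. */
int pool_alloc_page(struct page_pool *pool, int flags)
{
    unsigned long min_left;

    assert(pool != NULL);
    /* Ordinary callers must leave the reserve untouched. */
    min_left = (flags & ALLOC_MEMALLOC) ? 0 : pool->reserve_pages;
    if (pool->free_pages <= min_left)
        return 0;
    pool->free_pages--;
    return 1;
}
```

In this model an ordinary request fails as soon as only the reserve is left, while a request flagged ALLOC_MEMALLOC can still drain the remaining pages, so the reclaim path keeps making progress.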

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 05:16:06PM +0200, Ingo Molnar wrote: > situation is just 1% RAM away from the 'root cannot log in', situation. The root cannot log in is a little different. Just think that in the "root cannot log in" you only need to press SYSRQ+E (or at worst +I). If all tasks in the sy

Re: the new VM

2000-09-25 Thread yodaiken
On Mon, Sep 25, 2000 at 05:26:59PM +0200, Ingo Molnar wrote: > > On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > > > > i think the GFP_USER case should do the oom logic within __alloc_pages(), > > > > What's the difference of implementing the logic outside alloc_pages? > > Putting the logic insi

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Alan Cox wrote: > Unless I'm missing something here think about this case > > 2 active processes, no swap > > #1 #2 > kmalloc 32K kmalloc 16K > OK OK > kmalloc 16K

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > > i think the GFP_USER case should do the oom logic within __alloc_pages(), > > What's the difference of implementing the logic outside alloc_pages? > Putting the logic inside looks not clean design to me. it gives consistency and simplicity. The

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 05:10:43PM +0200, Ingo Molnar wrote: > a SIGKILL? i agree with the 2.2 solution - first a soft signal, and if > it's being ignored then a SIGKILL. Actually we do the soft signal try (SIGTERM) only if the task was running with iopl privileges (and that means on alpha and o

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Alan Cox wrote: > GFP_KERNEL has to be able to fail for 2.4. Otherwise you can get > everything jammed in kernel space waiting on GFP_KERNEL and if the > swapper cannot make space you die. if one can get everything jammed waiting for GFP_KERNEL, and not being able to deallo

Re: Swap on RAID; was: Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000 [EMAIL PROTECTED] wrote: > > this is fixed in 2.4. The 2.2 RAID code is frozen, and has known > > limitations (ie. due to the above RAID1 cannot be used as a swap-device). > as commonly patched in by RedHat? Should I instead use a swap file > for a machine that should be fa

Swap on RAID; was: Re: the new VM

2000-09-25 Thread parsley
Ingo Molnar wrote: > this is fixed in 2.4. The 2.2 RAID code is frozen, and has known > limitations (ie. due to the above RAID1 cannot be used as a swap-device). Eh, just to be clear about this: does this apply to the RAID 0.90 code as commonly patched in by RedHat? Should I instead use a swap

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > Signal can be trapped and ignored by malicious task. [...] a SIGKILL? i agree with the 2.2 solution - first a soft signal, and if it's being ignored then a SIGKILL. > But my question isn't what you do when you're OOM, but is _how_ do you > notice
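The "soft signal first, then SIGKILL" policy has a straightforward user-space analogue using POSIX kill(2). The sketch below is not how the kernel's OOM path delivers signals; `kill_with_escalation()` and its grace period are hypothetical, chosen only to illustrate the escalation order.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: send SIGTERM, wait grace_ms milliseconds, and send
 * SIGKILL if the process still exists. Returns the last signal sent, or -1
 * if the first kill() fails.
 */
int kill_with_escalation(pid_t pid, int grace_ms)
{
    struct timespec grace;

    assert(grace_ms >= 0);
    grace.tv_sec = grace_ms / 1000;
    grace.tv_nsec = (long)(grace_ms % 1000) * 1000000L;

    if (kill(pid, SIGTERM) == -1)
        return -1;

    /* Give the task a chance to react to the soft signal. */
    nanosleep(&grace, NULL);

    /* Signal 0 sends nothing; it only probes whether the pid exists. */
    if (kill(pid, 0) == -1 && errno == ESRCH)
        return SIGTERM;

    /* Still there: SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) == -1)
        return errno == ESRCH ? SIGTERM : -1;
    return SIGKILL;
}
```

One caveat of this sketch: a child that has exited but has not yet been reaped still exists as far as `kill(pid, 0)` is concerned, so it receives a redundant (and harmless) SIGKILL.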

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 04:43:44PM +0200, Ingo Molnar wrote: > i talked about GFP_KERNEL, not GFP_USER. Even in the case of GFP_USER i My bad, you're right I was talking about GFP_USER indeed. But even GFP_KERNEL allocations like the init of a module or any other thing that is static sized durin

Re: the new VM

2000-09-25 Thread Alan Cox
> > Because as you said the machine can lockup when you run out of memory. > > well, i think all kernel-space allocations have to be limited carefully, > denying succeeding allocations is not a solution against over-allocation, > especially in a multi-user environment. GFP_KERNEL has to be able

Re: the new VM

2000-09-25 Thread Rik van Riel
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > On Mon, Sep 25, 2000 at 03:02:58PM +0200, Ingo Molnar wrote: > > On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > > > > > Sorry I totally disagree. If GFP_KERNEL are guaranteed to succeed > > > that is a showstopper bug. [...] > > > > why? > > Becau

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 11:26:48AM -0300, Marcelo Tosatti wrote: > This thread keeps freeing pages from the inactive clean list when needed > (when zone->free_pages < zone->pages_low), making them available for > atomic allocations. This is flawed. It's the irq that have to shrink the memory itse
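The refill rule Marcelo describes (move pages from the inactive clean list to the free list while `zone->free_pages < zone->pages_low`) can be sketched as a toy counter model. Only `free_pages` and `pages_low` come from the quoted text; `struct zone_model`, the `inactive_clean_pages` counter and `refill_free_pages()` are assumptions made for illustration, and real pages are of course not plain counters.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a zone: counters only, no real page lists. */
struct zone_model {
    unsigned long free_pages;           /* pages on the free list */
    unsigned long pages_low;            /* low watermark */
    unsigned long inactive_clean_pages; /* reclaimable clean pages (assumed name) */
};

/*
 * Move clean pages to the free list until the low watermark is reached
 * or no clean pages remain. Returns the number of pages moved.
 */
unsigned long refill_free_pages(struct zone_model *zone)
{
    unsigned long moved = 0;

    assert(zone != NULL);
    while (zone->free_pages < zone->pages_low &&
           zone->inactive_clean_pages > 0) {
        zone->inactive_clean_pages--;
        zone->free_pages++;
        moved++;
    }
    return moved;
}
```

Andrea's objection in the reply concerns who performs this work (a thread versus the irq path itself), which the model deliberately leaves out.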

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > At least Linus's point is that doing perfect accounting (at least on > the userspace allocation side) may cause you to waste resources, > failing even if you could still run and I tend to agree with him. > We're lazy on that side and that's global w

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 04:27:24PM +0200, Ingo Molnar wrote: > i think an application should not fail due to other applications > allocating too much RAM. OOM behavior should be a central thing and based At least Linus's point is that doing perfect accounting (at least on the userspace allocation

Re: the new VM

2000-09-25 Thread Marcelo Tosatti
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > I talked with Alexey about this and it seems the best way is to have a > per-socket reservation of clean cache in function of the receive window. So we > don't need an huge atomic pool but we can have a special lru with an irq > spinlock that is

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > I'm not sure if we should restrict the limiting only to the cases that > needs them. For example do_anonymous_page looks a place that could > rely on the GFP retval. i think an application should not fail due to other applications allocating too mu

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 04:04:14PM +0200, Ingo Molnar wrote: > exactly, and this is why if a higher level lets through a GFP_KERNEL, then > it *must* succeed. Otherwise either the higher level code is buggy, or the > VM balance is buggy, but we want to have clear signs of it. I'm not sure if we s

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 03:39:51PM +0200, Ingo Molnar wrote: > Andrea, if you really mean this then you should not be let near the VM > balancing code :-) What I mean is that the VM balancing is in the lower layer that knows anything about the per-socket gigabit ethernet skbs limits, the limit sh

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 03:47:57PM +0200, Ingo Molnar wrote: > this was actually coded/fixed by Neil Brown - so the kudos go to him! Indeed :). Andrea - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at ht

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > Again: the bean counting and all the limit happens at the higher > layer. I shouldn't know anything about it when I play with the lower > layer GFP memory balancing code. exactly, and this is why if a higher level lets through a GFP_KERNEL, then i

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > > yes. every RAID1-bh has a bound lifetime. (bound by worst-case IO > > latencies) > > Very good! Many thanks Ingo. this was actually coded/fixed by Neil Brown - so the kudos go to him! Ingo

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > And if the careful limit avoids the deadlock in the layer above > alloc_pages, then it will also avoid alloc_pages to return NULL and > you won't need an infinite loop in first place (unless the memory > balancing is buggy). yes i like this propert

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 03:21:01PM +0200, Ingo Molnar wrote: > yes. every RAID1-bh has a bound lifetime. (bound by worst-case IO > latencies) Very good! Many thanks Ingo. Andrea

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 03:12:58PM +0200, Ingo Molnar wrote: > well, i think all kernel-space allocations have to be limited carefully, When a machine without a gigabit ethernet runs oom it's userspace that allocated the memory via page faults not the kernel. And if the careful limit avoids the

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > Is it safe to sleep on the waitqueue in the kmalloc fail path in > raid1? yes. every RAID1-bh has a bound lifetime. (bound by worst-case IO latencies) Ingo

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > > huh, what do you mean? > > I mean this: > > while (!( /* FIXME: now we are rather fault tolerant than nice */ this is fixed in 2.4. The 2.2 RAID code is frozen, and has known limitations (ie. due to the above RAID1 cannot be used

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 03:04:10PM +0200, Ingo Molnar wrote: > > On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > > > Please fix raid1 instead of making things worse. > > huh, what do you mean? I mean this: while (!( /* FIXME: now we are rather fault tolerant than nice */

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > > > Sorry I totally disagree. If GFP_KERNEL are guaranteed to succeed > > > that is a showstopper bug. [...] > > > > why? > > Because as you said the machine can lockup when you run out of memory. well, i think all kernel-space allocations have

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 03:02:58PM +0200, Ingo Molnar wrote: > > On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > > > Sorry I totally disagree. If GFP_KERNEL are guaranteed to succeed > > that is a showstopper bug. [...] > > why? Because as you said the machine can lockup when you run out of me

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > Please fix raid1 instead of making things worse. huh, what do you mean? Ingo

Re: the new VM

2000-09-25 Thread Ingo Molnar
On Mon, 25 Sep 2000, Andrea Arcangeli wrote: > Sorry I totally disagree. If GFP_KERNEL are guaranteed to succeed > that is a showstopper bug. [...] why? > machine power for simulations runs out of memory all the time. If you > put this kind of obvious deadlock into the main kernel allocator

Re: the new VM

2000-09-25 Thread Andrea Arcangeli
On Mon, Sep 25, 2000 at 12:42:09PM +0200, Ingo Molnar wrote: > believe could simplify unrelated kernel code significantly. Eg. no need to > check for NULL pointers on most allocations, a GFP_KERNEL allocation > always succeeds, end of story. This behavior also has the 'nice' Sorry I totally disag