On Sat, Nov 25, 2006 at 11:11:53PM -0800, Andrew Morton wrote:
> On Sat, 25 Nov 2006 22:00:45 -0500
> Dave Jones <[EMAIL PROTECTED]> wrote:
>
> > On Sat, Nov 25, 2006 at 01:28:28PM -0800, Andrew Morton wrote:
> > > On Sat, 25 Nov 2006 13:03:45 -0800
> > > "Martin J. Bligh" <[EMAIL PROTECTED]> wrote:
> > >
> > > > On 2.6.18-rc7 and later during LTP:
> > > > http://test.kernel.org/abat/48393/debug/console.log
> > >
> > > The traces are a bit confusing, but I don't actually see anything wrong
> > > there. The machine has used up all swap, has used up all memory and has
> > > correctly gone and killed things. After that, there's free memory again.
> >
> > We covered this a month or two back. For RHEL5, we've ended up
> > reintroducing the oom killer prevention logic that we had up until
> > circa 2.6.10. It seemed that there exist circumstances where
> > given a little more time, some memory hogging apps will run to completion
> > allowing other allocators to succeed instead of being killed.
>
> I _think_ what you're describing here is a false-positive oom-killing? But
> Martin appears to be hitting a genuine oom.

What we saw during the RHEL5 testing was that yes, the machine _was_ OOM
*temporarily*, but if instead of killing the task trying to allocate,
we postponed the killing a few times, it would give other tasks the
opportunity to complete writeout, or free up memory some other way,
allowing the allocating process to succeed shortly afterwards.
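
To make the "postpone the killing a few times" idea concrete, here is a toy
userspace sketch. It is not Larry's patch and not kernel code; the function
name, the `max_deferrals` knob and the list of free-page samples are all
made up for illustration. It only shows the decision being described: on an
allocation failure, defer the kill up to a small limit and retry, and only
kill once the limit is used up.

```python
def allocate_with_deferred_oom(free_pages_samples, pages_needed, max_deferrals=3):
    """Toy model of deferring an OOM kill (hypothetical, not kernel code).

    free_pages_samples: free-page counts seen on successive allocation
                        attempts (an assumed stand-in for real reclaim).
    pages_needed:       size of the allocation being attempted.
    max_deferrals:      how many times the kill may be postponed (assumed knob).

    Returns a tuple (outcome, deferrals_used), where outcome is either
    "allocated" or "oom_kill".
    """
    deferrals = 0
    for free_pages in free_pages_samples:
        if free_pages >= pages_needed:
            # Memory came back (e.g. writeout completed elsewhere).
            return ("allocated", deferrals)
        if deferrals >= max_deferrals:
            # Still short of memory after the allowed postponements.
            return ("oom_kill", deferrals)
        # Postpone the kill and give other tasks a chance to free memory.
        deferrals += 1
    # Ran out of samples without ever satisfying the allocation.
    return ("oom_kill", deferrals)


if __name__ == "__main__":
    # Temporarily OOM: memory frees up on the third attempt.
    print(allocate_with_deferred_oom([0, 0, 8], pages_needed=4))
    # Genuinely OOM: nothing ever frees up, so the kill still happens.
    print(allocate_with_deferred_oom([0, 0, 0, 0, 0], pages_needed=4))
```

With `max_deferrals=0` the model kills on the first failed attempt, which is
the behaviour being compared against. The two cases in the thread (a
temporary shortage versus a genuine oom) differ only in whether the
free-page samples ever recover.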

> But it does appear that some changes are needed, because lots of things got
> oom-killed.
>
> I think. Maybe not - there's no timestamping in those logs and it is of
> course possible that we're seeing unrelated ooms which happened a long time
> apart.

Maybe, but it does sound spookily familiar.
The last time Larry's patch got floated to lkml it was met with
"Ah!, but we have new oom killer changes in -git which might solve this".

We tried them. They didn't.

	Dave

-- 
http://www.codemonkey.org.uk