Hi,

> -----Original Message-----
> From: David Rientjes [mailto:rient...@google.com]
> Sent: Thursday, October 15, 2015 3:35 AM
> To: PINTU KUMAR
> Cc: a...@linux-foundation.org; minc...@kernel.org; d...@stgolabs.net;
> mho...@suse.cz; koc...@gmail.com; han...@cmpxchg.org;
> penguin-kernel@i-love.sakura.ne.jp; bywxiao...@163.com; mgor...@suse.de;
> vba...@suse.cz; js1...@gmail.com; kirill.shute...@linux.intel.com;
> alexander.h.du...@redhat.com; sasha.le...@oracle.com; c...@linux.com;
> fengguang...@intel.com; linux-kernel@vger.kernel.org; linux...@kvack.org;
> c...@samsung.com; pintu_agar...@yahoo.com; pintu.p...@gmail.com;
> vishnu...@samsung.com; rohit...@samsung.com; c.rajku...@samsung.com
> Subject: RE: [RESEND PATCH 1/1] mm: vmstat: Add OOM victims count in vmstat
> counter
>
> On Wed, 14 Oct 2015, PINTU KUMAR wrote:
>
> > For me it was very helpful during sluggish and long-duration ageing tests.
> > With this, I don't have to look into the logs manually.
> > I just monitor this count in a script.
> > The moment I see nr_oom_victims > 1, I know that a kernel OOM kill has
> > happened and I need to take the log dump.
> > So, then I do: dmesg >> oom_logs.txt
> > Or, even stop the tests for further tuning.
> >
>
> I think eventfd(2) was created for that purpose, to avoid the constant
> polling that you would have to do to check nr_oom_victims and then take
> a snapshot.
>
> > > I disagree with this one, because we can encounter oom kills due to
> > > fragmentation rather than low memory conditions for high-order
> > > allocations. The amount of free memory may be substantially higher
> > > than all zone watermarks.
> > >
> > AFAIK, kernel oom happens only for lower-order allocations
> > (<= PAGE_ALLOC_COSTLY_ORDER).
> > For higher-order allocations we get a page allocation failure instead.
> >
>
> Order-3 is included. I've seen machines with _gigabytes_ of free memory
> in ZONE_NORMAL on a node and have an order-3 page allocation failure
> that called the oom killer.
Yes, if PAGE_ALLOC_COSTLY_ORDER is defined as 3, then order-3 allocations
will be included for OOM. But that's fine. We are just interested in knowing
whether the system entered the OOM state. That's the reason I earlier added
an _oom_stall_ counter as well, to know whether the system ever entered OOM
but resulted in a page allocation failure instead of an OOM kill.
> > > We've long had a desire to have a better oom reporting mechanism
> > > rather than just the kernel log. It seems like you're feeling the
> > > same pain. I think it would be better to have an eventfd notifier
> > > for system oom conditions so we can track kernel oom kills (and
> > > conditions) in userspace. I have a patch for that, and it works
> > > quite well when userspace is mlocked with a buffer in memory.
> > >
> > Ok, this would be interesting.
> > Can you point me to the patches?
> > I will quickly check if it is useful for us.
> >
>
> https://lwn.net/Articles/589404. It's invasive and isn't upstream. I
> would like to restructure that patchset to avoid the memcg trickery and
> allow for a root-only eventfd(2) notification through procfs on system
> oom.

I am interested only in the global OOM case, not memcg. We have memcg
enabled, but I think even a memcg OOM will finally invoke
_oom_kill_process_. So, I am interested in a patchset that can trigger
notifications from oom_kill_process as soon as any victim is killed.
Sorry, from your patchset I could not actually locate the system_oom
notification patch.
If you have a similar patchset, please point me to it. It will be really
helpful.

Thank you!