Hi,

Sorry, I forgot to mention the V2 update.
I will highlight the V2 changes and RESEND.

> -----Original Message-----
> From: Pintu Kumar [mailto:pint...@samsung.com]
> Sent: Monday, October 12, 2015 7:03 PM
> To: a...@linux-foundation.org; minc...@kernel.org; d...@stgolabs.net;
> pint...@samsung.com; mho...@suse.cz; koc...@gmail.com;
> rient...@google.com; han...@cmpxchg.org; penguin-kernel@i-
> love.sakura.ne.jp; bywxiao...@163.com; mgor...@suse.de; vba...@suse.cz;
> js1...@gmail.com; kirill.shute...@linux.intel.com;
> alexander.h.du...@redhat.com; sasha.le...@oracle.com; c...@linux.com;
> fengguang...@intel.com; linux-kernel@vger.kernel.org; linux...@kvack.org
> Cc: c...@samsung.com; pintu_agar...@yahoo.com; pintu.p...@gmail.com;
> vishnu...@samsung.com; rohit...@samsung.com; c.rajku...@samsung.com;
> sreena...@samsung.com
> Subject: [PATCH 1/1] mm: vmstat: Add OOM victims count in vmstat counter
> 
> This patch maintains the number of oom victims kill count in /proc/vmstat.
> Currently, we are dependent upon kernel logs when the kernel OOM occurs.
> But kernel OOM can went passed unnoticed by the developer as it can silently
> kill some background applications/services.
> In some small embedded system, it might be possible that OOM is captured in
> the logs but it was over-written due to ring-buffer.
> Thus this interface can quickly help the user in analyzing, whether there were
> any OOM kill happened in the past, or whether the system have ever entered
> the oom kill stage till date.
> 
> Thus, it can be beneficial under following cases:
> 1. User can monitor kernel oom kill scenario without looking into the
>    kernel logs.
> 2. It can help in tuning the watermark level in the system.
> 3. It can help in tuning the low memory killer behavior in user space.
> 4. It can be helpful on a logless system or if klogd logging
>    (/var/log/messages) are disabled.
> 
> A snapshot of the result of 3 days of over night test is shown below:
> System: ARM Cortex A7, 1GB RAM, 8GB EMMC
> Linux: 3.10.xx
> Category: reference smart phone device
> Loglevel: 7
> Conditions: Fully loaded, BT/WiFi/GPS ON
> Tests: auto launching of ~30+ apps using test scripts, in a loop for
> 3 days.
> At the end of tests, check:
> $ cat /proc/vmstat
> nr_oom_victims 6
> 
> As we noticed, there were around 6 oom kill victims.
> 
> The OOM is bad for any system. So, this counter can help in quickly tuning the
> OOM behavior of the system, without depending on the logs.
> 
> Signed-off-by: Pintu Kumar <pint...@samsung.com>
> ---
>  include/linux/vm_event_item.h |    1 +
>  mm/oom_kill.c                 |    2 ++
>  mm/page_alloc.c               |    1 -
>  mm/vmstat.c                   |    1 +
>  4 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
> index 2b1cef8..dd2600d 100644
> --- a/include/linux/vm_event_item.h
> +++ b/include/linux/vm_event_item.h
> @@ -57,6 +57,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN,
> PSWPOUT,  #ifdef CONFIG_HUGETLB_PAGE
>               HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,  #endif
> +             NR_OOM_VICTIMS,
>               UNEVICTABLE_PGCULLED,   /* culled to noreclaim list */
>               UNEVICTABLE_PGSCANNED,  /* scanned for reclaimability */
>               UNEVICTABLE_PGRESCUED,  /* rescued from noreclaim list */
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 03b612b..802b8a1 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -570,6 +570,7 @@ void oom_kill_process(struct oom_control *oc, struct
> task_struct *p,
>        * space under its control.
>        */
>       do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
> +     count_vm_event(NR_OOM_VICTIMS);
>       mark_oom_victim(victim);
>       pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-
> rss:%lukB\n",
>               task_pid_nr(victim), victim->comm, K(victim->mm->total_vm),
> @@ -600,6 +601,7 @@ void oom_kill_process(struct oom_control *oc, struct
> task_struct *p,
>                               task_pid_nr(p), p->comm);
>                       task_unlock(p);
>                       do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
> +                     count_vm_event(NR_OOM_VICTIMS);
>               }
>       rcu_read_unlock();
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 9bcfd70..fafb09d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2761,7 +2761,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned
> int order,
>               schedule_timeout_uninterruptible(1);
>               return NULL;
>       }
> -
>       /*
>        * Go through the zonelist yet one more time, keep very high watermark
>        * here, this is only to catch a parallel oom killing, we must fail if
diff --git
> a/mm/vmstat.c b/mm/vmstat.c index 1fd0886..8503a2e 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -808,6 +808,7 @@ const char * const vmstat_text[] = {
>       "htlb_buddy_alloc_success",
>       "htlb_buddy_alloc_fail",
>  #endif
> +     "nr_oom_victims",
>       "unevictable_pgs_culled",
>       "unevictable_pgs_scanned",
>       "unevictable_pgs_rescued",
> --
> 1.7.9.5

Regards,
Pintu

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to