On Thu, Aug 29, 2019 at 11:44 AM Qian Cai wrote:
>
> On Thu, 2019-08-29 at 09:09 -0700, Edward Chron wrote:
>
> > > Feel like you are going in circles to "sell" without any new information.
> > > If you need to deal with
On Thu, Aug 29, 2019 at 9:18 AM Michal Hocko wrote:
>
> On Thu 29-08-19 08:03:19, Edward Chron wrote:
> > On Thu, Aug 29, 2019 at 4:56 AM Michal Hocko wrote:
> [...]
> > > Or simply provide a hook with the oom_control to be called to report
> > > without replac
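For reference, Michal's idea of a reporting hook called from the OOM path resembles the OOM notifier chain the kernel already has; a minimal, hedged sketch of a module using that existing chain (illustrative only, not part of the posted series, and the notifier runs before victim selection rather than replacing the report) could look like this:

/*
 * Hedged sketch, not from the posted series: out_of_memory() walks an
 * OOM notifier chain before selecting a victim, so a module can emit
 * extra OOM-time reporting from there.  The callback runs in the OOM
 * path and therefore must stay allocation-free.
 */
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/oom.h>
#include <linux/printk.h>

static int oom_report_notify(struct notifier_block *nb,
                             unsigned long unused, void *freed)
{
        /* Extra reporting would go here; do not allocate memory. */
        pr_info("oom_report: OOM notifier invoked\n");
        return NOTIFY_OK;
}

static struct notifier_block oom_report_nb = {
        .notifier_call = oom_report_notify,
};

static int __init oom_report_init(void)
{
        return register_oom_notifier(&oom_report_nb);
}

static void __exit oom_report_exit(void)
{
        unregister_oom_notifier(&oom_report_nb);
}

module_init(oom_report_init);
module_exit(oom_report_exit);
MODULE_LICENSE("GPL");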
On Thu, Aug 29, 2019 at 8:42 AM Qian Cai wrote:
>
> On Thu, 2019-08-29 at 08:03 -0700, Edward Chron wrote:
> > On Thu, Aug 29, 2019 at 4:56 AM Michal Hocko wrote:
> > >
> > > On Thu 29-08-19 19:14:46, Tetsuo Handa wrote:
> > > > On 2019/08/29 16:11, Mic
On Thu, Aug 29, 2019 at 7:09 AM Tetsuo Handa wrote:
>
> On 2019/08/29 20:56, Michal Hocko wrote:
> >> But please be aware that, I REPEAT AGAIN, I don't think either eBPF or
> >> SystemTap will be suitable for dumping OOM information. An OOM situation means
> >> that even a single page fault event can
On Thu, Aug 29, 2019 at 12:11 AM Michal Hocko wrote:
>
> On Wed 28-08-19 12:46:20, Edward Chron wrote:
> [...]
> > Our belief is if you really think eBPF is the preferred mechanism
> > then move OOM reporting to an eBPF.
>
> I've said that all this additional i
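For context, "moving OOM reporting to eBPF" would mean loading a userspace-managed BPF program that probes the kernel's OOM path, roughly as in the hedged sketch below (assuming libbpf's helper headers; the probed symbol oom_kill_process is a kernel internal, not a stable ABI, and the helper name report_oom is made up, which is part of the maintenance concern raised in the thread):

/*
 * Hedged sketch: a BPF kprobe on the kernel's oom_kill_process(),
 * compiled with clang -target bpf against libbpf's bpf_helpers.h.
 * Kernel-internal symbols are not a stable ABI, so this is
 * illustrative only; real reporting would read victim fields with
 * helpers like bpf_probe_read().
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

SEC("kprobe/oom_kill_process")
int report_oom(void *ctx)
{
        /* Output lands in the trace pipe, not the OOM report itself. */
        bpf_printk("OOM kill path entered\n");
        return 0;
}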
On Thu, Aug 29, 2019 at 4:56 AM Michal Hocko wrote:
>
> On Thu 29-08-19 19:14:46, Tetsuo Handa wrote:
> > On 2019/08/29 16:11, Michal Hocko wrote:
> > > On Wed 28-08-19 12:46:20, Edward Chron wrote:
> > >> Our belief is if you really think eBPF is the preferr
On Wed, Aug 28, 2019 at 1:04 PM Edward Chron wrote:
>
> On Wed, Aug 28, 2019 at 3:12 AM Tetsuo Handa wrote:
> >
> > On 2019/08/28 16:08, Michal Hocko wrote:
> > > On Tue 27-08-19 19:47:22, Edward Chron wrote:
> > >> For production systems instal
On Wed, Aug 28, 2019 at 1:18 PM Qian Cai wrote:
>
> On Wed, 2019-08-28 at 12:46 -0700, Edward Chron wrote:
> > But with the caveat that running an eBPF script isn't standard Linux
> > operating procedure at this point in time anyway, and will not be well
> >
On Wed, Aug 28, 2019 at 3:12 AM Tetsuo Handa wrote:
>
> On 2019/08/28 16:08, Michal Hocko wrote:
> > On Tue 27-08-19 19:47:22, Edward Chron wrote:
> >> For production systems, installing and updating eBPF scripts may someday
> >> be very common, but I wonder how data
On Tue, Aug 27, 2019 at 6:32 PM Qian Cai wrote:
>
>
>
> > On Aug 27, 2019, at 9:13 PM, Edward Chron wrote:
> >
> > On Tue, Aug 27, 2019 at 5:50 PM Qian Cai wrote:
> >>
> >>
> >>
> >>> On Aug 27, 2019, at 8:23 PM, Edward Chron w
On Tue, Aug 27, 2019 at 5:50 PM Qian Cai wrote:
>
>
>
> > On Aug 27, 2019, at 8:23 PM, Edward Chron wrote:
> >
> >
> >
> > On Tue, Aug 27, 2019 at 5:40 AM Qian Cai wrote:
> > On Mon, 2019-08-26 at 12:36 -0700, Edward Chron wrote:
> > >
On Tue, Aug 27, 2019 at 12:15 AM Michal Hocko wrote:
>
> On Mon 26-08-19 12:36:28, Edward Chron wrote:
> [...]
> > Extensibility using OOM debug options
> > -
> > What is needed is an extensible system to optionally configure
> >
ample Output:
-
Sample Tasks Summary message output:
Aug 13 18:52:48 yoursystem kernel: Threads: 492 Processes: 248
forks_since_boot: 7786 procs_runable: 4 procs_iowait: 0
Signed-off-by: Edward Chron
---
mm/Kconfig.debug | 16
mm/oom_kill_debug.c
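The figures in a summary line like the one above plausibly come from the same counters that back /proc/loadavg and /proc/stat; whether the posted patch uses exactly these sources is not visible in this excerpt, so the following is only a hedged sketch:

/*
 * Hedged sketch: scheduler/fork counters exported via
 * <linux/sched/stat.h> can produce a one-line tasks summary.  The
 * posted patch's actual implementation is not shown in the preview.
 */
#include <linux/sched/stat.h>
#include <linux/printk.h>

static void oom_tasks_summary(void)
{
        pr_info("Threads: %d Processes: %d forks_since_boot: %lu procs_runable: %lu procs_iowait: %lu\n",
                nr_threads, nr_processes(), total_forks,
                nr_running(), nr_iowait());
}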
alObj
ObjSize AlignSize Objs/Slab Pgs/Slab ActiveSlab TotalSlab Slab_Name
Aug 13 18:52:47 mysrvr kernel: 403412 1613 1648
224 256161103103 skbuff_head..
Signed-off-by: Edward Chron
---
mm/Kconfig.debug | 15 ++
output (minsize = 0.1% of totalpages):
Aug 13 20:16:30 yourserver kernel: Summary: OOM Tasks considered:245
printed:33 minimum size:32576kB total-pages:32579084kB
Signed-off-by: Edward Chron
---
include/linux/oom.h | 1 +
mm/Kconfig.debug | 34 ++
mm/oom_ki
Aug 6 09:37:21 egc103 kernel: [ 7707]7553 10383 10383
7707 S 0.132 0.350 1056804 1054040 1052796
2092 0 0 1944 684 1052860
136 4 0 0 0 0
0 1000 oomprocs
Signed-off-by: E
print line output:
Jul 22 20:16:09 yoursystem kernel: Vmalloc size=2625536 pages=640
caller=__do_sys_swapon+0x78e/0x1130
Sample summary print line output:
Jul 22 19:03:26 yoursystem kernel: Summary: Vmalloc entries examined:1070
printed:989 minsize:0kB
Signed-off-by: Edward Chron
---
in
: 368 entries: 6 lastFlush: 1720s
hGrows: 0 allocs: 7 destroys: 1 lookups: 0 hits: 0
resFailed: 0 gcRuns/Forced: 110 / 0 tblFull: 0 proxyQlen: 0
Signed-off-by: Edward Chron
Cc: "David S. Miller"
Cc: net...@vger.kernel.org
---
include/net/neighbour.h |
s set to enabled.
Sample Output
-
There is no change to the standard OOM output with this option other than
that the standard Linux OOM report Unreclaimable info is output for every OOM
Event, not just OOM Events where slab usage exceeds user process memory
usage.
Signed-off-by: Edward
23 23:26:34 yoursystem kernel: Slabs Total: 151212kB Reclaim: 50632kB
Unreclaim: 100580kB
Signed-off-by: Edward Chron
---
mm/Kconfig.debug | 30 +
mm/oom_kill.c | 11 +++-
mm/oom_kill_debug.c | 42 +
mm/oom_kill_debug.h | 4 +++
m
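Totals like "Slabs Total / Reclaim / Unreclaim" are presumably derived from the global vmstat slab counters; a hedged sketch using the 5.3-era counter names follows (the preview does not show the patch's actual code, so this is an assumption about the data source):

/*
 * Hedged sketch: reclaimable/unreclaimable slab pages from the global
 * node vmstat counters, converted to kB.  NR_SLAB_* are the 5.3-era
 * item names; the posted patch's exact code is not reproduced here.
 */
#include <linux/vmstat.h>
#include <linux/mm.h>
#include <linux/printk.h>

#define K(x) ((x) << (PAGE_SHIFT - 10))

static void oom_slab_summary(void)
{
        unsigned long reclaimable = global_node_page_state(NR_SLAB_RECLAIMABLE);
        unsigned long unreclaimable = global_node_page_state(NR_SLAB_UNRECLAIMABLE);

        pr_info("Slabs Total: %lukB Reclaim: %lukB Unreclaim: %lukB\n",
                K(reclaimable + unreclaimable), K(reclaimable), K(unreclaimable));
}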
ize value in the appropriate tenthpercent file
as needed.
---------
Edward Chron (10):
mm/oom_debug: Add Debug base code
mm/oom_debug: Add System State Summary
mm/oom_debug: Add Tasks Summary
mm/oom_debug: Add ARP and ND Table Summary usage
mm/oom_debug: A
and enabled. By default each configured Select Print OOM debug option
has a default print limiting minimum entry size of 10 or 1% of memory.
-
Signed-off-by: Edward Chron
---
mm/Kconfig.debug | 17 +++
mm/Makefile
mmary message:
Jul 27 10:56:46 yoursystem kernel: System Uptime:0 days 00:17:27
CPUs:4 Machine:x86_64 Node:yoursystem Domain:localdomain
Kernel Release:5.3.0-rc2+ Version: #49 SMP Mon Jul 27 10:35:32 PDT 2019
Signed-off-by: Edward Chron
---
mm/Kconfig.debug | 15 +
mm/oom_kill_debug.c
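Most of the fields in that System State Summary are available from utsname and the online CPU count; the sketch below is hedged (uptime formatting omitted, and the posted patch's exact implementation is not visible in this excerpt):

/*
 * Hedged sketch: host identification for an OOM-time summary from
 * init_utsname() and the online CPU count.  Uptime handling is left
 * out; the posted patch's actual code is not shown above.
 */
#include <linux/utsname.h>
#include <linux/cpumask.h>
#include <linux/printk.h>

static void oom_system_summary(void)
{
        struct new_utsname *uts = init_utsname();

        pr_info("CPUs:%u Machine:%s Node:%s Domain:%s Kernel Release:%s Version: %s\n",
                num_online_cpus(), uts->machine, uts->nodename,
                uts->domainname, uts->release, uts->version);
}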
process
was correctly targeted by OOM due to the misconfiguration. This can
be quite helpful for triage and problem determination.
The addition of the pgtables_bytes shows page table usage by the
process and is a useful measure of the memory size of the process.
Signed-off-by: Edward Chron
Acked-by
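In __oom_kill_process() the two additional fields amount to extending the existing "Killed process" pr_err(); the following is a sketch of the idea factored as a helper, not the exact posted diff, with mm_pgtables_bytes() and signal->oom_score_adj assumed to be the sources of the added values:

/*
 * Hedged sketch: an extended "Killed process" line with pgtables and
 * oom_score_adj.  Not the exact posted diff.  victim->mm is assumed
 * valid here; the real OOM path pins it before printing.
 */
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/sched/signal.h>
#include <linux/printk.h>

#define K(x) ((x) << (PAGE_SHIFT - 10))

static void oom_print_killed(const char *message, struct task_struct *victim)
{
        struct mm_struct *mm = victim->mm;

        pr_err("%s: Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB, shmem-rss:%lukB, pgtables:%lukB oom_score_adj:%hd\n",
               message, task_pid_nr(victim), victim->comm,
               K(mm->total_vm),
               K(get_mm_counter(mm, MM_ANONPAGES)),
               K(get_mm_counter(mm, MM_FILEPAGES)),
               K(get_mm_counter(mm, MM_SHMEMPAGES)),
               mm_pgtables_bytes(mm) >> 10,
               victim->signal->oom_score_adj);
}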
On Wed, Aug 21, 2019 at 12:19 AM David Rientjes wrote:
>
> On Wed, 21 Aug 2019, Michal Hocko wrote:
>
> > > vm.oom_dump_tasks is pretty useful, however, so it's curious why you
> > > haven't left it enabled :/
> >
> > Because it generates a lot of output potentially. Think of a workload
> > with t
On Thu, Aug 22, 2019 at 12:09 AM Michal Hocko wrote:
>
> On Wed 21-08-19 15:25:13, Edward Chron wrote:
> > On Tue, Aug 20, 2019 at 8:25 PM David Rientjes wrote:
> > >
> > > On Tue, 20 Aug 2019, Edward Chron wrote:
> > >
> > > > For an OOM
On Thu, Aug 22, 2019 at 12:21 AM Michal Hocko wrote:
>
> On Wed 21-08-19 16:12:08, Edward Chron wrote:
> [...]
> > Additionally (which you know, but mentioning for reference) the OOM
> > output used to look like this:
> >
> > Nov 14 15:23:48 oldserver kernel: [3
On Thu, Aug 22, 2019 at 12:15 AM Michal Hocko wrote:
>
> On Wed 21-08-19 15:22:07, Edward Chron wrote:
> > On Wed, Aug 21, 2019 at 12:19 AM David Rientjes wrote:
> > >
> > > On Wed, 21 Aug 2019, Michal Hocko wrote:
> > >
> > > > > vm.oom_dum
certainly reassuring.
My understanding now is that printing the oom_score is discouraged.
This seems unfortunate. The oom_score_adj can be adjusted
appropriately if oom_score is known.
So it would be useful to have both.
But at least if oom_score_adj is printed you can confirm the value at
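Both values are readable per process from procfs while the process still exists, which is exactly the limitation being discussed; a small hypothetical userspace helper (not from the thread) shows the confirmation step:

/*
 * Hedged userspace sketch, not from the thread: read a live process's
 * oom_score and oom_score_adj from /proc/<pid>/.  Only possible while
 * the process still exists, hence the value of logging them at kill time.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>

static long read_proc_value(pid_t pid, const char *name)
{
        char path[64];
        long val = -1;
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%d/%s", (int)pid, name);
        f = fopen(path, "r");
        if (!f)
                return -1;
        if (fscanf(f, "%ld", &val) != 1)
                val = -1;
        fclose(f);
        return val;
}

int main(int argc, char **argv)
{
        pid_t pid = argc > 1 ? (pid_t)atoi(argv[1]) : getpid();

        printf("pid %d oom_score=%ld oom_score_adj=%ld\n", (int)pid,
               read_proc_value(pid, "oom_score"),
               read_proc_value(pid, "oom_score_adj"));
        return 0;
}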
On Tue, Aug 20, 2019 at 8:25 PM David Rientjes wrote:
>
> On Tue, 20 Aug 2019, Edward Chron wrote:
>
> > For an OOM event: print oom_score_adj value for the OOM Killed process to
> > document what the oom score adjust value was at the time the process was
> > OOM Kille
enabled.
I don't see why that is a big deal.
It is very useful to have all the information that is there.
Wouldn't mind having pgtables too, but we would be able to get
that from the output of dump_task if that is enabled.
If it is acceptable to also add the dump_task for the killed process
for !sysctl_oom_dump_tasks I can repost the patch including that as
well.
Thank-you,
Edward Chron
Arista Networks
e and if you prefer a fresh submission, let me know and
I'll do that.
Thank-you for reviewing this patch.
-Edward Chron
Arista Networks
On Tue, Aug 20, 2019 at 8:25 PM David Rientjes wrote:
>
> On Tue, 20 Aug 2019, Edward Chron wrote:
>
> > For an OOM event: print oom_sc
targeted by OOM due to the misconfiguration. Having
the oom_score_adj on the Killed message ensures that it is documented.
Signed-off-by: Edward Chron
Acked-by: Michal Hocko
---
mm/oom_kill.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index
On Mon, Aug 12, 2019 at 4:42 AM Michal Hocko wrote:
>
> On Fri 09-08-19 15:15:18, Edward Chron wrote:
> [...]
> > So it is optimal if you only have to go and find the correct log and search
> > or run your script(s) when you absolutely need to, not on every OOM even
output:
Aug 14 23:00:02 testserver kernel: Out of memory: Killed process 2692
(oomprocs) total-vm:1056800kB, anon-rss:1052760kB, file-rss:4kB,
shmem-rss:0kB oom_score_adj:1000
Signed-off-by: Edward Chron
---
mm/oom_kill.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a
Sorry about top posting, responses inline.
On Thu, Aug 8, 2019 at 11:40 PM Michal Hocko wrote:
>
> [Again, please do not top post - it makes a mess of any longer
> discussion]
>
> On Thu 08-08-19 15:15:12, Edward Chron wrote:
> > In our experience far more (99.9%+) OOM
PM Michal Hocko wrote:
>
> [please do not top-post]
>
> On Thu 08-08-19 12:21:30, Edward Chron wrote:
> > It is helpful to the admin that looks at the kill message and records this
> > information. OOMs can come in bunches.
> > Knowing how much resource the oom selecte
output:
Jul 21 20:07:48 yoursystem kernel: Out of memory: Killed process 2826
(processname) total-vm:1056800kB, anon-rss:1052784kB, file-rss:4kB,
shmem-rss:0kB memory-usage:3.2% oom_score:1032 oom_score_adj:1000
total-pages: 32791748kB
Signed-off-by: Edward Chron
---
fs/proc/base.c | 2