>> I am just curious... can you reproduce the problem reliably? If yes, can you
>> try the patch below? Just in case, this is not the real fix in any case...
>
> Yes. It deterministically results in hung processes in vanilla kernel.
> I'll try this patch.
I'll have to correct this. I can repr
> Well, but we can't do this. And "as expected" is actually just wrong. I still
> think that the whole FAULT_FLAG_USER logic is not right. This needs another
> email.
I meant as expected from the content of the patch :) I think
Konstantin agrees that this patch cannot be merged upstream.
> fork(
> Yep. Bug still not fixed in upstream. In our kernel I've plugged it with
> this:
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2808,8 +2808,9 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
> balance_callback(rq);
> preempt_enable();
>
> -
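For context around that hunk: schedule_tail() is also, as far as I can see, where the kernel performs the CLONE_CHILD_SETTID write (a put_user() of the new child's TID), which is exactly the write glibc's fork() depends on for the assertion quoted further down. A minimal userspace sketch of that contract, assuming nothing beyond what clone(2) documents; child_fn, child_tid, the stack size and the flags are all illustrative choices of mine, not anything from the quoted patch:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static pid_t child_tid;   /* the kernel stores the new child's TID here (in the child's memory) */

    static int child_fn(void *arg)
    {
        (void)arg;
        /* By the time the child runs user code, the CLONE_CHILD_SETTID store
         * has already been done on its behalf, so the value is visible here. */
        printf("child sees tid %d\n", (int)child_tid);
        return 0;
    }

    int main(void)
    {
        const size_t stack_size = 1024 * 1024;
        char *stack = malloc(stack_size);
        pid_t pid;

        if (!stack)
            return 1;
        /* parent_tid and tls are unused (NULL); &child_tid is the
         * CLONE_CHILD_SETTID target. */
        pid = clone(child_fn, stack + stack_size, CLONE_CHILD_SETTID | SIGCHLD,
                    NULL, NULL, NULL, &child_tid);
        if (pid < 0)
            return 1;
        waitpid(pid, NULL, 0);
        free(stack);
        return 0;
    }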
>> With strace, when running 500 concurrent mem-hog tasks on the same
>> kernel, 33 of them failed with:
>>
>> strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
>> `THREAD_GETMEM (self, tid) != ppid' failed.
>>
>> Which is: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
>> And discu
>> Could you post the stack trace of the hung oom victim? Also could you
>> post the full kernel log?
With strace, when running 500 concurrent mem-hog tasks on the same
kernel, 33 of them failed with:
strace: ../sysdeps/nptl/fork.c:136: __libc_fork: Assertion
`THREAD_GETMEM (self, tid) != ppid' failed.
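Roughly, the assertion checks that after fork() the child's TID, as cached in its TCB by the CLONE_CHILD_SETTID write, differs from the parent's PID; if the kernel never performed that write, the child still sees its parent's ID. A standalone analogue of the check, as a sketch of my own rather than glibc code (it re-reads the TID with gettid instead of the TCB cache, so on a healthy kernel the assert never fires):

    #define _GNU_SOURCE
    #include <assert.h>
    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t ppid = getpid();   /* parent's PID; the child inherits this value in its copy of memory */
        pid_t pid = fork();

        if (pid == 0) {
            /* Ask the kernel for our TID directly; glibc instead reads the
             * value cached in the TCB, which the CLONE_CHILD_SETTID write
             * is supposed to refresh for the new child. */
            pid_t tid = (pid_t)syscall(SYS_gettid);
            assert(tid != ppid);   /* analogue of THREAD_GETMEM (self, tid) != ppid */
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        return 0;
    }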
>
> Could you post the stack trace of the hung oom victim? Also could you
> post the full kernel log?
Here is the stack of the process that lives (it is *not* the
oom-victim) in a run with 100 processes and *without* strace:
# cat /proc/7688/stack
[] futex_wait_queue_me+0xc2/0x120
[] futex_wait+0
I came across the following issue in kernel 3.16 (Ubuntu 14.04), which
was then reproduced in kernel 4.4 LTS:
After a couple of memcg oom-kills in a cgroup, a syscall in
*another* process in the same cgroup hangs indefinitely.
Reproducing:
# mkdir -p strace_run
# mkdir /sys/fs/cgroup/memory/1
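A minimal mem-hog of the kind used in these runs could look like the sketch below. This is illustrative only, not the original reproducer: NPROC and CHUNK are arbitrary, and it assumes a memory limit has already been set on the cgroup the processes are placed in (e.g. via memory.limit_in_bytes of the cgroup created above).

    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define NPROC 100            /* number of concurrent mem-hogs (arbitrary) */
    #define CHUNK (1 << 20)      /* allocate in 1 MiB steps */

    static void hog(void)
    {
        for (;;) {
            char *p = malloc(CHUNK);
            if (!p)
                _exit(1);
            memset(p, 0xaa, CHUNK);   /* fault the pages in so the memcg actually charges them */
        }
    }

    int main(void)
    {
        for (int i = 0; i < NPROC; i++) {
            pid_t pid = fork();
            if (pid == 0)
                hog();           /* child: allocates until the memcg oom-killer removes it */
        }
        while (wait(NULL) > 0)   /* parent: reap children as they get killed */
            ;
        return 0;
    }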
On Sun, Nov 1, 2015 at 12:25 PM, Richard Weinberger wrote:
> On Sat, Oct 24, 2015 at 11:54 PM, Shayan Pooya wrote:
>> I noticed the following core_pattern behavior in my linux box while
>> running docker containers. I am not sure if it is a bug, but it is
>> inconsistent and not documented.
I noticed the following core_pattern behavior in my linux box while
running docker containers. I am not sure if it is a bug, but it is
inconsistent and not documented.
If the core_pattern is set on the host, the containers will observe
and use the pattern for dumping cores (there is no per cgroup
core_pattern).
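A trivial way to see that behaviour, as an illustrative sketch (nothing container-specific is assumed): print the pattern the kernel will use for core dumps. Run on the host and inside a container it reports the same value, since core_pattern is a single system-wide knob rather than a per-cgroup one.

    #include <stdio.h>

    int main(void)
    {
        char pattern[4096];
        FILE *f = fopen("/proc/sys/kernel/core_pattern", "r");

        if (!f) {
            perror("fopen");
            return 1;
        }
        if (fgets(pattern, sizeof(pattern), f))
            printf("core_pattern: %s", pattern);   /* the value already ends in a newline */
        fclose(f);
        return 0;
    }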
> Pretty good write up that, sad you did not Cc the guy.
>
> I got defeated by the github web shite (again!) and could not locate an
> email address for him :( Ah.. Google to the rescue!
>
Thanks Peter, actually he *did* submit the patch:
https://lkml.org/lkml/2014/10/24/456
Therefore, I think com
From 64a24d04c6510dcc144aba123fb21ed6f895c6b7 Mon Sep 17 00:00:00 2001
From: Shayan Pooya
Date: Mon, 14 Sep 2015 21:25:09 -0700
Subject: [PATCH] sched/fair: adjust the depth of a sched_entity when its parent changes
Fixes commit fed14d45f945 ("sched/fair: Track cgroup depth")
Hi
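For readers skimming the archive, a tiny userspace model of the bookkeeping the subject line refers to. This is a sketch of mine, not the kernel hunk itself: the field names mirror the kernel's struct sched_entity and update_depth is a made-up helper, but the idea is the same, that when an entity gets a new parent its cached depth has to be recomputed from that parent, or code that relies on matching depths when walking the hierarchy can go wrong.

    #include <stddef.h>

    /* Userspace model only: mirrors just the parent/depth fields. */
    struct sched_entity {
        struct sched_entity *parent;
        int depth;
    };

    /* Recompute the cached depth from the (possibly new) parent:
     * 0 at the root, parent's depth + 1 otherwise. */
    static void update_depth(struct sched_entity *se)
    {
        se->depth = se->parent ? se->parent->depth + 1 : 0;
    }

    int main(void)
    {
        struct sched_entity root = { .parent = NULL, .depth = 0 };
        struct sched_entity child = { .parent = &root, .depth = 0 };

        /* Re-parenting (e.g. moving a task to another cgroup) must be
         * followed by this recomputation, or the cached depth goes stale. */
        update_depth(&child);
        return child.depth == 1 ? 0 : 1;
    }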