On Tue, 10 Apr 2018 12:49:02 +0200 Michal Hocko <mho...@kernel.org> wrote:
> But you do realize that what you are proposing is by no means any safer,
> don't you? The memory allocated for the ring buffer is _not_ accounted
> to any process and as such it is not considered by the oom killer when
> picking up an oom victim so you are quite likely to pick up an innocent
> process to be killed. So basically you are risking an allocation runaway
> completely hidden from the OOM killer. Now, the downside of the patch is
> that the OOM_SCORE_ADJ_MIN task might get killed which is something that
> shouldn't happen because it is a contract. I would call this an
> unsolvable problem and an inherently broken design of the oom disabled
> task. So far I haven't heard a single _argument_ why supporting such a
> weird cornercase is desirable when your application can trivially do
>
>   fork(); set_oom_score_adj(); exec("echo $VAR > $RINGBUFFER_FILE")

We could do this as a compromise:

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index c9cb9767d49b..40c2e0a56c51 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1185,6 +1185,12 @@ static int __rb_allocate_pages(long nr_pages, struct list_head *pages, int cpu)
 	 */
 	mflags = GFP_KERNEL | __GFP_RETRY_MAYFAIL;
 
+	/* If we can't OOM this task, then only allocate without reclaim */
+	if (unlikely(current->signal->oom_score_adj == OOM_SCORE_ADJ_MIN)) {
+		mflags = GFP_KERNEL | __GFP_NORETRY;
+		user_thread = false;	/* do not set oom_origin */
+	}
+
 	/*
 	 * If a user thread allocates too much, and si_mem_available()
 	 * reports there's enough memory, even though there is not.

This way, if one sets OOM_SCORE_ADJ_MIN, then we won't set oom_origin for
the task, but we also won't try hard to allocate memory if there is
nothing immediately available.

-- Steve
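
[Editorial addition: a minimal userspace sketch of the workaround Michal
describes in the quoted text above. It assumes the OOM-protected parent
forks a child, the child resets its own oom_score_adj via
/proc/self/oom_score_adj, and only then performs the allocation-heavy
write. The size value and the /sys/kernel/tracing/buffer_size_kb path are
illustrative placeholders, not part of the original mail, and error
handling is kept minimal.]

/*
 * Hypothetical illustration only: do the ring buffer resize from a
 * forked child that has been made a normal OOM candidate again, so the
 * parent's OOM_SCORE_ADJ_MIN contract is never touched.
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	pid_t pid = fork();

	if (pid < 0) {
		perror("fork");
		return 1;
	}

	if (pid == 0) {
		/* Child: make ourselves a normal OOM candidate again. */
		FILE *f = fopen("/proc/self/oom_score_adj", "w");

		if (!f || fprintf(f, "0\n") < 0 || fclose(f) != 0)
			_exit(1);

		/* Resize the ring buffer from the now-killable child. */
		execl("/bin/sh", "sh", "-c",
		      "echo 8192 > /sys/kernel/tracing/buffer_size_kb",
		      (char *)NULL);
		_exit(1);	/* only reached if execl() fails */
	}

	/* Parent (running with oom_score_adj == -1000) just waits. */
	waitpid(pid, NULL, 0);
	return 0;
}

Doing the write from the child means the allocation is triggered by a
task the OOM killer is allowed to pick, while the parent keeps its
OOM_SCORE_ADJ_MIN protection.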