Alright. Thanks a lot for that information!

2017-02-06 14:35 GMT+01:00 Avi Kivity <a...@scylladb.com>:

> It is a bug.  In some contexts, the kernel needs to be able to reclaim
> memory instantly, but this is not one of them.  Here, the java process is
> creating a new thread, and the kernel is allocating 16kB for its kernel
> stack; that is a regular allocation, not atomic. If you decode the gfp_mask
> value you'll see that the kernel is allowed to initiate I/O and perform
> filesystem operations to satisfy the allocation, which it apparently did
> not.
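
For anyone who wants to decode a gfp_mask themselves, here is a rough sketch. The bit values are my assumption, taken from the 4.4-era include/linux/gfp.h; they change between kernel versions, so please verify them against the source of the kernel you are running. The helper names and the example mask are made up for illustration and are not copied from any log.

```python
# Assumed bit values from the 4.4-era include/linux/gfp.h (subset only).
# These differ between kernel versions; verify before relying on them.
GFP_BITS = {
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x80000: "__GFP_ATOMIC",
    0x400000: "__GFP_DIRECT_RECLAIM",
    0x2000000: "__GFP_KSWAPD_RECLAIM",
}


def decode_gfp(mask):
    """Return (names of known flags set in mask, leftover unknown bits)."""
    names = [name for bit, name in sorted(GFP_BITS.items()) if mask & bit]
    known = 0
    for bit in GFP_BITS:
        known |= bit
    return names, mask & ~known


def may_block_for_reclaim(mask):
    """True if direct reclaim, I/O and filesystem calls are all permitted."""
    needed = 0x400000 | 0x40 | 0x80
    return (mask & needed) == needed


# Hypothetical mask assembled from the table above, not a real log value.
example = 0x40 | 0x80 | 0x400000 | 0x2000000
print(decode_gfp(example))
print(may_block_for_reclaim(example))
```

A mask with __GFP_IO, __GFP_FS and __GFP_DIRECT_RECLAIM set describes an allocation that is allowed to wait for reclaim, which is the distinction from an atomic allocation made above.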
>
>
> I do recommend reporting it, it will help others avoid encountering the
> same problem if it gets fixed.
>
> On 02/06/2017 03:07 PM, Benjamin Roth wrote:
>
> Thanks for the reply. We got rid of the OOMs by increasing
> vm.min_free_kbytes; its default of approx. 90 MB is maybe a bit low for
> systems with 128 GB.
> I guess the OOM happens because the kernel could not reclaim enough paged
> memory instantly.
> I can't tell whether this is really a kernel bug or not. That was also my
> first thought, but in the end the main thing is that it works again with a
> higher min_free_kbytes.
>
> 2017-02-06 11:53 GMT+01:00 Avi Kivity <a...@scylladb.com>:
>
>>
>> On 01/26/2017 07:36 AM, Benjamin Roth wrote:
>>
>> Hi there,
>>
>> We installed 2 new nodes recently. They run on Ubuntu (Ubuntu 16.04.1
>> LTS) with kernel 4.4.0-59-generic. On these nodes (and only on these) CS
>> gets killed by the kernel due to OOM. It seems very strange to me because
>> CS only takes roughly 20GB (out of 128GB); most of the RAM is allocated to
>> page cache.
>>
>> Top looks typically like this:
>> KiB Mem : 13191691+total,  1974964 free, 20278184 used, 10966376+buff/cache
>> KiB Swap:        0 total,        0 free,        0 used. 11051503+avail Mem
>>
>> This is what kern.log says:
>> https://gist.github.com/brstgt/0f1aa6afb558a56d1cadce958db46cf9
>>
>> Has anyone encountered something like this before?
>>
>>
>> 2017-01-26T03:10:45.679458+00:00 cas10 kernel: [52226.449989] Node 0 Normal: 33850*4kB (UMEH) 8*8kB (UMH) 1*16kB (H) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 135480kB
>> 2017-01-26T03:10:45.679460+00:00 cas10 kernel: [52226.449995] Node 1 Normal: 34213*4kB (UME) 176*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 138260kB
>>
>>
>> There is plenty of free memory left (33850+34213)*4kB = 270 MB, but it is
>> fragmented into 4k and 8k blocks, while the kernel is trying to allocate
>> 16kB.  Still, the kernel could have evicted some page cache or swapped out
>> anonymous memory.  You should report this to lkml, it is a kernel bug.
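
To make the fragmentation argument above concrete, here is a small sketch that parses the two free-list lines quoted from kern.log and adds up how much free memory sits in blocks of at least 16kB. The function names are my own, and the regex assumes the "count*sizekB" layout seen in those two lines.

```python
import re


def parse_free_list(line):
    """Map block size in kB to the number of free blocks of that size."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def free_kb(blocks, min_block_kb=0):
    """Total free kB held in blocks of at least min_block_kb."""
    return sum(size * count
               for size, count in blocks.items()
               if size >= min_block_kb)


node0 = ("Node 0 Normal: 33850*4kB (UMEH) 8*8kB (UMH) 1*16kB (H) 0*32kB "
         "0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB "
         "= 135480kB")
node1 = ("Node 1 Normal: 34213*4kB (UME) 176*8kB (UME) 0*16kB 0*32kB "
         "0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB "
         "= 138260kB")

for name, line in (("node0", node0), ("node1", node1)):
    blocks = parse_free_list(line)
    # Total free memory versus memory usable for a 16kB (order-2) request.
    print(name, free_kb(blocks), free_kb(blocks, min_block_kb=16))
```

The totals match the figures the kernel printed (135480kB and 138260kB), while only a single 16kB block on node 0 and none on node 1 are large enough for the kernel stack allocation.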
>>
>>
>>
>> --
>> Benjamin Roth
>> Prokurist
>>
>> Jaumo GmbH · www.jaumo.com
>> Wehrstraße 46 · 73035 Göppingen · Germany
>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>
>>
>>
>
>
>


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
