Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-28 Thread Craig Chi
Hi Brad, We fully understand that the hardware we currently use is below Ceph's recommendation, so we are looking for a way to lower or restrict the resources needed by the OSDs. Losing some performance is definitely acceptable for us. The reason why we did these experiments and discussed the causes is th

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-28 Thread Craig Chi
Hi guys, Thanks to both of you for your suggestions; we have made some progress on this issue. I set vm.min_free_kbytes to 16GB and raised vm.vfs_cache_pressure to 200, and I did observe that the OS keeps releasing cache while the OSDs want more and more memory. OK. Now we are going to reproduce the han
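
The two settings mentioned above could be written out roughly as follows. This is only a sketch: it assumes "16GB" means 16 GiB, and the helper name gib_to_kbytes is made up for illustration. vm.min_free_kbytes takes its value in kilobytes, so the figure has to be converted first.

```python
# Sketch: build sysctl.conf-style lines for the two VM settings in the message.
# Assumption: "16GB" is read as 16 GiB; vm.min_free_kbytes is given in kilobytes.

def gib_to_kbytes(gib: int) -> int:
    """Convert GiB to kilobytes (1 GiB = 1024 * 1024 kB). Hypothetical helper."""
    return gib * 1024 * 1024

settings = {
    "vm.min_free_kbytes": gib_to_kbytes(16),
    "vm.vfs_cache_pressure": 200,
}

# One "key = value" line per setting, in the format sysctl.conf accepts.
sysctl_lines = [f"{key} = {value}" for key, value in settings.items()]
print("\n".join(sysctl_lines))
```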

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-25 Thread Nick Fisk
From: Craig Chi [mailto:craig...@synology.com] Sent: 25 November 2016 01:46 To: Brad Hubbard Cc: Nick Fisk; Ceph Users Subject: Re: [ceph-users] Ceph OSDs cause kernel unresponsive Hi Nick, I have seen the report before, if I understand correctly, the osd_map_cache_size generally

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-24 Thread Brad Hubbard
barrier set on mount options. Please please please understand the consequences of this option >> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of* Craig Chi >> *Sent:* 24 November 2016 10:37 >> *To:* Nick

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-24 Thread Craig Chi
Hi Brad, Thank you for your investigation. Here are the reasons why we thought the abnormal Ceph behavior was caused by memory exhaustion. The following link points to the dmesg output from a Ceph node that barely survived: http://pastebin.com/Aa1FDd4K However, I cannot be sure that this is respo

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-24 Thread Craig Chi
Hi Nick, I have seen the report before. If I understand correctly, osd_map_cache_size generally introduces a fixed amount of memory usage. We are using the default value of 200, and a single OSD map I got from our cluster is 404KB. That totals 404KB * 200 * 90 (OSDs) = about 7GB on eac
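
The estimate in this message can be checked with a few lines of Python. This is a rough sketch using only the figures quoted above (404KB per map, a cache size of 200, 90 OSDs); it assumes 1KB = 1024 bytes, and the function name is illustrative rather than anything from Ceph.

```python
# Sketch: upper bound on memory held by cached OSD maps, from the quoted figures.
# Assumption: 1 KB = 1024 bytes; osd_map_cache_bytes is a hypothetical helper.

def osd_map_cache_bytes(map_size_kb: int, cache_size: int, osd_count: int) -> int:
    """Bytes used if every OSD keeps cache_size maps of map_size_kb each."""
    return map_size_kb * 1024 * cache_size * osd_count

total = osd_map_cache_bytes(map_size_kb=404, cache_size=200, osd_count=90)
print(f"{total / 1024**3:.2f} GiB")
```

Under these assumptions the result lands just under 7 GiB, which agrees with the "about 7GB" figure in the message.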

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-24 Thread Brad Hubbard
f this option > *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of* Craig Chi > *Sent:* 24 November 2016 10:37 > *To:* Nick Fisk > *Cc:* ceph-users@lists.ceph.com > *Subject:* Re: [ceph-users] Ceph OSDs cause kernel unresponsiv

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-24 Thread Nick Fisk
10:37 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph OSDs cause kernel unresponsive Hi Nick, Thank you for your helpful information. I knew that Ceph recommends 1GB/1TB RAM, but we are not going to change the hardware architecture now. Are there any methods

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-24 Thread Craig Chi
Hi Nick, Thank you for your helpful information. I know that Ceph recommends 1GB of RAM per 1TB of storage, but we are not going to change the hardware architecture now. Is there any way to limit the resources one OSD can consume? As for your question, our current system configuration is: vm.swapp

Re: [ceph-users] Ceph OSDs cause kernel unresponsive

2016-11-24 Thread Nick Fisk
Hi Craig, From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Craig Chi Sent: 24 November 2016 08:34 To: ceph-users@lists.ceph.com Subject: [ceph-users] Ceph OSDs cause kernel unresponsive Hi Cephers, We have encountered a kernel hang issue on our Ceph cluster. Just