Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-24 Thread Blair Bethwaite
On 25 January 2018 at 04:53, Warren Wang wrote: > The other thing I can think of is if you have OSDs locking up and getting > corrupted, there is a severe XFS bug where the kernel will throw a NULL > pointer dereference under heavy memory pressure. Again, it's due to memory > issues, but you wi

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-24 Thread Nick Fisk
> Forgot to mention another hint. If kswapd is constantly using CPU, and your > sar -r ALL and sar -B stats look like it

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-24 Thread Warren Wang
Forgot to mention another hint. If kswapd is constantly using CPU, and your sar -r ALL and sar -B stats look like it's thrashing, kswapd is probably busy evicting things from memory in order to make a larger-order allocation. The other thing I can think of is if you have OSDs locking up and getti
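One way to put numbers on the kswapd hint above is to diff two snapshots of /proc/vmstat taken some seconds apart. The sketch below is only an illustration: the helper names and the sample snapshots are made up, and the exact counter names differ between kernel versions (for example, per-zone `pgscan_kswapd_*` counters on older kernels versus a single `pgscan_kswapd`), which is why it sums by prefix.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def sum_prefix(counters, prefix):
    """Sum every counter whose name starts with the given prefix."""
    return sum(v for k, v in counters.items() if k.startswith(prefix))


def activity_delta(before, after, prefixes=("pgscan_kswapd", "compact_stall")):
    """Return how much each counter group grew between two snapshots."""
    return {p: sum_prefix(after, p) - sum_prefix(before, p) for p in prefixes}


# Made-up snapshots for illustration only, not taken from the cluster in this thread.
BEFORE = """\
pgscan_kswapd_dma 0
pgscan_kswapd_normal 1000
compact_stall 5
"""
AFTER = """\
pgscan_kswapd_dma 0
pgscan_kswapd_normal 4500
compact_stall 25
"""

delta = activity_delta(parse_vmstat(BEFORE), parse_vmstat(AFTER))
print(delta)
```

On a live host the two snapshots would come from reading /proc/vmstat twice. Steadily growing kswapd scan and compaction-stall counts while plenty of memory is free would be consistent with the reclaim-for-large-allocations pattern described above.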

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-23 Thread Blair Bethwaite
+1 to Warren's advice on checking for memory fragmentation. Are you seeing kmem allocation failures in dmesg on these hosts? On 24 January 2018 at 10:44, Warren Wang wrote: > Check /proc/buddyinfo for memory fragmentation. We have some pretty severe > memory frag issues with Ceph to the point wh

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-23 Thread Warren Wang
Check /proc/buddyinfo for memory fragmentation. We have some pretty severe memory frag issues with Ceph to the point where we keep excessive min_free_kbytes configured (8GB), and are starting to order more memory than we actually need. If you have a lot of objects, you may find that you need to
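As a rough sketch of what checking /proc/buddyinfo can look like: each row lists, per NUMA node and zone, the number of free blocks at each order (order 0 is a single page, each higher order is twice as large). The helper names and the sample rows below are hypothetical; the idea is just to show how few pages may be available in large contiguous blocks even when many small pages are free.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block count for order 0, 1, 2, ...]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: Node <n>, zone <name> <count0> <count1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(c) for c in parts[4:]]
    return zones


def free_pages_at_or_above(counts, min_order):
    """Count free pages held in blocks of at least the given order."""
    return sum(count << order
               for order, count in enumerate(counts)
               if order >= min_order)


# Made-up sample for illustration only: many small free blocks, almost no large ones.
SAMPLE = """\
Node 0, zone      DMA      1      1      0      0      2      1      1      0      1      1      3
Node 0, zone   Normal  90000  12000    300     10      0      0      0      0      0      0      0
"""

zones = parse_buddyinfo(SAMPLE)
normal = zones[(0, "Normal")]
print("free pages in order >= 2 blocks:", free_pages_at_or_above(normal, 2))
print("free pages in order >= 4 blocks:", free_pages_at_or_above(normal, 4))
```

On a real host, the text would be read from /proc/buddyinfo. Columns that are zero or near zero at the higher orders, alongside large counts at order 0 and 1, are the fragmentation signature being described.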

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-23 Thread Marc Roos
to:linco...@uchicago.edu] Sent: Tuesday 23 January 2018 21:13 To: Samuel Taylor Liston; ceph-users@lists.ceph.com Subject: Re: [ceph-users] OSD servers swapping despite having free memory capacity Hi Sam, What happens if you just disable swap altogether? i.e., with `swapoff -a` --Lincoln On Tu

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-23 Thread Lincoln Bryant
Hi Sam, What happens if you just disable swap altogether? i.e., with `swapoff -a` --Lincoln On Tue, 2018-01-23 at 19:54 +, Samuel Taylor Liston wrote: > We have a 9-node cluster (16 x 8TB OSDs per node) running Jewel on CentOS > 7.4. The OSDs are configured with encryption. The cluster is > acce
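Before turning swap off, it can help to confirm how much swap is actually in use compared with free memory. The following is a minimal sketch that parses /proc/meminfo-style text; the helper names and the sample values are assumptions made up for illustration, not data from this thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:   value kB' lines into {key: int}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue
        values[key.strip()] = int(fields[0])
    return values


def swap_summary(meminfo):
    """Return swap in use and free memory, both in kB."""
    return {
        "swap_used_kb": meminfo["SwapTotal"] - meminfo["SwapFree"],
        "mem_free_kb": meminfo["MemFree"],
    }


# Made-up sample for illustration only.
SAMPLE = """\
MemTotal:       131072000 kB
MemFree:         20000000 kB
SwapTotal:        4194300 kB
SwapFree:         1048575 kB
"""

summary = swap_summary(parse_meminfo(SAMPLE))
print(summary)
```

A large `swap_used_kb` together with a large `mem_free_kb` would match the symptom in the subject line: the host is swapping even though memory appears to be free.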

[ceph-users] OSD servers swapping despite having free memory capacity

2018-01-23 Thread Samuel Taylor Liston
We have a 9-node cluster (16 x 8TB OSDs per node) running Jewel on CentOS 7.4. The OSDs are configured with encryption. The cluster is accessed via two RGWs and there are 3 mon servers. The data pool is using 6+3 erasure coding. About 2 weeks ago I found two of the nine servers wedged and had