We have a 9-node cluster (16 x 8TB OSDs per node) running Jewel on CentOS 7.4.  The 
OSDs are configured with encryption.  The cluster is accessed via two RGWs 
and there are 3 mon servers.  The data pool is using 6+3 erasure coding.
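
For reference, the layout above can be confirmed with the standard Ceph CLI 
(the profile name in the last command is a placeholder for whatever the data 
pool actually uses):

    ceph -s                                       # overall cluster status
    ceph osd tree                                 # 9 hosts, 16 OSDs each
    ceph osd erasure-code-profile ls              # list EC profiles
    ceph osd erasure-code-profile get <profile>   # placeholder name; should report k=6 m=3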

About 2 weeks ago I found two of the nine servers wedged and had to hard power 
cycle them to get them back.  After the hard reboot, 22 OSDs came back with 
either corrupted encryption or corrupted data partitions.  These OSDs were 
removed and recreated, and the resulting rebalance moved along just fine for 
about a week.  At the end of that week two different nodes became 
unresponsive, complaining of page allocation failures.  This is when I 
realized the nodes were heavily into swap.  These nodes had been configured 
with 64GB of RAM as a cost-saving measure, going against the 1GB-per-1TB 
recommendation.  We have since doubled the RAM in each node, giving each of 
them more than the 1GB-per-1TB ratio.
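
(For reference, 16 OSDs x 8TB is roughly 128TB of raw disk per node, so the 
1GB-per-1TB guideline works out to roughly 128GB of RAM per node.)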

The issue I am running into is that these nodes are still swapping heavily 
and, over time, becoming unresponsive or throwing page allocation failures.  
As an example, "free" will show 15GB of RAM in use (out of 128GB) and 32GB of 
swap.  I have set vm.swappiness to 0 and also turned vm.min_free_kbytes up to 
4GB to try to keep the kernel happy, yet I am still filling up swap.  This 
only happens when the OSD partitions are mounted and the ceph-osd daemons are 
running.
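
For what it's worth, the settings above were applied along these lines (the 
sysctl.d file name is just an example; 4194304 is 4GB expressed in kB):

    # /etc/sysctl.d/90-ceph-memory.conf (example file name)
    vm.swappiness = 0
    vm.min_free_kbytes = 4194304

    # apply immediately, without a reboot
    sysctl -w vm.swappiness=0
    sysctl -w vm.min_free_kbytes=4194304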

Anyone have an idea where this swap usage might be coming from? 
Thanks for any insight,

Sam Liston (sam.lis...@utah.edu)
====================================
Center for High Performance Computing
155 S. 1452 E. Rm 405
Salt Lake City, Utah 84112 (801)232-6932
====================================



