Hi Jim,

I've got a few questions for you, as it looks like we have a similar cluster in 
our Ceph infrastructure. A quick overview of what we have: we are also running 
a small cluster of 3 storage nodes (30 OSDs in total) and 5 clients over a 
40Gbit/s InfiniBand link (IPoIB). Ever since installing the cluster (back in 
2013) we have had issues with Ceph stability. Across the upgrade cycles (we 
have applied practically every stable Ceph release, major and minor versions 
alike), stability has swung between somewhat improved and poor once again.
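
Since we're on the same interconnect, the quick sanity checks we run on the 
IPoIB side may be a useful point of comparison (ib0 is just an example 
interface name on our hosts):

    ibstat | grep -E 'State|Rate'    # InfiniBand port state and link rate
    ip -s link show ib0              # kernel counters for drops/errors on the IPoIB interface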

The main problem we had (up until the 10.2.x release) was slow requests and 
OSDs being marked down due to missed heartbeats. I gave up after spending tons 
of time trying to figure out the cause with folks on IRC; they blamed it on a 
networking issue. However, I couldn't confirm this and it doesn't seem to be 
the case. I ran about a dozen different networking tests over several months 
and none of them showed any degradation in speed, packet loss, etc. I even 
tested initiating around 1000 TCP connections per second over the course of 
months and did not see a single packet drop or unusual delay. While the network 
tests were running, the cluster was still producing slow requests and OSDs were 
still being marked down due to heartbeats. The quoted figure of 10K+ per year 
for support is not an option for us, so we ended up biting the bullet.
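
For what it's worth, the tests were along these lines (the hostnames and exact 
flags here are illustrative rather than our literal invocations):

    # sustained throughput between a client and a storage node
    iperf3 -s                           # on the storage node
    iperf3 -c ceph-osd01 -t 600 -P 4    # from a client: 10 minutes, 4 parallel streams

    # long-running latency/loss check (summary in the last two lines)
    ping -c 100000 -i 0.2 ceph-osd01 | tail -2

    # meanwhile, watch the cluster log for ceph's own view of the problem
    ceph -w | grep -i 'slow request'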

After the recent upgrade to the 10.2.x branch, we started to face an additional 
issue: OSDs either crashing or being killed (presumably by the kernel OOM 
killer) due to lack of memory. My guess is memory leaks. At this point I think 
we are approaching the limit of our suffering with Ceph and are investigating 
an alternative solution, as Ceph has proved unstable for us and, unfortunately, 
community support has not helped to resolve our problems over a four-year 
period.
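
For reference, the checks we have been using to watch OSD memory look 
something like this (osd.0 is just an example id, and the heap commands assume 
the OSDs are built with tcmalloc):

    ps -o pid,rss,vsz,cmd -C ceph-osd    # resident/virtual memory of each OSD daemon
    ceph tell osd.0 heap stats           # tcmalloc heap statistics for one OSD
    ceph tell osd.0 heap release         # ask tcmalloc to hand freed memory back to the OS
    dmesg | grep -i 'killed process'     # confirm the kernel OOM killer was involved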

I was hoping to get some insight into your setup and configuration, on both the 
client side and the Ceph backend, and also to learn more about the problems you 
are having now or had in the past and managed to address. Would you be willing 
to discuss this further?

Many thanks

Andrei

----- Original Message -----
> From: "Jim Kilborn" <j...@kilborns.com>
> To: "Joao Eduardo Luis" <j...@suse.de>, "ceph-users" 
> <ceph-users@lists.ceph.com>
> Sent: Thursday, 9 February, 2017 13:04:16
> Subject: Re: [ceph-users] ceph-mon memory issue jewel 10.2.5 kernel 4.4

> Joao,
> 
> Here is the information requested. Thanks for taking a look. Note that the
> output below is from after I restarted the ceph-mon processes yesterday. If
> this is not acceptable, I will have to wait until the issue reappears. This
> is on a small cluster: 4 Ceph nodes and 6 Ceph kernel clients running over
> InfiniBand.
> 
> [root@empire-ceph02 log]# ceph -s
>
>    cluster 62ed97d6-adf4-12e4-8fd5-3d9701b22b87
>     health HEALTH_OK
>     monmap e3: 3 mons at
> {empire-ceph01=192.168.20.241:6789/0,empire-ceph02=192.168.20.242:6789/0,empire-ceph03=192.168.20.243:6789/0}
>            election epoch 56, quorum 0,1,2 empire-ceph01,empire-ceph02,empire-ceph03
>      fsmap e526: 1/1/1 up {0=empire-ceph03=up:active}, 1 up:standby
>     osdmap e361: 32 osds: 32 up, 32 in
>            flags sortbitwise,require_jewel_osds
>      pgmap v2427955: 768 pgs, 2 pools, 2370 GB data, 1759 kobjects
>            7133 GB used, 109 TB / 116 TB avail
>                 768 active+clean
>  client io 256 B/s wr, 0 op/s rd, 0 op/s wr
>
> [root@empire-ceph02 log]# ceph daemon mon.empire-ceph02 ops
>
> {
>    "ops": [],
>    "num_ops": 0
> }
>
> [root@empire-ceph02 mon]# du -sh ceph-empire-ceph02
>
> 30M     ceph-empire-ceph02
>
> [root@empire-ceph02 mon]# ls -lR
>
> .:
> total 0
> drwxr-xr-x. 3 ceph ceph 46 Dec  6 14:26 ceph-empire-ceph02
>
> ./ceph-empire-ceph02:
> total 8
> -rw-r--r--. 1 ceph ceph    0 Dec  6 14:26 done
> -rw-------. 1 ceph ceph   77 Dec  6 14:26 keyring
> drwxr-xr-x. 2 ceph ceph 4096 Feb  9 06:58 store.db
>
> ./ceph-empire-ceph02/store.db:
> total 30056
> -rw-r--r--. 1 ceph ceph  396167 Feb  9 06:06 510929.sst
> -rw-r--r--. 1 ceph ceph  778898 Feb  9 06:56 511298.sst
> -rw-r--r--. 1 ceph ceph 5177344 Feb  9 07:01 511301.log
> -rw-r--r--. 1 ceph ceph 1491740 Feb  9 06:58 511305.sst
> -rw-r--r--. 1 ceph ceph 2162405 Feb  9 06:58 511306.sst
> -rw-r--r--. 1 ceph ceph 2162047 Feb  9 06:58 511307.sst
> -rw-r--r--. 1 ceph ceph 2104201 Feb  9 06:58 511308.sst
> -rw-r--r--. 1 ceph ceph 2146113 Feb  9 06:58 511309.sst
> -rw-r--r--. 1 ceph ceph 2123659 Feb  9 06:58 511310.sst
> -rw-r--r--. 1 ceph ceph 2162927 Feb  9 06:58 511311.sst
> -rw-r--r--. 1 ceph ceph 2129640 Feb  9 06:58 511312.sst
> -rw-r--r--. 1 ceph ceph 2133590 Feb  9 06:58 511313.sst
> -rw-r--r--. 1 ceph ceph 2143906 Feb  9 06:58 511314.sst
> -rw-r--r--. 1 ceph ceph 2158434 Feb  9 06:58 511315.sst
> -rw-r--r--. 1 ceph ceph 1649589 Feb  9 06:58 511316.sst
> -rw-r--r--. 1 ceph ceph      16 Feb  8 13:42 CURRENT
> -rw-r--r--. 1 ceph ceph       0 Dec  6 14:26 LOCK
> -rw-r--r--. 1 ceph ceph  983040 Feb  9 06:58 MANIFEST-503363
> 
> From: Joao Eduardo Luis <j...@suse.de>
> Sent: Thursday, February 9, 2017 3:06 AM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph-mon memory issue jewel 10.2.5 kernel 4.4
> 
> Hi Jim,
> 
> On 02/08/2017 07:45 PM, Jim Kilborn wrote:
>> I have had two Ceph monitor nodes generate swap space alerts this week.
>> Looking at the memory, I see ceph-mon using a lot of memory and most of the
>> swap space. My Ceph nodes have 128GB RAM, with 2GB swap (I know the
>> memory/swap ratio is odd)
>>
>> When I get the alert, I see the following
> [snip]
>> [root@empire-ceph02 ~]# ps -aux | egrep 'ceph-mon|MEM'
>>
>> USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>>
>> ceph     174239  0.3 45.8 62812848 60405112 ?   Ssl   2016 269:08
>> /usr/bin/ceph-mon -f --cluster ceph --id empire-ceph02 --setuser ceph
>> --setgroup ceph
>>
>> [snip]
>>
>>
>> Is this a setting issue? Or Maybe a bug?
>> When I look at the other ceph-mon processes on other nodes, they aren’t using
>> any swap, and only about 500MB of memory.
> 
> Can you get us the result of `ceph -s`, of `ceph daemon mon.ID ops`, and
> the size of your monitor's data directory? The latter, ideally,
> recursive with the sizes of all the children in the tree (which,
> assuming they're a lot, would likely be better on a pastebin).
> 
>   -Joao
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
