Re: [ceph-users] ceph-mon cpu usage

2015-07-30 Thread Quentin Hartman
avior when the servers are not in time sync. > Check your ntp settings > > Dieter > > From: ceph-users on behalf of Quentin > Hartman > Date: Wednesday, July 29, 2015 at 5:47 PM > To: Luis Periquito > Cc: Ceph Users > Subject: Re: [ceph-users] ceph-mon cpu usage

Re: [ceph-users] ceph-mon cpu usage

2015-07-30 Thread Spillmann, Dieter
Cc: Ceph Users <ceph-users@lists.ceph.com> Subject: Re: [ceph-users] ceph-mon cpu usage I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of three mons eat all CPU, cluster becomes unusably slow). It seems to be tied t

Re: [ceph-users] ceph-mon cpu usage

2015-07-29 Thread Quentin Hartman
I just had my ceph cluster, which is running 0.87.1, exhibit this behavior (two of three mons eat all CPU, cluster becomes unusably slow). It seems to be tied to deep scrubbing, as the behavior surfaces almost immediately if that is turned on, but if it is off the behavior eventually seems to return

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Luis Periquito
I think I figured it out! All 4 of the OSDs on one host (OSD 107-110) were sending massive amounts of auth requests to the monitors, seeming to overwhelm them. The weird bit is that I removed them (osd crush remove, auth del, osd rm), dd the box and all of the disks, reinstalled and guess what? They are

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Kjetil Jørgensen
It sounds slightly similar to what I just experienced. I had one monitor out of three which seemed to essentially run one core at full tilt continuously, and had its virtual address space allocated at the point where top started calling it Tb. Requests hitting this monitor did not get very timel

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Luis Periquito
The leveldb is smallish: around 70 MB. I ran debug mon = 10 for a while, but couldn't find any interesting information. I would run out of space quite quickly though, as the log partition only has 10 GB. On 24 Jul 2015 21:13, "Mark Nelson" wrote: > On 07/24/2015 02:31 PM, Luis Periquito wrote: > >>

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Mark Nelson
On 07/24/2015 02:31 PM, Luis Periquito wrote: Now it's official, I have a weird one! Restarted one of the ceph-mons with jemalloc and it didn't make any difference. It's still using a lot of cpu and still not freeing up memory... The issue is that the cluster almost stops responding to request

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Luis Periquito
Now it's official, I have a weird one! Restarted one of the ceph-mons with jemalloc and it didn't make any difference. It's still using a lot of cpu and still not freeing up memory... The issue is that the cluster almost stops responding to requests, and if I restart the primary mon (that had al

Re: [ceph-users] ceph-mon cpu usage

2015-07-24 Thread Jan Schermer
You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It should be possible to do something like LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd … The last time we tried, it segfaulted after a few minutes, so YMMV and be careful. Jan > On 23 Jul 2015, at 18:18, Luis
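The LD_PRELOAD approach suggested above can be sketched as a small helper that builds the environment for a child process. This is only an illustration: the function name `preload_env` is hypothetical, the library path is the one quoted in the message and will differ between distributions, and the daemon arguments are left out because the message elides them.

```python
import os

# Path copied from the message above; adjust it for your distribution.
JEMALLOC = "/usr/lib/x86_64-linux-gnu/libjemalloc.so.1"


def preload_env(lib_path, base_env=None):
    """Return a copy of the environment with lib_path prepended to LD_PRELOAD.

    Hypothetical helper: it only builds the environment mapping and does
    not start any process.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon-separated list of libraries.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


env = preload_env(JEMALLOC, base_env={"PATH": "/usr/bin"})
print(env["LD_PRELOAD"])
```

The resulting mapping can be passed as the `env` argument of `subprocess.run` when starting the daemon by hand. Given the segfault reported in the message, it seems wise to try this on a single non-critical daemon first.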

Re: [ceph-users] ceph-mon cpu usage

2015-07-23 Thread Luis Periquito
Hi Greg, I've been looking at the tcmalloc issues, but those seemed to affect OSDs, and I do notice them in heavy read workloads (even after the patch and increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728). This is affecting the mon process though. Looking at perf top I'm getting most of the

Re: [ceph-users] ceph-mon cpu usage

2015-07-23 Thread Gregory Farnum
On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito wrote:
> The ceph-mon is already taking a lot of memory, and I ran a heap stats
>
> MALLOC:       32391696 (   30.9 MiB) Bytes in use by application
> MALLOC: +  27597135872 (26318.7 MiB) Bytes in page

Re: [ceph-users] ceph-mon cpu usage

2015-07-23 Thread Luis Periquito
The ceph-mon is already taking a lot of memory, and I ran a heap stats
MALLOC:       32391696 (   30.9 MiB) Bytes in use by application
MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
MALLOC: +     16598552 (   15.8 MiB) Bytes in cen
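To make the imbalance in those heap stats concrete, here is a minimal parsing sketch. It assumes each line keeps the `MALLOC: [+] <bytes> (<MiB> MiB) <label>` layout seen in the message; the names `parse_heap_stats` and `mib` are hypothetical, and only the two lines that appear complete above are used as sample input.

```python
import re

# The two complete lines from the heap stats output quoted above.
HEAP_STATS = """\
MALLOC:       32391696 (   30.9 MiB) Bytes in use by application
MALLOC: +  27597135872 (26318.7 MiB) Bytes in page heap freelist
"""

# Byte count, then the parenthesised MiB figure, then the label.
LINE_RE = re.compile(r"^MALLOC:\s*\+?\s*(\d+)\s*\(\s*[\d.]+\s*MiB\)\s*(.+)$")


def parse_heap_stats(text):
    """Return {label: bytes} for every line matching the assumed layout."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stats[match.group(2).strip()] = int(match.group(1))
    return stats


def mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / (1024 * 1024)


stats = parse_heap_stats(HEAP_STATS)
in_use = stats["Bytes in use by application"]
freelist = stats["Bytes in page heap freelist"]

print(f"in use by application: {mib(in_use):.1f} MiB")
print(f"page heap freelist:    {mib(freelist):.1f} MiB")
print(f"freelist is roughly {freelist / in_use:.0f} times the in-use size")
```

Read this way, almost all of the memory the process holds sits in the allocator's page heap freelist rather than in live application data.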

Re: [ceph-users] ceph-mon cpu usage

2015-07-22 Thread Luis Periquito
This cluster is serving RBD storage for OpenStack, and today all the I/O just stopped. Looking at the boxes, ceph-mon was using 17 GB of RAM - and this was on *all* the mons. Restarting the main one just made it work again (I restarted the other ones because they were using a lot of RAM). This h