Hi all,

I have a cluster with 28 nodes (all physical, 4 cores, 32 GB RAM); each node
has 4 OSDs, for a total of 112 OSDs. Each OSD holds 106 PGs (counted
including replication). There are 3 MONs in this cluster.
I'm running Ubuntu Trusty with kernel 3.13.0-52-generic and Hammer (0.94.2).
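
For reference, the figures above come from the usual status commands,
roughly the following:

  # Overall status: health, number of MONs and OSDs
  ceph -s
  # Per-OSD utilisation and PG-replica counts (the PGS column)
  ceph osd df
  # OSD-to-host layout
  ceph osd tree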

This cluster was installed with Hammer (0.94.1) and has since only been
upgraded to the latest available version (0.94.2).

Of the three MONs, one is mostly idle, one is using ~170% CPU, and one is
using ~270% CPU. The roles shift as I restart the processes (usually the
idle one is the one with the lowest uptime).

Running perf top against the ceph-mon PID on the non-idle boxes yields
something like this (the exact invocation is sketched after the output):

  4.62%  libpthread-2.19.so    [.] pthread_mutex_unlock
  3.95%  libpthread-2.19.so    [.] pthread_mutex_lock
  3.91%  libsoftokn3.so        [.] 0x000000000001db26
  2.38%  [kernel]              [k] _raw_spin_lock
  2.09%  libtcmalloc.so.4.1.2  [.] operator new(unsigned long)
  1.79%  ceph-mon              [.] DispatchQueue::enqueue(Message*, int, unsigned long)
  1.62%  ceph-mon              [.] RefCountedObject::get()
  1.58%  libpthread-2.19.so    [.] pthread_mutex_trylock
  1.32%  libtcmalloc.so.4.1.2  [.] operator delete(void*)
  1.24%  libc-2.19.so          [.] 0x0000000000097fd0
  1.20%  ceph-mon              [.] ceph::buffer::ptr::release()
  1.18%  ceph-mon              [.] RefCountedObject::put()
  1.15%  libfreebl3.so         [.] 0x00000000000542a8
  1.05%  [kernel]              [k] update_cfs_shares
  1.00%  [kernel]              [k] tcp_sendmsg

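(The invocation was roughly the following; pidof is just shorthand for the
ceph-mon PID mentioned above.)

  # Sample the running monitor process in place
  sudo perf top -p $(pidof ceph-mon)
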
The cluster is mostly idle, and it's healthy. The mon store is 69 MB, and
the MONs are consuming around 700 MB of RAM.
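
(Store size and memory were read roughly like this; "mon01" stands in for
the local mon ID.)

  # Size of the monitor's leveldb store
  du -sh /var/lib/ceph/mon/ceph-mon01/store.db
  # Resident memory of the ceph-mon process
  ps -C ceph-mon -o rss,cmd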

Any ideas on this situation? Is it safe to ignore?
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
