On Mon, Oct 14, 2019 at 01:40:19PM +0200, Harald Staub wrote:
> Probably the same problem here. When I try to add another MON, "ceph
> health" becomes mostly unresponsive. One of the existing ceph-mon
> processes uses 100% CPU for several minutes. I tried it on 2 test
> clusters (14.2.4, 3 MONs, 5 storage nodes with around 2 HDD OSDs
> each). To avoid errors like "lease timeout", I temporarily increased
> "mon lease" from 5 to 50 seconds.
> 
> Not sure how bad it is from a customer PoV. But it is a problem by
> itself to be several minutes without "ceph health", when there is an
> increased risk of losing the quorum ...

Hi Harry,

Thanks a lot for your reply! I'm not sure we're experiencing the same issue,
since I don't see it on any other cluster. When this happens to you, does
only ceph health stop working, or does it also block all client IO?

BR

nik


> 
>  Harry
> 
> On 13.10.19 20:26, Nikola Ciprich wrote:
> >dear ceph users and developers,
> >
> >on one of our production clusters, we got into a pretty unpleasant situation.
> >
> >After rebooting one of the nodes, starting its monitor makes the whole
> >cluster appear to hang, including IO, ceph -s, etc. When this mon is stopped
> >again, everything resumes. Trying to spawn a new monitor leads to the same
> >problem (even on a different node).
> >
> >I had to give up after minutes of outage, since that is unacceptable. I think
> >we had this problem once in the past on this cluster, but after some (much
> >shorter) time the monitor joined and it has worked fine since then.
> >
> >All cluster nodes are CentOS 7 machines. I have 3 monitors (so 2 are now
> >running) and I'm using Ceph 13.2.6.
> >
> >Network connection seems to be fine.
> >
> >Has anyone seen a similar problem? I'd be very grateful for tips on how to
> >debug and solve this.
> >
> >For those interested, here's a log from one of the running monitors with
> >debug_mon set to 10/10:
> >
> >https://storage.lbox.cz/public/d258d0
> >
> >If I can provide more info, please let me know.
> >
> >with best regards
> >
> >nikola ciprich
> >
> 

-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-------------------------------------
