Re: [ceph-users] Luminous cluster stuck when adding monitor

2017-10-07 Thread Nico Schottelius
Good evening Joao, we double checked our MTUs, they are all 9200 on the servers and 9212 on the switches. And we have no problems transferring big files in general (as opennebula copies around images for importing, we do this quite a lot). So if you could have a look, it would be much appreciate
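
A large file copy over TCP does not by itself prove that full-size jumbo frames pass end to end, so it can be worth probing the path with unfragmentable pings. The following is only a sketch: the helper name `icmp_payload` is made up for illustration, the peer name `server2` is taken from the thread, and the 20-byte default assumes IPv4 (pass 40 for IPv6).

```shell
#!/bin/sh
# Largest ICMP echo payload that fits into a single packet of the given MTU.
# Usage: icmp_payload MTU [IP_HEADER_BYTES]
# The IP header defaults to 20 bytes (IPv4 without options); use 40 for IPv6.
# The ICMP / ICMPv6 echo header adds another 8 bytes.
icmp_payload() {
    mtu=$1
    ip_header=${2:-20}
    echo $(( mtu - ip_header - 8 ))
}

# Hypothetical usage with Linux iputils ping ("-M do" forbids fragmentation):
#   ping -M do -c 3 -s "$(icmp_payload 9200)" server2
# If this fails while a smaller payload works, some hop has a lower MTU.
```

With a server MTU of 9200 this gives a payload of 9172 bytes for IPv4.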

Re: [ceph-users] Luminous cluster stuck when adding monitor

2017-10-04 Thread Joao Eduardo Luis
On 10/04/2017 09:19 PM, Gregory Farnum wrote: Oh, hmm, you're right. I see synchronization starts but it seems to progress very slowly, and it certainly doesn't complete in that 2.5 minute logging window. I don't see any clear reason why it's so slow; it might be more clear if you could provide

Re: [ceph-users] Luminous cluster stuck when adding monitor

2017-10-04 Thread Gregory Farnum
Oh, hmm, you're right. I see synchronization starts but it seems to progress very slowly, and it certainly doesn't complete in that 2.5 minute logging window. I don't see any clear reason why it's so slow; it might be more clear if you could provide logs of the other monitors at the same time (especial

Re: [ceph-users] Luminous cluster stuck when adding monitor

2017-10-04 Thread Nico Schottelius
Some more detail: when restarting the monitor on server1, it stays in synchronizing state forever. However, the other two monitors change into electing state. I have double checked that there are no (host) firewalls active and that the clocks of the hosts are within 1 second of each other (they all
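
Note that Ceph monitors warn about clock skew at a much tighter bound than one second (the `mon clock drift allowed` option defaults to 0.05 s). A minimal sketch for comparing two timestamps against such a bound; the function name `within_skew` is hypothetical and both timestamps are assumed to be given in milliseconds since the epoch:

```shell
#!/bin/sh
# Succeeds (exit status 0) when two timestamps differ by at most MAX_MS.
# Usage: within_skew T1_MS T2_MS MAX_MS
within_skew() {
    diff=$(( $1 - $2 ))
    # Take the absolute value of the difference.
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -le "$3" ]
}

# Hypothetical usage (GNU date; the ssh round trip adds its own error,
# so this is only a coarse check and no replacement for NTP or chrony):
#   within_skew "$(date +%s%3N)" "$(ssh server2 date +%s%3N)" 50 \
#       || echo "clocks differ by more than 50 ms"
```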

Re: [ceph-users] Luminous cluster stuck when adding monitor

2017-10-04 Thread Nico Schottelius
Hello Gregory, the logfile I produced already has debug mon = 20 set: [21:03:51] server1:~# grep "debug mon" /etc/ceph/ceph.conf debug mon = 20 It is clear that server1 is out of quorum; however, how do we make it part of the quorum again? I expected that the quorum finding process is tri

Re: [ceph-users] Luminous cluster stuck when adding monitor

2017-10-04 Thread Gregory Farnum
You'll need to change the config so that it's running "debug mon = 20" for the log to be very useful here. It does say that it's dropping client connections because it's been out of quorum for too long, which is the correct behavior in general. I'd imagine that you've got clients trying to connect
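
To confirm which debug level a given config file actually carries, something along these lines may help. It is a sketch only: the helper name `debug_mon_levels` is invented here, it accepts both the `debug mon` and `debug_mon` spellings, and it does not model which config section takes precedence.

```shell
#!/bin/sh
# Print the value of every "debug mon = ..." (or "debug_mon = ...") line
# found in the given ceph.conf, one value per line.
# Usage: debug_mon_levels /etc/ceph/ceph.conf
debug_mon_levels() {
    sed -n 's/^[[:space:]]*debug[ _]mon[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1"
}

# A running monitor only picks up a config file change after a restart.
# The level can also be changed at runtime through the admin socket, which
# works even while the monitor is out of quorum. The monitor id "server1"
# is an assumption based on the host name in the thread:
#   ceph daemon mon.server1 config set debug_mon 20
```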

[ceph-users] Luminous cluster stuck when adding monitor

2017-10-04 Thread Nico Schottelius
Good morning, we have recently upgraded our kraken cluster to luminous and since then noticed an odd behaviour: we cannot add a monitor anymore. As soon as we start a new monitor (server2), ceph -s and ceph -w start to hang. The situation became worse, since one of our staff stopped an existing