One of our tests last night failed in a weird way.  We started with a
three-node cluster with three monitors, expanded to a 5-node cluster
with 5 monitors, and dropped back to a 4-node cluster with three
monitors.

The sequence of events was:

start 3 monitors (monitors 0, 1, 2) - monmap e1
add one node
restart the 3 monitors
add another node
add monitor 4 - monmap e2
restart monitor 0
add monitor 3 - monmap e3
restart monitor 1
restart monitor 2
shut down the server with monitor 4 on it
remove monitor 4 - monmap e4
restart monitor 0
mon.0 had an odd time sync problem and respawned
stop monitor 3
remove monitor 3

At that point (08:23:52 in the log), ceph stopped responding (as if
quorum was lost).  Note that we do not see a new monmap (e5) created
by the removal of monitor 3.
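
To make the expected monmap history easier to follow, here is a rough
sketch that replays only the add/remove steps from the list above and
prints the monitor set and the majority needed at each epoch.  It is an
illustration, not Ceph code: the function names are made up, and it
assumes quorum means a strict majority of the monitors in the current
monmap.  The last entry (e5) is the map we expected to see but never did.

```python
def majority(n):
    """Smallest monitor count that is more than half of n."""
    return n // 2 + 1


def replay(initial, changes):
    """Replay monmap changes; return (epoch, members, majority) tuples.

    Hypothetical helper for illustration only. Each change bumps the
    monmap epoch by one, matching e1..e4 in the event list.
    """
    members = set(initial)
    epoch = 1
    history = [(epoch, sorted(members), majority(len(members)))]
    for action, mon in changes:
        if action == "add":
            members.add(mon)
        elif action == "remove":
            members.discard(mon)
        else:
            raise ValueError("unknown action: %r" % (action,))
        epoch += 1
        history.append((epoch, sorted(members), majority(len(members))))
    return history


# Only the steps that change the monmap, in the order listed above.
changes = [("add", 4), ("add", 3), ("remove", 4), ("remove", 3)]
history = replay([0, 1, 2], changes)

for epoch, members, need in history:
    print("e%d: mons %s, majority %d" % (epoch, members, need))
```

Under that assumption, e4 holds monitors 0 to 3 and needs three of them
to agree.  With monitor 3 stopped, that leaves no margin: all of
monitors 0, 1 and 2 would have to be up and in sync for the removal of
monitor 3 to be committed as e5.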

See the (sort of) full log at: