Hi all,

my third monitor (mon3) is not able to rejoin the cluster, which consists of
mon1, mon2 and mon3 and runs stable emperor.
I tried recreating the monmap and injecting it into all three monitors, but
only mon1 and mon2 came up and joined.

With debugging enabled on mon3, I get the following:

2014-01-30 08:51:03.823669 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3 handle_probe_reply mon.1 c7b12656-15a6-41b0-963f-4f47c62497dc name ceph-mon2 quorum 0,1 paxos( fc 1 lc 160 )) v5
2014-01-30 08:51:03.823678 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  monmap is e3: 3 mons at 
2014-01-30 08:51:03.823701 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  peer name is mon.ceph-mon2
2014-01-30 08:51:03.823706 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  existing quorum 0,1
2014-01-30 08:51:03.823708 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  peer paxos version 160 vs my version 154 (ok)
2014-01-30 08:51:03.823711 7f39b3f56700 10 mon.ceph-mon3@2(probing) e3  ready to join, but i'm not in the monmap or my addr is blank, trying to join

But why does mon3 believe it is not in the monmap ("but i'm not in the monmap")?

Checking the sources, I found:
-->         if (monmap->contains(name) &&
-->             !monmap->get_addr(name).is_blank_ip()) {
              // i'm part of the cluster; just initiate a new election
            } else {
              dout(10) << " ready to join, but i'm not in the monmap or my addr is blank, trying to join" << dendl;
              messenger->send_message(new MMonJoin(monmap->fsid, name, 
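
To make the branch above easier to reason about, here is a simplified Python stand-in for it. It is not Ceph code: the monmap is modelled as a plain dict keyed by monitor name, `is_blank_ip` and `next_step` are hypothetical helpers, and treating an empty or `0.0.0.0` IP part as "blank" is an assumption that ignores IPv6. The addresses in the usage lines are documentation placeholders, not the real ones.

```python
# Simplified model of the quoted branch; NOT the actual Ceph implementation.
# Assumption: the monmap maps monitor name -> "ip:port/nonce" string.

def is_blank_ip(addr: str) -> bool:
    """Return True if the IP part of 'ip:port/nonce' is empty or 0.0.0.0."""
    host_port = addr.split("/", 1)[0]      # drop the "/nonce" suffix
    ip = host_port.rsplit(":", 1)[0]       # drop the ":port" suffix
    return ip in ("", "0.0.0.0")


def next_step(monmap: dict, name: str) -> str:
    """Mirror the if/else: known name with a usable address, or not."""
    if name in monmap and not is_blank_ip(monmap[name]):
        return "start election"            # "i'm part of the cluster"
    return "send join request"             # "trying to join"


# Placeholder addresses from the 192.0.2.0/24 documentation range.
print(next_step({"a": "192.0.2.1:6789/0"}, "a"))
print(next_step({"a": "192.0.2.1:6789/0"}, "b"))
```

In this model the join path is taken either when the name is missing from the map or when the stored address is blank, which matches the two cases named in the log message.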

My map on mon3 looks like this:

root@ceph-mon3:/var/log/ceph# ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon3.asok mon_status
{ "name": "ceph-mon3",
  "rank": 2,
  "state": "probing",
  "election_epoch": 0,
  "quorum": [],
  "outside_quorum": [],
  "extra_probe_peers": [],
  "sync_provider": [],
  "monmap": { "epoch": 3,
      "fsid": "c7b12656-15a6-41b0-963f-4f47c62497dc",
      "modified": "2014-01-30 08:27:28.808771",
      "created": "2014-01-30 08:27:28.808771",
      "mons": [
            { "rank": 0,
              "name": "mon.ceph-mon1",
              "addr": "\/0"},
            { "rank": 1,
              "name": "mon.ceph-mon2",
              "addr": "\/0"},
            { "rank": 2,
              "name": "mon.ceph-mon3",
              "addr": "\/0"}]}}

So the condition "(monmap->contains(name) &&
!monmap->get_addr(name).is_blank_ip())" should evaluate to true, shouldn't it?
Yet start_election() is not called.
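
One detail in the dump that may matter: the daemon reports its own name as "ceph-mon3", while the monmap entries are named "mon.ceph-mon1", "mon.ceph-mon2" and "mon.ceph-mon3". Assuming contains() does an exact string match on the name (an assumption, not verified against the emperor sources), the lookup would fail. A minimal Python sketch of that comparison, using only the names from the mon_status output; `name_in_monmap` is a hypothetical helper and the sample is a trimmed copy of the dump:

```python
import json


def name_in_monmap(mon_status_json: str) -> bool:
    """Check whether the daemon's own name appears verbatim in its monmap."""
    status = json.loads(mon_status_json)
    names = [mon["name"] for mon in status["monmap"]["mons"]]
    return status["name"] in names      # exact string match (assumption)


# Trimmed copy of the mon_status output: names and ranks only.
sample = """
{ "name": "ceph-mon3",
  "rank": 2,
  "monmap": { "epoch": 3,
      "mons": [
            { "rank": 0, "name": "mon.ceph-mon1"},
            { "rank": 1, "name": "mon.ceph-mon2"},
            { "rank": 2, "name": "mon.ceph-mon3"}]}}
"""

print(name_in_monmap(sample))  # prints False: "ceph-mon3" != "mon.ceph-mon3"
```

If that is what happens here, the names used when the monmap was recreated would be the first place to look.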

Can somebody help me here ?


More information about mon3:

root@ceph-mon3:/var/log/ceph# hostname -a

root@ceph-mon3:/var/log/ceph# netstat -tulpen | grep ceph-mon
tcp        0      0*               LISTEN      0          635369      2164/ceph-mon

root@ceph-mon3:/var/log/ceph# cat /etc/hosts
localhost
ceph-mon3.dtnet.de      ceph-mon3

admin@ceph-admin:~/cluster1$ ceph -s
    cluster c7b12656-15a6-41b0-963f-4f47c62497dc
     health HEALTH_WARN 192 pgs degraded; 192 pgs stale; 192 pgs stuck stale; 192 pgs stuck unclean; 1 mons down, quorum 0,1 ceph-mon1,ceph-mon2
     monmap e3: 3 mons at 
 election epoch 230, quorum 0,1 ceph-mon1,ceph-mon2
     osdmap e28: 1 osds: 1 up, 1 in
      pgmap v38: 192 pgs, 3 pools, 0 bytes data, 0 objects
            36388 kB used, 3724 GB / 3724 GB avail
                 192 stale+active+degraded

ceph-users mailing list
