On 12/10/2015 04:00 AM, deeepdish wrote:
> Hello,
> 
> I encountered a strange issue when rebuilding monitors reusing same
> hostnames, however different IPs.
> 
> Steps to reproduce:
> 
> - Build monitor using ceph-deploy mon create <hostname1>
> - Remove monitor
> via http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/ 
> <http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/>
> (remove monitor); I didn't realize there was a ceph-deploy mon destroy
> command at this point.
> - Build a new monitor on the same hardware using ceph-deploy mon create
> <hostname1a>  # reason = to rename / change the IP of the monitor as per the above link
> - Monitor ends up in probing mode.  When connecting via the admin
> socket, I see that there are no peers available.
> 
> The above behavior occurs only when reinstalling monitors.  I even tried
> reinstalling the OS; however, there's a monmap embedded somewhere causing
> the previous monitor hostnames/IPs to conflict with the new monitor's
> peering ability.

> 
> On a reinstalled (not working) monitor:
> 
> sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.smg01.asok mon_status
> {
>    "name": "smg01",
>    "rank": 0,
>    "state": "probing",
>    "election_epoch": 0,
>    "quorum": [],
>    "outside_quorum": [
>        "smg01"
>    ],
>    "extra_probe_peers": [
>        "10.20.1.8:6789\/0",
>        "10.20.10.14:6789\/0",
>        "10.20.10.16:6789\/0",
>        "10.20.10.18:6789\/0",
>        "10.20.10.251:6789\/0",
>        "10.20.10.252:6789\/0"
>    ],
[snip]
> }


> 
> This appears to be consistent with a wrongly populated 'mon_host' and
> 'mon_initial_members' in your ceph.conf.
> 
>  -Joao



Thanks Joao.  I had a look, but my other 3 monitors are working just fine.  To be
clear, I've confirmed the same behaviour on other monitor nodes that have been
removed from the cluster and rebuilt with a new IP (but the same name).

[global]
fsid = (hidden)
mon_initial_members = smg01, smon01s, smon02s, b02s08
mon_host = 10.20.10.250, 10.20.10.251, 10.20.10.252, 10.20.1.8
public network = 10.20.10.0/24, 10.20.1.0/24
cluster network = 10.20.41.0/24

. . . 

[mon.smg01s]
#host = smg01s.erbus.kupsta.net
host = smg01s
addr = 10.20.10.250:6789

[mon.smon01s]
#host = smon01s.erbus.kupsta.net
host = smon01s
addr = 10.20.10.251:6789

[mon.smon02s]
#host = smon02s.erbus.kupsta.net
host = smon02s
addr = 10.20.10.252:6789

[mon.b02s08]
#host = b02s08.erbus.kupsta.net
host = b02s08
addr = 10.20.1.8:6789
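
A quick way to cross-check Joao's point is to compare the file above with the map the three working monitors actually agree on; from any node that can reach the quorum with the admin key (a sketch):

    ceph mon dump

Each name/address pair in that dump should line up with mon_initial_members, mon_host, and the [mon.*] sections; any entry that is missing or points at an old address is a likely source of the probing behaviour.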

# sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.smg01.asok mon_status
{
    "name": "smg01",
    "rank": 0,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "outside_quorum": [
        "smg01"
    ],
    "extra_probe_peers": [
        "10.20.1.8:6789\/0",
        "10.20.10.251:6789\/0",
        "10.20.10.252:6789\/0"
    ],
    "sync_provider": [],
    "monmap": {
        "epoch": 0,
        "fsid": “(hidden)",
        "modified": "0.000000",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "smg01",
                "addr": "10.20.10.250:6789\/0"
            },
            {
                "rank": 1,
                "name": "smon01s",
                "addr": "0.0.0.0:0\/1"
            },
            {
                "rank": 2,
                "name": "smon02s",
                "addr": "0.0.0.0:0\/2"
            },
            {
                "rank": 3,
                "name": "b02s08",
                "addr": "0.0.0.0:0\/3"
            }
        ]
    }
}
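
The monmap this daemon holds is still at epoch 0 and lists its peers as 0.0.0.0, i.e. it never learned the real map. One way out of that state is to hand the daemon a current map from the quorum side (a sketch: ceph mon getmap and monmaptool come from the add-or-rm-mons document linked above, --inject-monmap from ceph-mon itself; host names and addresses are taken from this thread):

    # on a node that can reach the working monitors
    ceph mon getmap -o /tmp/monmap
    monmaptool --print /tmp/monmap

    # if the fetched map still carries a stale entry for smg01, rewrite it
    monmaptool --rm smg01 /tmp/monmap
    monmaptool --add smg01 10.20.10.250:6789 /tmp/monmap

    # copy /tmp/monmap to the rebuilt node, stop ceph-mon there, then:
    ceph-mon -i smg01 --inject-monmap /tmp/monmap

and start the monitor again.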

Processes running on the monitor node that’s in probing state:

# ps -ef | grep ceph
root      1140     1  0 Dec11 ?        00:05:07 python /usr/sbin/ceph-create-keys --cluster ceph -i smg01
root      6406     1  0 Dec11 ?        00:05:10 python /usr/sbin/ceph-create-keys --cluster ceph -i smg01
root      7712     1  0 Dec11 ?        00:05:09 python /usr/sbin/ceph-create-keys --cluster ceph -i smg01
root      9105     1  0 Dec11 ?        00:05:11 python /usr/sbin/ceph-create-keys --cluster ceph -i smg01
root     13098 30548  0 07:18 pts/1    00:00:00 grep --color=auto ceph
root     14243     1  0 Dec11 ?        00:05:09 python /usr/sbin/ceph-create-keys --cluster ceph -i smg01
root     31222     1  0 05:39 ?        00:00:00 /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i smg01 --pid-file /var/run/ceph/mon.smg01.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root     31226 31222  1 05:39 ?        00:01:39 /usr/bin/ceph-mon -i smg01 --pid-file /var/run/ceph/mon.smg01.pid -c /etc/ceph/ceph.conf --cluster ceph -f
root     31228     1  0 05:39 pts/1    00:00:15 python /usr/sbin/ceph-create-keys --cluster ceph -i smg01
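
The several ceph-create-keys processes above are a symptom rather than a cause: ceph-create-keys waits for the monitor to reach quorum before it can generate the bootstrap keys, so each attempt to start the monitor has likely left another copy looping while the mon is stuck probing. They finish on their own once the monitor joins a quorum; the stale ones can also simply be killed (assuming nothing else on the node matches that pattern):

    pkill -f ceph-create-keys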

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
