Hi,
I am running a 3-node Deis cluster with Ceph as the underlying FS, so
Ceph runs inside Docker containers on three separate servers.
I rebooted all three nodes (almost at once). After the reboot, the Ceph
monitors refuse to connect to each other.
Symptoms are:
- no quorum formed,
- ceph admin socket file does not exist
- only the following appears in the ceph log:
Dec 14 16:38:44 deis-1 sh[933]: 2014-12-14 08:38:44.265419 7f5cec71f700 0 -- :/1000021 >> 10.132.183.191:6789/0 pipe(0x7f5ce40296a0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5ce4029930).fault
Dec 14 16:38:44 deis-1 sh[933]: 2014-12-14 08:38:44.265419 7f5cec71f700 0 -- :/1000021 >> 10.132.183.192:6789/0 pipe(0x7f5ce40296a0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5ce4029930).fault
Dec 14 16:38:50 deis-1 sh[933]: 2014-12-14 08:38:50.267398 7f5cec71f700 0 -- :/1000021 >> 10.132.183.190:6789/0 pipe(0x7f5cd40030e0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5cd4003370).fault
...keep repeating...
This is *my /etc/ceph/ceph.conf file*:
[global]
fsid = cc368515-9dc6-48e2-9526-58ac4cbb3ec9
mon initial members = deis-3
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd pool default size = 3
osd pool default min_size = 1
osd pool default pg_num = 128
osd pool default pgp_num = 128
osd recovery delay start = 15
log file = /dev/stdout

[mon.deis-3]
host = deis-3
mon addr = 10.132.183.190:6789

[mon.deis-1]
host = deis-1
mon addr = 10.132.183.191:6789

[mon.deis-2]
host = deis-2
mon addr = 10.132.183.192:6789

[client.radosgw.gateway]
host = deis-store-gateway
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
log file = /dev/stdout
*iptables rules on the Docker host:*
core@deis-3 ~ $ sudo iptables --list
Chain INPUT (policy DROP)
target          prot opt source          destination
Firewall-INPUT  all  --  anywhere        anywhere

Chain FORWARD (policy DROP)
target          prot opt source          destination
ACCEPT          tcp  --  anywhere        172.17.0.2    tcp dpt:http
ACCEPT          tcp  --  anywhere        172.17.0.2    tcp dpt:https
ACCEPT          tcp  --  anywhere        172.17.0.2    tcp dpt:2222
ACCEPT          all  --  anywhere        anywhere      ctstate RELATED,ESTABLISHED
ACCEPT          all  --  anywhere        anywhere
ACCEPT          all  --  anywhere        anywhere
Firewall-INPUT  all  --  anywhere        anywhere

Chain OUTPUT (policy ACCEPT)
target          prot opt source          destination

Chain Firewall-INPUT (2 references)
target          prot opt source          destination
ACCEPT          all  --  anywhere        anywhere
ACCEPT          icmp --  anywhere        anywhere      icmp echo-reply
ACCEPT          icmp --  anywhere        anywhere      icmp destination-unreachable
ACCEPT          icmp --  anywhere        anywhere      icmp time-exceeded
ACCEPT          icmp --  anywhere        anywhere      icmp echo-request
ACCEPT          all  --  anywhere        anywhere      ctstate RELATED,ESTABLISHED
ACCEPT          all  --  10.132.183.190  anywhere
ACCEPT          all  --  10.132.183.192  anywhere
ACCEPT          all  --  10.132.183.191  anywhere
ACCEPT          all  --  anywhere        anywhere
ACCEPT          tcp  --  anywhere        anywhere      ctstate NEW multiport dports ssh,2222,http,https
LOG             all  --  anywhere        anywhere      LOG level warning
REJECT          all  --  anywhere        anywhere      reject-with icmp-host-prohibited
All private IPs are pingable from within the ceph monitor container. What
could I do next to troubleshoot this issue?
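
Ping only exercises ICMP, so one check I have not done yet is whether the
monitor port (6789) accepts TCP connections between the nodes. Below is a
minimal sketch of such a check, assuming Python 3 is available inside the
monitor container. The addresses are the ones from my ceph.conf; the names
tcp_reachable and check_monitors are just illustrative helpers, not part
of Ceph or Deis.

```python
import socket

# Monitor addresses copied from the ceph.conf in this post.
MON_ADDRS = [
    ("10.132.183.190", 6789),  # mon.deis-3
    ("10.132.183.191", 6789),  # mon.deis-1
    ("10.132.183.192", 6789),  # mon.deis-2
]


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_monitors(addrs=MON_ADDRS, timeout=2.0):
    """Map "host:port" to True/False for each monitor address."""
    return {f"{h}:{p}": tcp_reachable(h, p, timeout) for h, p in addrs}
```

Calling print(check_monitors()) from each monitor container would show
which peers accept connections on 6789. A False result only says the port
could not be opened; it does not by itself distinguish a firewall or Docker
networking problem from a monitor process that is not listening.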
Thanks a lot!
- Jimmy Chu
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com