Hello Stefan,
1. The OSD flags noout, nobackfill, and norecover were set before shutting down:
$ ceph osd set noout
$ ceph osd set nobackfill
$ ceph osd set norecover
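Before the shutdown, the flags can be double-checked in the osdmap with "ceph osd dump | grep ^flags". A minimal sketch of parsing that line (the output format is assumed from Nautilus, and has_flag is a hypothetical helper; a sample line stands in for the live command here):

```shell
# has_flag "<flags line>" <flag> -- true if <flag> appears in the
# comma-separated flags line printed by `ceph osd dump`.
has_flag() {
  case ",${1#flags },," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample line; on a live cluster use: ceph osd dump | grep ^flags
sample="flags noout,nobackfill,norecover,sortbitwise,purged_snapdirs"
for f in noout nobackfill norecover; do
  has_flag "$sample" "$f" && echo "$f: set" || echo "$f: NOT set"
done
```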
2. firewalld is disabled; the mon listens on port 6789 (msgr v1), but nothing listens on 3300 (msgr v2):
[root@ceph-node1 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
[root@ceph-node1 ~]# netstat -antp | grep 6789
tcp   0   0 192.168.1.6:6789   0.0.0.0:*   LISTEN   474841/ceph-mon
[root@ceph-node1 ~]# netstat -antp | grep 3300
[root@ceph-node1 ~]#
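Since only 6789 is listening and 3300 is not, a quick reachability check shows what a client would see for each messenger port. A minimal sketch using bash's built-in /dev/tcp redirection (check_port is a hypothetical helper; the host 192.168.1.6 is taken from the netstat output above):

```shell
# Report whether a TCP port accepts connections, using bash's
# /dev/tcp redirection (no netcat required).
check_port() {
  host=$1
  port=$2
  if timeout 2 bash -c "echo > /dev/tcp/$host/$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

check_port 192.168.1.6 6789   # mon msgr v1 -- "open" per the netstat output
check_port 192.168.1.6 3300   # mon msgr v2 -- "closed" here
```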
3. The OSD, MDS, and MGR logs are all empty:
[root@ceph-node1 ceph]# ls -lh *.log
-rw------- 1 ceph ceph 0 Dec 11 03:09 ceph.audit.log
-rw------- 1 ceph ceph 3.7K Dec 11 08:36 ceph.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-mds.ceph-node1.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-mgr.ceph-node1.log
-rw-r--r-- 1 ceph ceph 2.2M Dec 11 14:42 ceph-mon.ceph-node1.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.0.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.10.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.11.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.1.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.2.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.3.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.4.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.5.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.6.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.7.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.8.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-osd.9.log
-rw-r--r--. 1 ceph ceph 0 Dec 9 03:19 ceph-rgw-ceph-node1.rgw0.log
-rw-r--r-- 1 root root 0 Dec 11 03:09 ceph-volume.log
-rw-r--r-- 1 root root 0 Dec 11 03:09 ceph-volume-systemd.log
4. [root@ceph-node1 ceph]# ceph -s
just blocks, then fails with error 111 (connection refused) after a few hours.
------------------ Original Message ------------------
From: "Stefan Kooman" <ste...@bit.nl>
Sent: Wednesday, December 11, 2019, 2:37 PM
To: "Cc" <o...@qq.com>
Cc: "ceph-users" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] ceph-mon is blocked after shutting down and ip address changed
Quoting Cc (o...@qq.com):
> ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
>
> os: CentOS Linux release 7.7.1908 (Core)
> single-node ceph cluster with 1 mon, 1 mgr, 1 mds, 1 rgw and 12 osds, but only cephfs is used.
> "ceph -s" is blocked after shutting down the machine (192.168.0.104); the ip address then changed to 192.168.1.6
>
> I created the monmap with monmaptool, updated ceph.conf and the hosts file, and then started ceph-mon.
> and the ceph-mon log:
> ...
> 2019-12-11 08:57:45.170 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1285.14s
> 2019-12-11 08:57:50.170 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1290.14s
> 2019-12-11 08:57:55.171 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1295.14s
> 2019-12-11 08:58:00.171 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1300.14s
> 2019-12-11 08:58:05.172 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1305.14s
> 2019-12-11 08:58:10.171 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1310.14s
> 2019-12-11 08:58:15.173 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1315.14s
> 2019-12-11 08:58:20.173 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1320.14s
> 2019-12-11 08:58:25.174 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1325.14s
>
> ...
>
>
> I changed the IP back to 192.168.0.104 yesterday, but the result is the same.
Just checking here: do you run a firewall? Is port 3300 open (besides
6789)?
What do you see in the logs of the MDS and the OSDs? There are timers
configured in the MON / OSD in case they cannot reach each other in
time. OSDs might get marked out. But I'm unsure what the status of
your cluster is. Could you paste a "ceph -s"?
Gr. Stefan
P.S. BTW: is this running in production?
--
| BIT BV  https://www.bit.nl/  Kamer van Koophandel 09090351
| GPG: 0xD14839C6  +31 318 648 688 / i...@bit.nl
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com