Hi Chris,

Thanks for your answer. All the nodes are on AWS and I didn't change the security group configuration.
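Following your suggestion, here is roughly what I plan to run from ceph-osd-1 to check the path to ceph-osd-2 (where osd.2 and osd.3 live). This is just a sketch on my side: I'm assuming ceph-osd-2 answers on 10.200.1.12 as in the monmap, that the OSDs use the default 6800-7300 port range (6800 below is only an example, the real ports are in ceph osd dump), and a 1500-byte interface MTU (some AWS instances default to 9001 jumbo frames, so I'll confirm that first):

  # show the interface MTUs on this node (1500 vs 9001 jumbo frames)
  ip link show

  # path MTU check: send a non-fragmentable ping sized for a 1500-byte MTU
  # (1472 bytes of payload + 28 bytes of IP/ICMP headers = 1500)
  ping -M do -s 1472 -c 3 10.200.1.12

  # find the actual ports osd.2 and osd.3 are bound to
  ceph osd dump | grep -E '^osd\.(2|3) '

  # check one of those ports is reachable from here (6800 is just an example)
  nc -zv 10.200.1.12 6800

If the large ping or the port check fails, I'll look at the security group rules and the instance MTU settings.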
2015-12-18 15:41 GMT+01:00 Chris Dunlop <ch...@onthe.net.au>:
> Hi Reno,
>
> "Peering", as far as I understand it, is the osds trying to talk to each
> other.
>
> You have approximately 1 OSD worth of pgs stuck (i.e. 264 / 8), and osd.0
> appears in each of the stuck pgs, alongside either osd.2 or osd.3.
>
> I'd start by checking the comms between osd.0 and osds 2 and 3 (including
> the MTU).
>
> Cheers,
>
> Chris
>
> On Fri, Dec 18, 2015 at 02:50:18PM +0100, Reno Rainz wrote:
> > Hi all,
> >
> > I rebooted all my osd nodes; afterwards, some pgs got stuck in the peering state.
> >
> > root@ceph-osd-3:/var/log/ceph# ceph -s
> >     cluster 186717a6-bf80-4203-91ed-50d54fe8dec4
> >      health HEALTH_WARN
> >             clock skew detected on mon.ceph-osd-2
> >             33 pgs peering
> >             33 pgs stuck inactive
> >             33 pgs stuck unclean
> >             Monitor clock skew detected
> >      monmap e1: 3 mons at {ceph-osd-1=10.200.1.11:6789/0,ceph-osd-2=10.200.1.12:6789/0,ceph-osd-3=10.200.1.13:6789/0}
> >             election epoch 14, quorum 0,1,2 ceph-osd-1,ceph-osd-2,ceph-osd-3
> >      osdmap e66: 8 osds: 8 up, 8 in
> >       pgmap v1346: 264 pgs, 3 pools, 272 MB data, 653 objects
> >             808 MB used, 31863 MB / 32672 MB avail
> >                  231 active+clean
> >                   33 peering
> > root@ceph-osd-3:/var/log/ceph#
> >
> > root@ceph-osd-3:/var/log/ceph# ceph pg dump_stuck
> > ok
> > pg_stat  state    up     up_primary  acting  acting_primary
> > 4.2d     peering  [2,0]  2           [2,0]   2
> > 1.57     peering  [3,0]  3           [3,0]   3
> > 1.24     peering  [3,0]  3           [3,0]   3
> > 1.52     peering  [0,2]  0           [0,2]   0
> > 1.50     peering  [2,0]  2           [2,0]   2
> > 1.23     peering  [3,0]  3           [3,0]   3
> > 4.54     peering  [2,0]  2           [2,0]   2
> > 4.19     peering  [3,0]  3           [3,0]   3
> > 1.4b     peering  [0,3]  0           [0,3]   0
> > 1.49     peering  [0,3]  0           [0,3]   0
> > 0.17     peering  [0,3]  0           [0,3]   0
> > 4.17     peering  [0,3]  0           [0,3]   0
> > 4.16     peering  [0,3]  0           [0,3]   0
> > 0.10     peering  [0,3]  0           [0,3]   0
> > 1.11     peering  [0,2]  0           [0,2]   0
> > 4.b      peering  [0,2]  0           [0,2]   0
> > 1.3c     peering  [0,3]  0           [0,3]   0
> > 0.c      peering  [0,3]  0           [0,3]   0
> > 1.3a     peering  [3,0]  3           [3,0]   3
> > 0.38     peering  [2,0]  2           [2,0]   2
> > 1.39     peering  [0,2]  0           [0,2]   0
> > 4.33     peering  [2,0]  2           [2,0]   2
> > 4.62     peering  [2,0]  2           [2,0]   2
> > 4.3      peering  [0,2]  0           [0,2]   0
> > 0.6      peering  [0,2]  0           [0,2]   0
> > 0.4      peering  [2,0]  2           [2,0]   2
> > 0.3      peering  [2,0]  2           [2,0]   2
> > 1.60     peering  [0,3]  0           [0,3]   0
> > 0.2      peering  [3,0]  3           [3,0]   3
> > 4.6      peering  [3,0]  3           [3,0]   3
> > 1.30     peering  [0,3]  0           [0,3]   0
> > 1.2f     peering  [0,2]  0           [0,2]   0
> > 1.2a     peering  [3,0]  3           [3,0]   3
> > root@ceph-osd-3:/var/log/ceph#
> >
> > root@ceph-osd-3:/var/log/ceph# ceph osd tree
> > ID WEIGHT  TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY
> > -9 4.00000 root default
> > -8 4.00000     region eu-west-1
> > -6 2.00000         datacenter eu-west-1a
> > -2 2.00000             host ceph-osd-1
> >  0 1.00000                 osd.0              up  1.00000          1.00000
> >  1 1.00000                 osd.1              up  1.00000          1.00000
> > -4 2.00000             host ceph-osd-3
> >  4 1.00000                 osd.4              up  1.00000          1.00000
> >  5 1.00000                 osd.5              up  1.00000          1.00000
> > -7 2.00000         datacenter eu-west-1b
> > -3 2.00000             host ceph-osd-2
> >  2 1.00000                 osd.2              up  1.00000          1.00000
> >  3 1.00000                 osd.3              up  1.00000          1.00000
> > -5 2.00000             host ceph-osd-4
> >  6 1.00000                 osd.6              up  1.00000          1.00000
> >  7 1.00000                 osd.7              up  1.00000          1.00000
> > root@ceph-osd-3:/var/log/ceph#
> >
> > Do you guys have any idea? Why do they stay in this state?
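PS: I also noticed the clock skew warning on mon.ceph-osd-2 in the ceph -s output above, so I'll make sure the monitor clocks are in sync before digging further. A rough sketch of what I'd check, assuming the mon nodes run ntpd (the per-monitor skew amounts should show up in ceph health detail):

  # show how far this node's clock is from its NTP peers
  ntpq -p

  # show which monitor tripped the warning and by how much
  ceph health detail | grep -i skew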
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com