Another strange thing is that the last 24 PGs never seem to become ready and are stuck at creating (after 6 hours of waiting):
[root@serverA ~]# ceph -s
2015-03-30 17:14:48.720396 7feb5bd7a700  0 -- :/1000964 >> 10.???.78:6789/0 pipe(0x7feb60026120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7feb600263b0).fault
    cluster c09277a4-0eb9-41b1-b27f-a345c0169715
     health HEALTH_WARN 24 pgs peering; 24 pgs stuck inactive; 24 pgs stuck unclean
     monmap e1: 2 mons at {mac0090fa6aaf7a=10.240.212.78:6789/0,mac0090fa6ab68a=10.???.80:6789/0}, election epoch 10, quorum 0,1 mac0090fa6aaf7a,mac0090fa6ab68a
     osdmap e102839: 22 osds: 22 up, 22 in
      pgmap v210270: 512 pgs, 1 pools, 0 bytes data, 0 objects
            51633 MB used, 63424 GB / 63475 GB avail
                  24 creating+peering
                 488 active+clean

And from serverA I cannot retrieve the file that I put into the Ceph cluster from serverB:

[root@serverA ~]# rados -p test32 get test.txt test.txt
2015-03-30 17:15:44.014158 7f06951b6700  0 -- 10.???.80:0/1002224 >> 10.???.78:6867/29047 pipe(0x21e0f90 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x21e1220).fault
2015-03-30 17:16:36.066125 7f0694fb4700  0 -- 10.???.80:0/1002224 >> 10.????.78:6867/29047 pipe(0x7f068000d880 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f068000db10).fault

It looks like it just hangs there forever. Is it waiting for all PGs to become ready, or is the Ceph cluster in an error state?

________________________________
From: Yueliang [yueliang9...@gmail.com]
Sent: Monday, March 30, 2015 1:50 PM
To: ceph-users@lists.ceph.com; Kai KH Huang
Subject: RE: [ceph-users] Ceph osd is all up and in, but every pg is incomplete

I think there is no other way. :)

--
Yueliang
Sent with Airmail

On March 30, 2015 at 13:17:55, Kai KH Huang (huangk...@lenovo.com) wrote:

Thanks for the quick response, and it seems to work! But what I expect to have is replica number = 3 on two servers (one host stores 2 copies, and the other stores the 3rd one -- to deal with disk failure, rather than only server failure). Is there a simple way to configure that, rather than building a custom CRUSH map?

________________________________
From: Yueliang [yueliang9...@gmail.com]
Sent: Monday, March 30, 2015 12:04 PM
To: ceph-users@lists.ceph.com; Kai KH Huang
Subject: Re: [ceph-users] Ceph osd is all up and in, but every pg is incomplete

Hi Kai KH,

"ceph -s" reports "493 pgs undersized". I guess you created the pool with the default parameter size=3, but you only have two hosts, so there are not enough hosts to serve the pool. You should add a host, set size=2 when creating the pool, or modify the CRUSH rule.

--
Yueliang
Sent with Airmail
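For reference, the kind of custom CRUSH rule being discussed here (two copies on one host, the third copy on the other host) might look roughly like the sketch below. This is only a sketch: the rule name and ruleset id are made up, and the root name "default" is assumed from the ceph osd tree output further down in this thread.

# Pick 2 hosts, then up to 2 OSDs under each chosen host.
# With pool size=3 this places 2 copies on one host and 1 on the other.
rule replicated_two_hosts {
        ruleset 1
        type replicated
        min_size 2
        max_size 3
        step take default
        step choose firstn 2 type host
        step chooseleaf firstn 2 type osd
        step emit
}

It would be applied through the usual crushtool round trip (the pool name below is a placeholder):

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# add the rule above to crushmap.txt, then recompile and inject it
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
ceph osd pool set <pool> crush_ruleset 1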
On March 30, 2015 at 11:16:38, Kai KH Huang (huangk...@lenovo.com) wrote:

Hi, all. I'm a newbie to Ceph, and just set up a whole new Ceph cluster (0.87) with two servers. But its status is always warning:

[root@serverA ~]# ceph osd tree
# id    weight  type name       up/down reweight
-1      62.04   root default
-2      36.4            host serverA
0       3.64                    osd.0   up      1
2       3.64                    osd.2   up      1
1       3.64                    osd.1   up      1
3       3.64                    osd.3   up      1
4       3.64                    osd.4   up      1
5       3.64                    osd.5   up      1
6       3.64                    osd.6   up      1
7       3.64                    osd.7   up      1
8       3.64                    osd.8   up      1
9       3.64                    osd.9   up      1
-3      25.64           host serverB
10      3.64                    osd.10  up      1
11      2                       osd.11  up      1
12      2                       osd.12  up      1
13      2                       osd.13  up      1
14      2                       osd.14  up      1
15      2                       osd.15  up      1
16      2                       osd.16  up      1
17      2                       osd.17  up      1
18      2                       osd.18  up      1
19      2                       osd.19  up      1
20      2                       osd.20  up      1
21      2                       osd.21  up      1

[root@serverA ~]# ceph -s
    cluster ???????????????169715
     health HEALTH_WARN 493 pgs degraded; 19 pgs peering; 19 pgs stuck inactive; 512 pgs stuck unclean; 493 pgs undersized
     monmap e1: 2 mons at {serverB=10.??????.78:6789/0,serverA=10.?????.80:6789/0}, election epoch 10, quorum 0,1 mac0090fa6aaf7a,mac0090fa6ab68a
     osdmap e92634: 22 osds: 22 up, 22 in
      pgmap v189018: 512 pgs, 1 pools, 0 bytes data, 0 objects
            49099 MB used, 63427 GB / 63475 GB avail
                 493 active+undersized+degraded
                  19 creating+peering

[root@serverA ~]# rados -p test31 ls
2015-03-30 09:57:18.607143 7f5251fcf700  0 -- :/1005913 >> 10.??????.78:6789/0 pipe(0x140a370 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x140a600).fault
2015-03-30 09:57:21.610994 7f52484ad700  0 -- 10.????.80:0/1005913 >> 10.????.78:6835/27111 pipe(0x140e010 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x140e2a0).fault
2015-03-30 10:02:21.650191 7f52482ab700  0 -- 10.????.80:0/1005913 >> 10.????78:6835/27111 pipe(0x7f5238016c80 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5238016f10).fault

* serverA is 10.???.80, serverB is 10.????.78
* ntpdate is updated
* I tried to remove the pool, re-create it, and clean up all objects inside, but no change at all
* firewalls are both shut off

Any clue is welcome, thanks.
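A few commands that usually narrow this kind of problem down; these are illustrative only, and the PG id below is a made-up placeholder to be replaced with one of the ids actually reported as stuck:

ceph health detail            # lists exactly which PGs are stuck and why
ceph pg dump_stuck inactive   # shows the creating+peering PGs and the OSDs they map to
ceph pg 1.2f query            # 1.2f is an example id; the "recovery_state" section shows what the PG is waiting for

And if the cluster stays at two hosts for now, dropping the pool to two replicas (as suggested above) avoids the undersized state without a custom CRUSH map, for example:

ceph osd pool set test31 size 2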
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com