Another strange thing is that the last 24 PGs never seem to become ready and are stuck at creating (after 6 hours of waiting):
[root@serverA ~]# ceph -s
2015-03-30 17:14:48.720396 7feb5bd7a700  0 -- :/1000964 >> 10.???.78:6789/0 pipe(0x7feb60026120 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7feb600263b0).fault
    cluster c09277a4-0eb9-41b1-b27f-a345c0169715
     health HEALTH_WARN 24 pgs peering; 24 pgs stuck inactive; 24 pgs stuck unclean
     monmap e1: 2 mons at {mac0090fa6aaf7a=10.240.212.78:6789/0,mac0090fa6ab68a=10.???.80:6789/0}, election epoch 10, quorum 0,1 mac0090fa6aaf7a,mac0090fa6ab68a
     osdmap e102839: 22 osds: 22 up, 22 in
      pgmap v210270: 512 pgs, 1 pools, 0 bytes data, 0 objects
            51633 MB used, 63424 GB / 63475 GB avail
                  24 creating+peering
                 488 active+clean

And from serverA I cannot retrieve the file that I put into the Ceph cluster from serverB:

[root@serverA ~]# rados -p test32 get test.txt test.txt
2015-03-30 17:15:44.014158 7f06951b6700  0 -- 10.???.80:0/1002224 >> 10.???.78:6867/29047 pipe(0x21e0f90 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x21e1220).fault
2015-03-30 17:16:36.066125 7f0694fb4700  0 -- 10.???.80:0/1002224 >> 10.????.78:6867/29047 pipe(0x7f068000d880 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f068000db10).fault

It looks like it just hangs there forever. Is it waiting for all PGs to become ready, or is the Ceph cluster in an error state?

________________________________
From: Yueliang [yueliang9...@gmail.com]
Sent: Monday, March 30, 2015 1:50 PM
To: ceph-users@lists.ceph.com; Kai KH Huang
Subject: RE: [ceph-users] Ceph osd is all up and in, but every pg is incomplete

I think there is no other way. :)

--
Yueliang
Sent with Airmail

On March 30, 2015 at 13:17:55, Kai KH Huang (huangk...@lenovo.com) wrote:

Thanks for the quick response, and it seems to work! But what I expect to have is replica number = 3 on two servers (one host stores 2 copies, and the other stores the 3rd one -- to deal with disk failure, rather than only server failure). Is there a simple way to configure that, rather than building a custom CRUSH map?

________________________________
From: Yueliang [yueliang9...@gmail.com]
Sent: Monday, March 30, 2015 12:04 PM
To: ceph-users@lists.ceph.com; Kai KH Huang
Subject: Re: [ceph-users] Ceph osd is all up and in, but every pg is incomplete

Hi Kai KH,

"ceph -s" reports "493 pgs undersized". I guess you created the pool with the default parameter size=3, but you only have two hosts, so there are not enough hosts to serve the pool. You should add a host, set size=2 when creating the pool, or modify the CRUSH rule.

--
Yueliang
Sent with Airmail
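For reference, the kind of custom CRUSH rule being discussed here (two copies on one host, the third copy on the other host) might look roughly like the sketch below. This is only a sketch: the rule name and ruleset id are made up, and the root name "default" is assumed from the ceph osd tree output further down in this thread.

# Pick 2 hosts, then up to 2 OSDs under each chosen host.
# With pool size=3 this places 2 copies on one host and 1 on the other.
rule replicated_two_hosts {
        ruleset 1
        type replicated
        min_size 2
        max_size 3
        step take default
        step choose firstn 2 type host
        step chooseleaf firstn 2 type osd
        step emit
}

It would be applied through the usual crushtool round trip (the pool name below is a placeholder):

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# add the rule above to crushmap.txt, then recompile and inject it
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
ceph osd pool set <pool> crush_ruleset 1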
On March 30, 2015 at 11:16:38, Kai KH Huang (huangk...@lenovo.com) wrote:

Hi, all. I'm a newbie to Ceph, and just set up a whole new Ceph cluster (0.87) with two servers. But its status is always warning:

[root@serverA ~]# ceph osd tree
# id    weight  type name       up/down reweight
-1      62.04   root default
-2      36.4            host serverA
0       3.64                    osd.0   up      1
2       3.64                    osd.2   up      1
1       3.64                    osd.1   up      1
3       3.64                    osd.3   up      1
4       3.64                    osd.4   up      1
5       3.64                    osd.5   up      1
6       3.64                    osd.6   up      1
7       3.64                    osd.7   up      1
8       3.64                    osd.8   up      1
9       3.64                    osd.9   up      1
-3      25.64           host serverB
10      3.64                    osd.10  up      1
11      2                       osd.11  up      1
12      2                       osd.12  up      1
13      2                       osd.13  up      1
14      2                       osd.14  up      1
15      2                       osd.15  up      1
16      2                       osd.16  up      1
17      2                       osd.17  up      1
18      2                       osd.18  up      1
19      2                       osd.19  up      1
20      2                       osd.20  up      1
21      2                       osd.21  up      1

[root@serverA ~]# ceph -s
    cluster ???????????????169715
     health HEALTH_WARN 493 pgs degraded; 19 pgs peering; 19 pgs stuck inactive; 512 pgs stuck unclean; 493 pgs undersized
     monmap e1: 2 mons at {serverB=10.??????.78:6789/0,serverA=10.?????.80:6789/0}, election epoch 10, quorum 0,1 mac0090fa6aaf7a,mac0090fa6ab68a
     osdmap e92634: 22 osds: 22 up, 22 in
      pgmap v189018: 512 pgs, 1 pools, 0 bytes data, 0 objects
            49099 MB used, 63427 GB / 63475 GB avail
                 493 active+undersized+degraded
                  19 creating+peering

[root@serverA ~]# rados -p test31 ls
2015-03-30 09:57:18.607143 7f5251fcf700  0 -- :/1005913 >> 10.??????.78:6789/0 pipe(0x140a370 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x140a600).fault
2015-03-30 09:57:21.610994 7f52484ad700  0 -- 10.????.80:0/1005913 >> 10.????.78:6835/27111 pipe(0x140e010 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x140e2a0).fault
2015-03-30 10:02:21.650191 7f52482ab700  0 -- 10.????.80:0/1005913 >> 10.????78:6835/27111 pipe(0x7f5238016c80 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5238016f10).fault

* serverA is 10.???.80, serverB is 10.????.78
* ntpdate is updated
* I tried to remove the pool, re-create it, and clean up all objects inside, but no change at all
* firewalls are both shut off

Any clue is welcome, thanks.
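A few commands that usually narrow this kind of problem down; these are illustrative only, and the PG id below is a made-up placeholder to be replaced with one of the ids actually reported as stuck:

ceph health detail            # lists exactly which PGs are stuck and why
ceph pg dump_stuck inactive   # shows the creating+peering PGs and the OSDs they map to
ceph pg 1.2f query            # 1.2f is an example id; the "recovery_state" section shows what the PG is waiting for

And if the cluster stays at two hosts for now, dropping the pool to two replicas (as suggested above) avoids the undersized state without a custom CRUSH map, for example:

ceph osd pool set test31 size 2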
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com