Hi,

On 18/04/2014 13:14, Ирек Фасихов wrote:
> Show command please: ceph osd tree.
Sure:

root@node1:~# ceph osd tree
# id    weight  type name       up/down reweight
-1      3       root default
-2      3               host node1
0       1                       osd.0   up      1
1       1                       osd.1   up      1
2       1                       osd.2   up      1

> 2014-04-18 14:51 GMT+04:00 Cedric Lemarchand <ced...@yipikai.org>:
>
> Hi,
>
> I am facing a strange behaviour where a pool is stuck. I have no idea
> how this pool appeared in the cluster, in the sense that I have not
> played with pool creation *yet*.
>
> ##### root@node1:~# ceph -s
>     cluster 1b147882-722c-43d8-8dfb-38b78d9fbec3
>      health HEALTH_WARN 333 pgs degraded; 333 pgs stuck unclean; pool .rgw.buckets has too few pgs
>      monmap e1: 1 mons at {node1=127.0.0.1:6789/0}, election epoch 1, quorum 0 node1
>      osdmap e154: 3 osds: 3 up, 3 in
>       pgmap v16812: 3855 pgs, 14 pools, 41193 MB data, 24792 objects
>             57236 MB used, 644 GB / 738 GB avail
>                 3522 active+clean
>                  333 active+degraded
>
> ##### root@node1:/etc/ceph# ceph osd dump
> epoch 154
> fsid 1b147882-722c-43d8-8dfb-38b78d9fbec3
> created 2014-04-16 20:46:46.516403
> modified 2014-04-18 12:14:29.052231
> flags
>
> pool 0 'data' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 crash_replay_interval 45
> pool 1 'metadata' rep size 1 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
> pool 2 'rbd' rep size 1 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
> pool 3 '.rgw.root' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 16 owner 0
> pool 4 '.rgw.control' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 18 owner 0
> pool 5 '.rgw' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 20 owner 0
> pool 6 '.rgw.gc' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 21 owner 0
> pool 7 '.users.uid' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 22 owner 0
> pool 8 '.users' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 26 owner 0
> pool 9 '.users.swift' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 28 owner 0
> pool 10 '.users.email' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 56 owner 0
> pool 11 '.rgw.buckets.index' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 58 owner 18446744073709551615
> pool 12 '.rgw.buckets' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 60 owner 18446744073709551615
> pool 13 '' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 333 pgp_num 333 last_change 146 owner 18446744073709551615
>
> max_osd 5
> osd.0 up in weight 1 up_from 151 up_thru 151 down_at 148 last_clean_interval [144,147) 192.168.1.18:6800/26681 192.168.1.18:6801/26681 192.168.1.18:6802/26681 192.168.1.18:6803/26681 exists,up f6f63e8a-42af-4dda-b523-ffb835165420
> osd.1 up in weight 1 up_from 149 up_thru 149 down_at 148 last_clean_interval [139,147) 192.168.1.18:6805/26685 192.168.1.18:6806/26685 192.168.1.18:6807/26685 192.168.1.18:6808/26685 exists,up fa4689ac-e0ca-4ec3-ab2a-6afa57cc7498
> osd.2 up in weight 1 up_from 153 up_thru 153 down_at 148 last_clean_interval [141,147) 192.168.1.18:6810/26691 192.168.1.18:6811/26691 192.168.1.18:6812/26691 192.168.1.18:6813/26691 exists,up 6b2f7e3f-619c-4922-bdf9-bb0f2eee7413
>
> ##### root@node1:/etc/ceph# ceph pg dump_stuck unclean | sort
> 13.0    0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:28.438523  0'0  154:13  [0]  [0]  0'0  2014-04-18 11:12:05.322855  0'0  2014-04-18 11:12:05.322855
> 13.100  0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:26.110633  0'0  154:13  [0]  [0]  0'0  2014-04-18 11:12:06.318159  0'0  2014-04-18 11:12:06.318159
> 13.10   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:37.081087  0'0  154:12  [2]  [2]  0'0  2014-04-18 11:12:05.642317  0'0  2014-04-18 11:12:05.642317
> 13.1    0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:20.874829  0'0  154:13  [1]  [1]  0'0  2014-04-18 11:12:05.580874  0'0  2014-04-18 11:12:05.580874
> 13.101  0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:16.723100  0'0  154:14  [1]  [1]  0'0  2014-04-18 11:12:06.540975  0'0  2014-04-18 11:12:06.540975
> 13.102  0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:35.795491  0'0  154:12  [2]  [2]  0'0  2014-04-18 11:12:06.543846  0'0  2014-04-18 11:12:06.543846
> 13.103  0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:35.809492  0'0  154:12  [2]  [2]  0'0  2014-04-18 11:12:06.561542  0'0  2014-04-18 11:12:06.561542
> 13.104  0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:35.817750  0'0  154:12  [2]  [2]  0'0  2014-04-18 11:12:06.569706  0'0  2014-04-18 11:12:06.569706
> 13.105  0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:35.840668  0'0  154:12  [2]  [2]  0'0  2014-04-18 11:12:06.602826  0'0  2014-04-18 11:12:06.602826
>
> [...]
> 13.f7   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:16.990648  0'0  154:14  [1]  [1]  0'0  2014-04-18 11:12:06.483859  0'0  2014-04-18 11:12:06.483859
> 13.f8   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:35.947686  0'0  154:12  [2]  [2]  0'0  2014-04-18 11:12:06.481459  0'0  2014-04-18 11:12:06.481459
> 13.f9   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:35.961392  0'0  154:12  [2]  [2]  0'0  2014-04-18 11:12:06.505039  0'0  2014-04-18 11:12:06.505039
> 13.fa   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:17.062254  0'0  154:14  [1]  [1]  0'0  2014-04-18 11:12:06.493605  0'0  2014-04-18 11:12:06.493605
> 13.fb   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:17.058748  0'0  154:14  [1]  [1]  0'0  2014-04-18 11:12:06.526013  0'0  2014-04-18 11:12:06.526013
> 13.fc   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:26.277414  0'0  154:13  [0]  [0]  0'0  2014-04-18 11:12:06.243714  0'0  2014-04-18 11:12:06.243714
> 13.fd   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:26.312618  0'0  154:13  [0]  [0]  0'0  2014-04-18 11:12:06.263824  0'0  2014-04-18 11:12:06.263824
> 13.fe   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:35.977273  0'0  154:12  [2]  [2]  0'0  2014-04-18 11:12:06.511879  0'0  2014-04-18 11:12:06.511879
> 13.ff   0  0  0  0  0  0  0  active+degraded  2014-04-18 12:14:26.262810  0'0  154:13  [0]  [0]  0'0  2014-04-18 11:12:06.289603  0'0  2014-04-18 11:12:06.289603
> pg_stat objects mip degr unf bytes log disklog state state_stamp v reported up acting last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp
>
> ##### root@node1:~# rados df
> pool name           category  KB        objects  clones  degraded  unfound  rd     rd KB   wr      wr KB
>                     -         0         0        0       0         0        0      0       0       0
> .rgw                -         1         5        0       0         0        31     23      17      6
> .rgw.buckets        -         42182267  24733    0       0         0        4485   17420   163372  50559394
> .rgw.buckets.index  -         0         3        0       0         0        47113  105894  44735   0
> .rgw.control        -         0         8        0       0         0        0      0       0       0
> .rgw.gc             -         0         32       0       0         0        7114   7704    8524    0
> .rgw.root           -         1         3        0       0         0        16     10      3       3
> .users              -         1         2        0       0         0        0      0       2       2
> .users.email        -         1         1        0       0         0        0      0       1       1
> .users.swift        -         1         2        0       0         0        5      3       2       2
> .users.uid          -         1         3        0       0         0        52     46      16      6
> data                -         0         0        0       0         0        0      0       0       0
> metadata            -         0         0        0       0         0        0      0       0       0
> rbd                 -         0         0        0       0         0        0      0       0       0
>   total used        58610648  24792
>   total avail       676160692
>   total space       774092940
>
> The pool seems empty, so I tried to remove it, but the command complains
> about the empty name. The last modification that was done here was changing
> "osd pool default size" in ceph.conf from 1 to 2 and restarting the whole
> cluster (mon + osd); AFAICR the cluster was healthy before doing that.
>
> This is a small test bed, so everything can be trashed, but I am still a
> bit curious about what happened and how it could be fixed.
>
> Cheers
>
> --
> Cédric
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> --
> Best regards, Фасихов Ирек Нургаязович
> Mob.: +79229045757

--
Cédric
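PS: two things stand out in the dumps above, and both can be sketched as shell steps. This is a sketch under assumptions, not a verified fix: it assumes a 2014-era (Emperor/Firefly) toolchain, that pool 13 really holds no data, and the CRUSH rule text below is the stock default, reproduced for illustration; always work from your own decompiled map.

```shell
# Suspect 1: the empty-named pool 13. "ceph osd pool delete" rejects an
# empty name, but "rados rmpool" may accept one when it is passed as an
# explicitly quoted empty string (behaviour is version-dependent):
#
#   rados rmpool "" "" --yes-i-really-really-mean-it
#
# Suspect 2: the 333 active+degraded PGs. Pool 13 has "rep size 2", but
# all three OSDs sit on the single host node1, and the default CRUSH
# rule ("step chooseleaf firstn 0 type host") will not place two
# replicas on one host, so the second copy can never be mapped. On a
# live single-host cluster the usual workaround is to retype the
# failure domain from "host" to "osd":
#
#   ceph osd getcrushmap -o /tmp/cm
#   crushtool -d /tmp/cm -o /tmp/cm.txt
#   (edit the rule as below)
#   crushtool -c /tmp/cm.txt -o /tmp/cm.new
#   ceph osd setcrushmap -i /tmp/cm.new
#
# The edit itself, demonstrated on a copy of the default rule text:
cat > /tmp/cm.txt <<'EOF'
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
EOF
sed -i 's/chooseleaf firstn 0 type host/chooseleaf firstn 0 type osd/' /tmp/cm.txt
grep 'chooseleaf' /tmp/cm.txt
```

For pools created from now on, the same effect is available without editing the map via "osd crush chooseleaf type = 0" in ceph.conf, which changes the failure domain used when the default rule is generated.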