Thanks, Ashley.

Should I expect the deep-scrubbing to start immediately?

[root@mgmt01 ~]# ceph pg deep-scrub 1.65
instructing pg 1.65 on osd.62 to deep-scrub
[root@mgmt01 ~]# ceph pg ls deep_scrub
pg_stat objects mip     degr    misp    unf     bytes   log     disklog state   
state_stamp     v       reported        up      up_primary      acting  
acting_primary  last_scrub      scrub_stamp     last_deep_scrub deep_scrub_stamp
16.75   430657  0       0       0       0       30754735820     3007    3007    
active+clean+scrubbing+deep     2018-11-11 11:05:11.572325      39934'549067    
39934:1311893   [4,64,35]       4       [4,64,35]       4       28743'539264    
2018-11-07 02:17:53.293336      28743'539264    2018-11-03 14:39:44.837702
16.86   430617  0       0       0       0       30316842298     3048    3048    
active+clean+scrubbing+deep     2018-11-11 15:56:30.148527      39934'548012    
39934:1038058   [18,2,62]       18      [18,2,62]       18      26347'529815    
2018-10-28 01:06:55.526624      26347'529815    2018-10-28 01:06:55.526624
16.eb   432196  0       0       0       0       30612459543     3071    3071    
active+clean+scrubbing+deep     2018-11-11 11:02:46.993022      39934'550340    
39934:3662047   [56,44,42]      56      [56,44,42]      56      28507'540255    
2018-11-02 03:28:28.013949      28507'540255    2018-11-02 03:28:28.013949
16.f3   431399  0       0       0       0       30672009253     3067    3067    
active+clean+scrubbing+deep     2018-11-11 17:40:55.732162      39934'549240    
39934:2212192   [69,82,6]       69      [69,82,6]       69      28743'539336    
2018-11-02 17:22:05.745972      28743'539336    2018-11-02 17:22:05.745972
16.f7   430885  0       0       0       0       30796505272     3100    3100    
active+clean+scrubbing+deep     2018-11-11 22:50:05.231599      39934'548910    
39934:683169    [59,63,119]     59      [59,63,119]     59      28743'539167    
2018-11-03 07:24:43.776341      26347'530830    2018-10-28 04:44:12.276982
16.14c  430565  0       0       0       0       31177011073     3042    3042    
active+clean+scrubbing+deep     2018-11-11 20:11:31.107313      39934'550564    
39934:1545200   [41,12,70]      41      [41,12,70]      41      28743'540758    
2018-11-03 23:04:49.155741      28743'540758    2018-11-03 23:04:49.155741
16.156  430356  0       0       0       0       31021738479     3006    3006    
active+clean+scrubbing+deep     2018-11-11 20:44:14.019537      39934'549241    
39934:2958053   [83,47,1]       83      [83,47,1]       83      28743'539462    
2018-11-04 14:46:56.890822      28743'539462    2018-11-04 14:46:56.890822
16.19f  431613  0       0       0       0       30746145827     3063    3063    
active+clean+scrubbing+deep     2018-11-11 19:06:40.693002      39934'549429    
39934:1189872   [14,54,37]      14      [14,54,37]      14      28743'539660    
2018-11-04 18:25:13.225962      26347'531345    2018-10-28 20:08:45.286421
16.1b1  431225  0       0       0       0       30988996529     3048    3048    
active+clean+scrubbing+deep     2018-11-11 20:12:35.367935      39934'549604    
39934:778127    [34,106,11]     34      [34,106,11]     34      26347'531560    
2018-10-27 16:49:46.944748      26347'531560    2018-10-27 16:49:46.944748
16.1e2  431724  0       0       0       0       30247732969     3070    3070    
active+clean+scrubbing+deep     2018-11-11 20:55:17.591646      39934'550105    
39934:1428341   [103,48,3]      103     [103,48,3]      103     28743'540270    
2018-11-06 03:36:30.531106      28507'539840    2018-11-02 01:08:23.268409
16.1f3  430604  0       0       0       0       30633545866     3039    3039    
active+clean+scrubbing+deep     2018-11-11 20:15:28.557464      39934'548804    
39934:1354817   [66,102,33]     66      [66,102,33]     66      28743'538896    
2018-11-04 04:59:33.118414      28743'538896    2018-11-04 04:59:33.118414
[root@mgmt01 ~]# ceph pg ls inconsistent
pg_stat objects mip     degr    misp    unf     bytes   log     disklog state   
state_stamp     v       reported        up      up_primary      acting  
acting_primary  last_scrub      scrub_stamp     last_deep_scrub deep_scrub_stamp
1.65    12806   0       0       0       0       30010463024     3008    3008    
active+clean+inconsistent       2018-11-10 00:16:43.965966      39934'184512    
39934:388820    [62,67,47]      62      [62,67,47]      62      28743'183853    
2018-11-04 01:31:27.042458      28743'183853    2018-11-04 01:31:27.042458
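
1.65 itself still isn’t showing up in that deep_scrub list. My understanding
(please correct me if I’m wrong) is that the command only flags the PG, and
the primary, osd.62, won’t actually start until it has a free scrub slot;
osd_max_scrubs defaults to 1, and the replicas (67, 47) need to reserve a
slot as well. This is roughly what I plan to use to watch for it and, if
needed, nudge it along; the injectargs bit is untested on this cluster:

# poll until 1.65 shows up among the deep-scrubbing PGs
watch -n 60 "ceph pg ls deep_scrub | grep '^1\.65'"

# temporarily allow a second concurrent scrub on the primary so the queued
# deep scrub can start sooner (injectargs isn't persistent; set it back to
# 1 once the scrub has run)
ceph tell osd.62 injectargs '--osd_max_scrubs 2'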

It’s similar to when I issued “ceph pg repair 1.65”: the command reported
that it was instructing osd.62 to repair 1.65, and then nothing seemed to
happen.
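
Assuming the deep scrub does eventually run and records the error, my plan
is roughly the following; please tell me if the order of operations is
wrong, or if repair isn’t the right tool once I can see what’s damaged:

# inspect which object/shard the deep scrub flagged as inconsistent
rados list-inconsistent-obj 1.65 --format=json-pretty

# then re-issue the repair and confirm the PG drops out of
# active+clean+inconsistent
ceph pg repair 1.65
ceph health detail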

-kc

K.C. Wong
kcw...@verseon.com
M: +1 (408) 769-8235

-----------------------------------------------------
Confidentiality Notice:
This message contains confidential information. If you are not the
intended recipient and received this message in error, any use or
distribution is strictly prohibited. Please also notify us
immediately by return e-mail, and delete this message from your
computer system. Thank you.
-----------------------------------------------------
4096R/B8995EDE <https://sks-keyservers.net/pks/lookup?op=get&search=0x23A692E9B8995EDE>
E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
hkps://hkps.pool.sks-keyservers.net

> On Nov 11, 2018, at 10:22 PM, Ashley Merrick <singap...@amerrick.co.uk> wrote:
> 
> You need to run "ceph pg deep-scrub 1.65" first
> 
> On Mon, Nov 12, 2018 at 2:20 PM K.C. Wong <kcw...@verseon.com> wrote:
> Hi Brad,
> 
> I got the following:
> 
> [root@mgmt01 ~]# ceph health detail
> HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
> pg 1.65 is active+clean+inconsistent, acting [62,67,47]
> 1 scrub errors
> [root@mgmt01 ~]# rados list-inconsistent-obj 1.65
> No scrub information available for pg 1.65
> error 2: (2) No such file or directory
> [root@mgmt01 ~]# rados list-inconsistent-snapset 1.65
> No scrub information available for pg 1.65
> error 2: (2) No such file or directory
> 
> Rather odd output, I’d say; not that I understand what it means. I also
> tried rados list-inconsistent-pg:
> 
> [root@mgmt01 ~]# rados lspools
> rbd
> cephfs_data
> cephfs_metadata
> .rgw.root
> default.rgw.control
> default.rgw.data.root
> default.rgw.gc
> default.rgw.log
> ctrl-p
> prod
> corp
> camp
> dev
> default.rgw.users.uid
> default.rgw.users.keys
> default.rgw.buckets.index
> default.rgw.buckets.data
> default.rgw.buckets.non-ec
> [root@mgmt01 ~]# for i in $(rados lspools); do rados list-inconsistent-pg $i; 
> done
> []
> ["1.65"]
> []
> []
> []
> []
> []
> []
> []
> []
> []
> []
> []
> []
> []
> []
> []
> []
> 
> So, that’d put the inconsistency in the cephfs_data pool.
> 
> Thank you for your help,
> 
> -kc
> 
>> On Nov 11, 2018, at 5:43 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
>> 
>> What does "rados list-inconsistent-obj <pg>" say?
>> 
>> Note that you may have to do a deep scrub to populate the output.
>> On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong <kcw...@verseon.com> wrote:
>>> 
>>> Hi folks,
>>> 
>>> I would appreciate any pointers on how to resolve a PG stuck in the
>>> “active+clean+inconsistent” state. This has resulted in HEALTH_ERR
>>> status for the last 5 days with no end in sight. The state was
>>> triggered when one of the drives backing the PG returned an I/O
>>> error. I’ve since replaced the failed drive.
>>> 
>>> I’m running Jewel (out of centos-release-ceph-jewel) on CentOS 7.
>>> I’ve tried “ceph pg repair <pg>” and it didn’t seem to do anything.
>>> I’ve also tried more drastic measures, such as comparing all the
>>> files under that PG’s _head directory (we’re on filestore) across
>>> all 3 copies and then nuking the outlier. Nothing worked.
>>> 
>>> Many thanks,
>>> 
>>> -kc
>>> 
>> 
>> 
>> 
>> --
>> Cheers,
>> Brad
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
