> On 24 October 2016 at 22:41, David Turner <david.tur...@storagecraft.com> wrote:
>
> More out of curiosity on this: our clusters occasionally leave behind
> /var/lib/ceph/osd/ceph-##/current/pg_temp folders. If you check all of the
> pg_temp folders for osd.10, you might find something that's holding onto
> the PG even if it has really moved on.
>
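A minimal sketch of that check, assuming the filestore layout David describes and that leftover per-PG temp collections show up under current/ as directories with "temp" in their name (the exact naming varies between releases). It is moot for osd.10 here, since its disk is dead as noted below, but on a reachable OSD it would look roughly like:

    # look for leftover temp directories in the OSD's filestore data dir
    find /var/lib/ceph/osd/ceph-10/current -maxdepth 1 -type d -iname '*temp*'
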
Thanks, but osd.10 is already down and out. The disk has been broken for a while now.

Wido

> ________________________________
> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of David Turner [david.tur...@storagecraft.com]
> Sent: Monday, October 24, 2016 2:24 PM
> To: Wido den Hollander; ceph-us...@ceph.com
> Subject: Re: [ceph-users] All PGs are active+clean, still remapped PGs
>
> Are you running a replica size of 4? If not, these might be errantly being reported as being on 10.
>
> ________________________________
> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Wido den Hollander [w...@42on.com]
> Sent: Monday, October 24, 2016 2:19 PM
> To: ceph-us...@ceph.com
> Subject: [ceph-users] All PGs are active+clean, still remapped PGs
>
> Hi,
>
> On a cluster running Hammer 0.94.9 (upgraded from Firefly) I have 29 remapped PGs according to the OSDMap, but all PGs are active+clean.
>
>     osdmap e111208: 171 osds: 166 up, 166 in; 29 remapped pgs
>      pgmap v101069070: 6144 pgs, 2 pools, 90122 GB data, 22787 kobjects
>            264 TB used, 184 TB / 448 TB avail
>                6144 active+clean
>
> The OSDMap shows:
>
> root@mon1:~# ceph osd dump|grep pg_temp
> pg_temp 4.39 [160,17,10,8]
> pg_temp 4.52 [161,16,10,11]
> pg_temp 4.8b [166,29,10,7]
> pg_temp 4.b1 [5,162,148,2]
> pg_temp 4.168 [95,59,6,2]
> pg_temp 4.1ef [22,162,10,5]
> pg_temp 4.2c9 [164,95,10,7]
> pg_temp 4.330 [165,154,10,8]
> pg_temp 4.353 [2,33,18,54]
> pg_temp 4.3f8 [88,67,10,7]
> pg_temp 4.41a [30,59,10,5]
> pg_temp 4.45f [47,156,21,2]
> pg_temp 4.486 [138,43,10,7]
> pg_temp 4.674 [59,18,7,2]
> pg_temp 4.7b8 [164,68,10,11]
> pg_temp 4.816 [167,147,57,2]
> pg_temp 4.829 [82,45,10,11]
> pg_temp 4.843 [141,34,10,6]
> pg_temp 4.862 [31,160,138,2]
> pg_temp 4.868 [78,67,10,5]
> pg_temp 4.9ca [150,68,10,8]
> pg_temp 4.a83 [156,83,10,7]
> pg_temp 4.a98 [161,94,10,7]
> pg_temp 4.b80 [162,88,10,8]
> pg_temp 4.d41 [163,52,10,6]
> pg_temp 4.d54 [149,140,10,7]
> pg_temp 4.e8e [164,78,10,8]
> pg_temp 4.f2a [150,68,10,6]
> pg_temp 4.ff3 [30,157,10,7]
> root@mon1:~#
>
> So I tried to restart osd.160 and osd.161, but that didn't change the state.
> root@mon1:~# ceph pg 4.39 query
> {
>     "state": "active+clean",
>     "snap_trimq": "[]",
>     "epoch": 111212,
>     "up": [
>         160,
>         17,
>         8
>     ],
>     "acting": [
>         160,
>         17,
>         8
>     ],
>     "actingbackfill": [
>         "8",
>         "17",
>         "160"
>     ],
>
> In all these PGs osd.10 is involved, but that OSD is down and out. I tried marking it as down again, but that didn't work.
>
> I haven't tried removing osd.10 from the CRUSHMap yet, since that will trigger a rather large rebalance.
>
> This cluster is still running with the Dumpling tunables though, so that might be the issue. But before I trigger a very large rebalance I wanted to check if there are any insights on this one.
>
> Thanks,
>
> Wido
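For reference, the stale pg_temp entries can be cross-checked against what the cluster currently reports for those PGs. A minimal sketch, assuming a bash shell on a monitor node and using ceph pg map in addition to the commands already shown above; any line surviving the final grep points at a mapping that may still reference osd.10:

    # print the current up/acting set for every PG that has a pg_temp entry
    ceph osd dump | awk '/^pg_temp/ {print $2}' | while read pg; do
        ceph pg map "$pg"
    done | grep -w 10

The follow-up steps mentioned above would, in rough outline, look like the commands below; note that both the tunables change and the CRUSH removal will trigger the large rebalance Wido wants to avoid for now:

    ceph osd crush show-tunables     # inspect the current (Dumpling-era) tunables
    ceph osd crush remove osd.10     # drop the dead OSD from the CRUSH map (rebalances)
    ceph osd rm 10                   # then remove it from the OSD map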