> Trying google with "ceph pg stuck in active and remapped" points to a couple
> of posts on this ML, typically indicating that it's a problem with the CRUSH
> map and ceph being unable to satisfy the mapping rules. Your ceph -s output
> indicates that you're using replication of size 3 in your pools. You also said
> you had a custom CRUSH map - can you post it?
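In case it is useful to anyone else following this thread: as far as I know, the compiled CRUSH map can be exported and decompiled with commands along these lines (the file names here are only placeholders):

# ceph osd getcrushmap -o crushmap.bin
# crushtool -d crushmap.bin -o crushmap.txt

crushmap.txt is then the human-readable version of the map.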
I’ve sent the file to you, since I’m not sure if it contains sensitive data. Yes, I have replication of 3, and the custom map was not created by me.

> I might be missing something here but I don't quite see how you come to this
> statement. ceph osd df and ceph -s both show 16093 GB used and 39779 GB out
> of 55872 GB available. The sum of the first 3 OSDs used space is, as you
> stated, 6181 GB which is approx 38.4% so quite close to your target of 33%

Maybe I have to explain it another way: directly after the backfill finished, I received this output:

     health HEALTH_WARN
            4 pgs stuck unclean
            recovery 1698/58476648 objects degraded (0.003%)
            recovery 418137/58476648 objects misplaced (0.715%)
            noscrub,nodeep-scrub flag(s) set
     monmap e9: 5 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0,ceph5=192.168.60.11:6789/0}
            election epoch 464, quorum 0,1,2,3,4 ceph1,ceph2,ceph3,ceph4,ceph5
     osdmap e3086: 9 osds: 9 up, 9 in; 4 remapped pgs
            flags noscrub,nodeep-scrub
      pgmap v9928160: 320 pgs, 3 pools, 4809 GB data, 19035 kobjects
            16093 GB used, 39779 GB / 55872 GB avail
            1698/58476648 objects degraded (0.003%)
            418137/58476648 objects misplaced (0.715%)
                 316 active+clean
                   4 active+remapped
  client io 757 kB/s rd, 1 op/s

# ceph osd df
ID WEIGHT  REWEIGHT SIZE  USE   AVAIL %USE  VAR
 0 1.28899  1.00000 3724G 1924G 1799G 51.67 1.79
 1 1.57899  1.00000 3724G 2143G 1580G 57.57 2.00
 2 1.68900  1.00000 3724G 2114G 1609G 56.78 1.97
 3 6.78499  1.00000 7450G 1234G 6215G 16.57 0.58
 4 8.39999  1.00000 7450G 1221G 6228G 16.40 0.57
 5 9.51500  1.00000 7450G 1232G 6217G 16.54 0.57
 6 7.66499  1.00000 7450G 1258G 6191G 16.89 0.59
 7 9.75499  1.00000 7450G 2482G 4967G 33.33 1.16
 8 9.32999  1.00000 7450G 2480G 4969G 33.30 1.16
              TOTAL 55872G 16093G 39779G 28.80
MIN/MAX VAR: 0.57/2.00  STDDEV: 17.54

Here we can see that the cluster is using 4809 GB of data and has 16093 GB raw used, or put the other way, only 39779 GB available. Two days later I saw:

     health HEALTH_WARN
            4 pgs stuck unclean
            recovery 3486/58726035 objects degraded (0.006%)
            recovery 420024/58726035 objects misplaced (0.715%)
            noscrub,nodeep-scrub flag(s) set
     monmap e9: 5 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0,ceph5=192.168.60.11:6789/0}
            election epoch 478, quorum 0,1,2,3,4 ceph1,ceph2,ceph3,ceph4,ceph5
     osdmap e3114: 9 osds: 9 up, 9 in; 4 remapped pgs
            flags noscrub,nodeep-scrub
      pgmap v9969059: 320 pgs, 3 pools, 4830 GB data, 19116 kobjects
            15150 GB used, 40722 GB / 55872 GB avail
            3486/58726035 objects degraded (0.006%)
            420024/58726035 objects misplaced (0.715%)
                 316 active+clean
                   4 active+remapped

# ceph osd df
ID WEIGHT  REWEIGHT SIZE  USE   AVAIL %USE  VAR
 0 1.28899  1.00000 3724G 1696G 2027G 45.56 1.68
 1 1.57899  1.00000 3724G 1705G 2018G 45.80 1.69
 2 1.68900  1.00000 3724G 1794G 1929G 48.19 1.78
 3 6.78499  1.00000 7450G 1239G 6210G 16.64 0.61
 4 8.39999  1.00000 7450G 1226G 6223G 16.46 0.61
 5 9.51500  1.00000 7450G 1237G 6212G 16.61 0.61
 6 7.66499  1.00000 7450G 1263G 6186G 16.96 0.63
 7 9.75499  1.00000 7450G 2493G 4956G 33.47 1.23
 8 9.32999  1.00000 7450G 2491G 4958G 33.44 1.23
              TOTAL 55872G 15150G 40722G 27.12
MIN/MAX VAR: 0.61/1.78  STDDEV: 13.54

As you can see, we are now using 4830 GB of data BUT raw used is only 15150 GB, or put the other way, we now have 40722 GB free. You can see the change in the %USE of the OSDs. To me this looks like some data was lost, since ceph did not do any backfill or other operation. That's the problem...
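In case it helps with further diagnosis, I assume the four stuck pgs can be inspected with something like the following (pg 9.7 is one of the pgs listed by ceph health detail in the quoted message below; the other three would be queried the same way):

# ceph pg dump_stuck unclean
# ceph pg map 9.7
# ceph pg 9.7 query

Comparing the "up" and "acting" sets in that output should show where CRUSH wants the copies versus where they currently sit.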
> On 09.01.2017 at 21:55, Christian Wuerdig <christian.wuer...@gmail.com> wrote:
>
> On Tue, Jan 10, 2017 at 8:23 AM, Marcus Müller <mueller.mar...@posteo.de> wrote:
> Hi all,
>
> Recently I added a new node with new osds to my cluster, which of course
> resulted in backfilling. At the end, there are 4 pgs left in the state
> active+remapped and I don't know what to do.
>
> Here is how my cluster currently looks:
>
> ceph -s
>      health HEALTH_WARN
>             4 pgs stuck unclean
>             recovery 3586/58734009 objects degraded (0.006%)
>             recovery 420074/58734009 objects misplaced (0.715%)
>             noscrub,nodeep-scrub flag(s) set
>      monmap e9: 5 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0,ceph5=192.168.60.11:6789/0}
>             election epoch 478, quorum 0,1,2,3,4 ceph1,ceph2,ceph3,ceph4,ceph5
>      osdmap e3114: 9 osds: 9 up, 9 in; 4 remapped pgs
>             flags noscrub,nodeep-scrub
>       pgmap v9970276: 320 pgs, 3 pools, 4831 GB data, 19119 kobjects
>             15152 GB used, 40719 GB / 55872 GB avail
>             3586/58734009 objects degraded (0.006%)
>             420074/58734009 objects misplaced (0.715%)
>                  316 active+clean
>                    4 active+remapped
>   client io 643 kB/s rd, 7 op/s
>
> # ceph osd df
> ID WEIGHT  REWEIGHT SIZE  USE   AVAIL %USE  VAR
>  0 1.28899  1.00000 3724G 1697G 2027G 45.57 1.68
>  1 1.57899  1.00000 3724G 1706G 2018G 45.81 1.69
>  2 1.68900  1.00000 3724G 1794G 1929G 48.19 1.78
>  3 6.78499  1.00000 7450G 1240G 6209G 16.65 0.61
>  4 8.39999  1.00000 7450G 1226G 6223G 16.47 0.61
>  5 9.51500  1.00000 7450G 1237G 6212G 16.62 0.61
>  6 7.66499  1.00000 7450G 1264G 6186G 16.97 0.63
>  7 9.75499  1.00000 7450G 2494G 4955G 33.48 1.23
>  8 9.32999  1.00000 7450G 2491G 4958G 33.45 1.23
>               TOTAL 55872G 15152G 40719G 27.12
> MIN/MAX VAR: 0.61/1.78  STDDEV: 13.54
>
> # ceph health detail
> HEALTH_WARN 4 pgs stuck unclean; recovery 3586/58734015 objects degraded (0.006%); recovery 420074/58734015 objects misplaced (0.715%); noscrub,nodeep-scrub flag(s) set
> pg 9.7 is stuck unclean for 512936.160212, current state active+remapped, last acting [7,3,0]
> pg 7.84 is stuck unclean for 512623.894574, current state active+remapped, last acting [4,8,1]
> pg 8.1b is stuck unclean for 513164.616377, current state active+remapped, last acting [4,7,2]
> pg 7.7a is stuck unclean for 513162.316328, current state active+remapped, last acting [7,4,2]
> recovery 3586/58734015 objects degraded (0.006%)
> recovery 420074/58734015 objects misplaced (0.715%)
> noscrub,nodeep-scrub flag(s) set
>
> # ceph osd tree
> ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 56.00693 root default
> -2  1.28899     host ceph1
>  0  1.28899         osd.0       up  1.00000          1.00000
> -3  1.57899     host ceph2
>  1  1.57899         osd.1       up  1.00000          1.00000
> -4  1.68900     host ceph3
>  2  1.68900         osd.2       up  1.00000          1.00000
> -5 32.36497     host ceph4
>  3  6.78499         osd.3       up  1.00000          1.00000
>  4  8.39999         osd.4       up  1.00000          1.00000
>  5  9.51500         osd.5       up  1.00000          1.00000
>  6  7.66499         osd.6       up  1.00000          1.00000
> -6 19.08498     host ceph5
>  7  9.75499         osd.7       up  1.00000          1.00000
>  8  9.32999         osd.8       up  1.00000          1.00000
>
> I'm using a customized crushmap because, as you can see, this cluster is not
> very optimal. Ceph1, ceph2 and ceph3 are VMs on one physical host; ceph4 and
> ceph5 are both separate physical hosts.
> So the idea is to spread 33% of the data to ceph1, ceph2 and ceph3, and the
> other 66% to ceph4 and ceph5.
>
> Everything went fine with the backfilling, but now I see those 4 pgs stuck in
> active+remapped for 2 days, while the number of degraded objects increases.
>
> I restarted all osds one after another, but this did not really help. At
> first it showed no degraded objects, and then the number increased again.
>
> What can I do in order to get those pgs back to the active+clean state? My
> idea was to increase the weight of an osd a little bit in order to make ceph
> recalculate the map - is this a good idea?
>
> Trying google with "ceph pg stuck in active and remapped" points to a couple
> of posts on this ML, typically indicating that it's a problem with the CRUSH
> map and ceph being unable to satisfy the mapping rules. Your ceph -s output
> indicates that you're using replication of size 3 in your pools. You also said
> you had a custom CRUSH map - can you post it?
>
> ---
>
> On the other side I saw something very strange too: After the backfill was
> done (2 days ago), my ceph osd df looked like this:
>
> # ceph osd df
> ID WEIGHT  REWEIGHT SIZE  USE   AVAIL %USE  VAR
>  0 1.28899  1.00000 3724G 1924G 1799G 51.67 1.79
>  1 1.57899  1.00000 3724G 2143G 1580G 57.57 2.00
>  2 1.68900  1.00000 3724G 2114G 1609G 56.78 1.97
>  3 6.78499  1.00000 7450G 1234G 6215G 16.57 0.58
>  4 8.39999  1.00000 7450G 1221G 6228G 16.40 0.57
>  5 9.51500  1.00000 7450G 1232G 6217G 16.54 0.57
>  6 7.66499  1.00000 7450G 1258G 6191G 16.89 0.59
>  7 9.75499  1.00000 7450G 2482G 4967G 33.33 1.16
>  8 9.32999  1.00000 7450G 2480G 4969G 33.30 1.16
>               TOTAL 55872G 16093G 39779G 28.80
> MIN/MAX VAR: 0.57/2.00  STDDEV: 17.54
>
> While ceph -s was:
>
>      health HEALTH_WARN
>             4 pgs stuck unclean
>             recovery 1698/58476648 objects degraded (0.003%)
>             recovery 418137/58476648 objects misplaced (0.715%)
>             noscrub,nodeep-scrub flag(s) set
>      monmap e9: 5 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0,ceph5=192.168.60.11:6789/0}
>             election epoch 464, quorum 0,1,2,3,4 ceph1,ceph2,ceph3,ceph4,ceph5
>      osdmap e3086: 9 osds: 9 up, 9 in; 4 remapped pgs
>             flags noscrub,nodeep-scrub
>       pgmap v9928160: 320 pgs, 3 pools, 4809 GB data, 19035 kobjects
>             16093 GB used, 39779 GB / 55872 GB avail
>             1698/58476648 objects degraded (0.003%)
>             418137/58476648 objects misplaced (0.715%)
>                  316 active+clean
>                    4 active+remapped
>   client io 757 kB/s rd, 1 op/s
>
> As you can see above, my ceph osd df looks completely different -> this shows
> that the first three osds lost data (about 1 TB) without any backfill going
> on. If I add up the usage of osd0, osd1 and osd2, it was 6181 GB. But there
> should only be around 33% on them, so this would be wrong.
>
> I might be missing something here but I don't quite see how you come to this
> statement. ceph osd df and ceph -s both show 16093 GB used and 39779 GB out
> of 55872 GB available. The sum of the first 3 OSDs' used space is, as you
> stated, 6181 GB which is approx 38.4%, so quite close to your target of 33%.
>
> My question on this is: Is this a bug and did I really lose important data,
> or is this a ceph cleanup action after the backfill?
>
> Thanks and regards,
> Marcus
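PS: Regarding the reweight idea mentioned in my quoted message above: I assume the change would be done with something like the following (osd id and weight are only example values based on my current tree, not a recommendation):

# ceph osd crush reweight osd.5 9.6

As far as I understand, this changes the CRUSH weight and therefore triggers a recalculation of the placement, as opposed to ceph osd reweight, which only sets the temporary override weight.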
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com