Hi,

Hmm, could you try dumping the crush map, decompiling it, editing it to remove the DNE OSDs, recompiling it, and loading it back into Ceph?
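Something along these lines should do it (a rough sketch, untested here; the file names are just placeholders):

  # grab the current map and decompile it to text
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt

  # edit crushmap.txt and delete the device/bucket entries for the DNE osds,
  # then recompile it and inject it back into the cluster
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new

Keep crushmap.bin around so you can push the original map back with setcrushmap if anything looks wrong.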
http://docs.ceph.com/docs/master/rados/operations/crush-map/#get-a-crush-map

Thanks

On Thu, Dec 29, 2016 at 1:01 PM, Łukasz Chrustek <ski...@tlen.pl> wrote:
> Hi,
>
> ]# ceph osd tree
> ID        WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -7        16.89590 root ssd-disks
> -11              0     host ssd1
> 598798032        0         osd.598798032   DNE        0
> 21940            0         osd.21940       DNE        0
> 71               0         osd.71          DNE        0
>
> ]# ceph osd rm osd.598798032
> Error EINVAL: osd id 598798032 is too largeinvalid osd id-34
> ]# ceph osd rm osd.21940
> osd.21940 does not exist.
> ]# ceph osd rm osd.71
> osd.71 does not exist.
>
>> ceph osd rm osd.$ID
>
> On Thu, Dec 29, 2016 at 10:44 AM, Łukasz Chrustek <ski...@tlen.pl> wrote:
>> Hi,
>>
>> I was trying to delete 3 OSDs from the cluster; the deletion process took a
>> very long time and I interrupted it. The mon process then crashed, and in
>> ceph osd tree (after restarting ceph-mon) I saw:
>>
>> ~]# ceph osd tree
>> ID         WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -7         16.89590 root ssd-disks
>> -11               0     host ssd1
>> -231707408        0
>> 22100             0         osd.22100       DNE        0
>> 71                0         osd.71          DNE        0
>>
>> When I tried to delete osd.22100:
>>
>> [root@cc1 ~]# ceph osd crush remove osd.22100
>> device 'osd.22100' does not appear in the crush map
>>
>> Then I tried to delete osd.71 and the mon process crashed:
>>
>> [root@cc1 ~]# ceph osd crush remove osd.71
>> 2016-12-28 17:52:34.459668 7f426a862700 0 monclient: hunting for new mon
>>
>> After restarting ceph-mon, ceph osd tree shows:
>>
>> # ceph osd tree
>> ID        WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -7        16.89590 root ssd-disks
>> -11              0     host ssd1
>> 598798032        0         osd.598798032   DNE        0
>> 21940            0         osd.21940       DNE        0
>> 71               0         osd.71          DNE        0
>>
>> My question is: how can I delete these OSDs without editing the crush map
>> directly? It is a production system, I can't afford any service
>> interruption :(, and when I try ceph osd crush remove, ceph-mon crashes...
>>
>> I dumped the crush map, but it was 19 GB (!!) after decompiling (the
>> compiled file is very small). So I cleaned this file with perl (it took a
>> very long time), and I now have a small text crush map, which I edited.
>> But is there any chance that Ceph still remembers these huge OSD numbers
>> somewhere? Is it safe to apply this cleaned crush map to the cluster? The
>> cluster works OK now, but there is over 23 TB of production data which I
>> can't lose. Please advise what to do.
>>
>> --
>> Regards
>> Luk
>
> --
> Regards,
> Łukasz Chrustek
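One more thought on the "is it safe to apply this cleaned crush map" question above: before injecting the edited map you can sanity-check it with crushtool's test mode (again just a sketch, same placeholder file names as above):

  # recompile the cleaned text map and check that the rules still produce mappings
  crushtool -c crushmap.txt -o crushmap.new
  crushtool -i crushmap.new --test --show-statistics

If it compiles cleanly and the mapping statistics look sane, you at least know the rules still work before you setcrushmap it on the production cluster.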