Hi,

Hmm, could you try dumping the CRUSH map, decompiling it, modifying it to
remove the DNE OSDs, recompiling it, and loading it back into Ceph?

http://docs.ceph.com/docs/master/rados/operations/crush-map/#get-a-crush-map
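
Roughly, the sequence would be something like this (untested on my side, so
please double-check against the doc above before injecting anything, and keep
a copy of the original map; the file names are just placeholders):

    ceph osd getcrushmap -o crushmap.bin                  # grab the current compiled map
    crushtool -d crushmap.bin -o crushmap.txt             # decompile it to text
    # edit crushmap.txt and drop the bogus device/bucket entries
    crushtool -c crushmap.txt -o crushmap.new             # recompile the edited map
    crushtool -i crushmap.new --test --show-statistics    # sanity-check the mappings first
    ceph osd setcrushmap -i crushmap.new                  # load it back into the cluster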

Thanks

On Thu, Dec 29, 2016 at 1:01 PM, Łukasz Chrustek <ski...@tlen.pl> wrote:

> Hi,
>
> ]# ceph osd tree
> ID        WEIGHT    TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
>        -7  16.89590 root ssd-disks
>       -11         0     host ssd1
> 598798032         0         osd.598798032     DNE        0
>     21940         0         osd.21940         DNE        0
>        71         0         osd.71            DNE        0
>
> ]# ceph osd rm osd.598798032
> Error EINVAL: osd id 598798032 is too largeinvalid osd id-34
> ]# ceph osd rm osd.21940
> osd.21940 does not exist.
> ]# ceph osd rm osd.71
> osd.71 does not exist.
>
> > ceph osd rm osd.$ID
>
> > On Thu, Dec 29, 2016 at 10:44 AM, Łukasz Chrustek <ski...@tlen.pl> wrote:
>
> > Hi,
>
> >  I was trying to delete 3 OSDs from the cluster; the deletion process took a
> >  very long time and I interrupted it. The mon process then crashed, and in
> >  ceph osd tree (after restarting ceph-mon) I saw:
>
> >   ~]# ceph osd tree
> >  ID         WEIGHT    TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >          -7  16.89590 root ssd-disks
> >         -11         0     host ssd1
> >  -231707408         0
> >       22100         0         osd.22100        DNE        0
> >          71         0         osd.71           DNE        0
>
>
> >  when I tried to delete osd.22100:
>
> >  [root@cc1 ~]# ceph osd crush remove osd.22100
> >  device 'osd.22100' does not appear in the crush map
>
> >  then I tried to delete osd.71 and the mon process crashed:
>
> >  [root@cc1 ~]# ceph osd crush remove osd.71
> >  2016-12-28 17:52:34.459668 7f426a862700  0 monclient: hunting for new mon
>
> >  after restarting ceph-mon, ceph osd tree shows:
>
> >  # ceph osd tree
> >  ID        WEIGHT    TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >         -7  16.89590 root ssd-disks
> >        -11         0     host ssd1
> >  598798032         0         osd.598798032     DNE        0
> >      21940         0         osd.21940         DNE        0
> >         71         0         osd.71            DNE        0
>
> >  My question is: how can I delete these OSDs without directly editing the
> >  crushmap? It is a production system, I can't afford any service
> >  interruption :(, and when I try ceph osd crush remove, ceph-mon crashes....
>
> >  I dumped the crushmap, but it took 19G (!!) after decompiling (the
> >  compiled file is very small). So I cleaned this file with perl (it took a
> >  very long time), and I now have a small text crushmap, which I edited. But
> >  is there any chance that ceph will still remember these huge numbers for
> >  osds somewhere? Is it safe to apply this cleaned crushmap to the cluster?
> >  The cluster now works OK, but there is over 23TB of production data which
> >  I can't lose. Please advise what to do.
>
>
> >  --
> >  Regards
> >  Luk
>
>
>
>
>
>
> --
> Regards,
>  Łukasz Chrustek
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
