Hi,

One simple/quick question.
In my ceph cluster, I had a disk which was in predicted failure. It was so deep
into predicted failure that the ceph OSD daemon crashed.

After the OSD crashed, ceph moved the data correctly (or at least that's what I
thought), and ceph -s was reporting "HEALTH_OK".
Perfect.
I tried to tell ceph to mark the OSD down: it told me the OSD was already
down... fine.

Then I ran this:

ID=43
ceph osd down $ID
ceph auth del osd.$ID
ceph osd rm $ID
ceph osd crush remove osd.$ID

And immediately after this, ceph told me:
# ceph -s
    cluster 70ac4a78-46c0-45e6-8ff9-878b37f50fa1
     health HEALTH_WARN
            37 pgs backfilling
            3 pgs stuck unclean
            recovery 12086/355688 objects misplaced (3.398%)
     monmap e2: 3 mons at 
{ceph0=192.54.207.70:6789/0,ceph1=192.54.207.71:6789/0,ceph2=192.54.207.72:6789/0}
            election epoch 938, quorum 0,1,2 ceph0,ceph1,ceph2
     mdsmap e64: 1/1/1 up {0=ceph1=up:active}, 1 up:standby-replay, 1 up:standby
     osdmap e25455: 119 osds: 119 up, 119 in; 35 remapped pgs
      pgmap v5473702: 3212 pgs, 10 pools, 378 GB data, 97528 objects
            611 GB used, 206 TB / 207 TB avail
            12086/355688 objects misplaced (3.398%)
                3175 active+clean
                  37 active+remapped+backfilling
  client io 192 kB/s rd, 1352 kB/s wr, 117 op/s

Of course, I'm sure OSD 43 was the one that was down ;)
My question therefore is:

If ceph had already, successfully and automatically, migrated the data off the
down/out OSD, why is anything happening at all once I tell ceph to forget about
this OSD?
Was the cluster not really "HEALTH_OK" after all?
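
In case it's useful, this is roughly how I'm watching what is actually being
moved right now (again reconstructed commands, nothing exotic):

ceph health detail | grep -E 'backfilling|unclean'   # which PGs are affected
ceph pg dump_stuck unclean                           # stuck-unclean PGs with their up/acting sets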

(ceph-0.94.6-0.el7.x86_64 for now)

Thanks && regards
