Since the disk is failing and you have 2 other copies, I would take osd.0 down.
This means that Ceph will not attempt to read the bad disk, either to serve
clients or to make another copy of the data:

***** Not sure about the syntax of this for the version of ceph you are running
ceph osd down 0
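
You can confirm with "ceph osd tree", which lists each OSD along with its
up/down state:

ceph osd tree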

Then mark it “out”, which will immediately trigger recovery to create more
copies of the data on the remaining OSDs:
ceph osd out 0
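
You can watch the recovery progress with "ceph -w" and wait until all
placement groups report active+clean again:

ceph -w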

You can now finish the process of removing the OSD by following these
instructions:

http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#removing-osds-manual
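
Roughly, the steps on that page boil down to the following (check the page
for the exact commands for your version; the init script invocation may
differ on your distro):

sudo /etc/init.d/ceph stop osd.0   # stop the daemon on the OSD's host
ceph osd crush remove osd.0        # remove it from the CRUSH map
ceph auth del osd.0                # delete its authentication key
ceph osd rm 0                      # remove the OSD from the cluster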

David Zafman
Senior Developer
http://www.inktank.com

On Nov 12, 2013, at 3:16 AM, Mihály Árva-Tóth 
<mihaly.arva-t...@virtual-call-center.eu> wrote:

> Hello,
> 
> I have 3 nodes, with 3 OSDs in each node. I'm using the .rgw.buckets pool with 3 
> replicas. One of my HDDs (osd.0) has bad sectors; when I try to read an 
> object from the OSD directly, I get an Input/output error. dmesg:
> 
> [1214525.670065] mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), 
> sub_code(0x0000)
> [1214525.670072] mpt2sas0: log_info(0x31080000): originator(PL), code(0x08), 
> sub_code(0x0000)
> [1214525.670100] sd 0:0:2:0: [sdc] Unhandled sense code
> [1214525.670104] sd 0:0:2:0: [sdc]  
> [1214525.670107] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [1214525.670110] sd 0:0:2:0: [sdc]  
> [1214525.670112] Sense Key : Medium Error [current] 
> [1214525.670117] Info fld=0x60c8f21
> [1214525.670120] sd 0:0:2:0: [sdc]  
> [1214525.670123] Add. Sense: Unrecovered read error
> [1214525.670126] sd 0:0:2:0: [sdc] CDB: 
> [1214525.670128] Read(16): 88 00 00 00 00 00 06 0c 8f 20 00 00 00 08 00 00
> 
> Okay, I know I need to replace the HDD.
> 
> Fragment of ceph -s  output:
>   pgmap v922039: 856 pgs: 855 active+clean, 1 active+clean+inconsistent;
> 
> ceph pg dump | grep inconsistent
> 
> 11.15d  25443   0       0       0       6185091790      3001    3001    
> active+clean+inconsistent       2013-11-06 02:30:45.23416.....
> 
> ceph pg map 11.15d
> 
> osdmap e1600 pg 11.15d (11.15d) -> up [0,8,3] acting [0,8,3]
> 
> pg repair or deep-scrub cannot fix this issue. But if I understand 
> correctly, the OSD has to know that it cannot retrieve the object from osd.0 
> and needs to replicate it to another OSD, because there are no longer 3 
> working replicas.
> 
> Thank you,
> Mihaly
