> On 27 June 2017 at 11:17, SCHAER Frederic <frederic.sch...@cea.fr> wrote:
> 
> 
> Hi,
> 
> Every now and then, sectors die on disks.
> When this happens on my bluestore (kraken) OSDs, I get 1 PG that becomes
> degraded.
> The exact status is:
> 
> 
> HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
> 
> pg 12.127 is active+clean+inconsistent, acting [141,67,85]
> 
> If I run # rados list-inconsistent-obj 12.127 --format=json-pretty
> I get:
> (...)
> 
>                     "osd": 112,
>                     "errors": [
>                         "read_error"
>                     ],
>                     "size": 4194304
> 
> When this happens, I'm forced to manually run "ceph pg repair" on the
> inconsistent PGs after making sure the cause was a read error: I feel this
> should not be a manual process.
> 
> If I go on the machine and look at the syslogs, I indeed see that a sector
> read error happened once or twice.
> But if I then try to read the sector manually, I can, presumably because the
> disk has reallocated it in the meantime.
> Last time this happened, I ran badblocks on the disk and it found no issue...
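> 
> For reference, this is roughly how I check by hand (/dev/sdX and the LBA
> taken from the kernel log are placeholders here, and bs=512 assumes 512-byte
> logical sectors):
> 
>     # any pending/reallocated sectors reported by the drive itself?
>     smartctl -A /dev/sdX | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector'
>     # try a direct read of the suspect sector, bypassing the page cache
>     dd if=/dev/sdX of=/dev/null bs=512 skip=<LBA> count=1 iflag=direct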
> 
> My questions therefore are:
> 
> - why doesn't bluestore retry reading the sector (in case of transient
>   errors)? (maybe it does)
> - why isn't the PG automatically fixed when a read error is detected?
> - what will happen when the disks get old and reach up to 2048 bad sectors
>   before the controllers/SMART declare them as "failure predicted"?
> 
> I can't imagine manually fixing up to N x 2048 PGs in an infrastructure of N
> disks, where N could reach the sky...
> 
> Ideas ?

Try the Luminous RC? A lot has changed since Kraken was released. Using 
Luminous might help you.

Wido

> 
> Thanks && regards
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com