Hi,

We have a single inconsistent placement group. I triggered a deep scrub and 
then tried a 'ceph pg repair', but the placement group remains in an 
inconsistent state.
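
For reference, the commands I ran were simply:

  ceph pg deep-scrub 1.35
  ceph pg repair 1.35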

How do I discard the objects for this placement group on only the one OSD and 
get Ceph to essentially rewrite the data from the healthy copies? My limited 
understanding is that a drive will only remap a problematic sector when it is 
overwritten, or when repeated reads of the failed sector eventually succeed.
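
What I had in mind is something along the lines of the sketch below, i.e. using 
ceph-objectstore-tool to drop only the unreadable shard from the bad OSD so that 
a subsequent repair rewrites it from the two good copies. I'm not certain this 
is the right or safe approach, hence the question:

  # stop the OSD holding the unreadable shard (osd.51 per the scrub output below)
  systemctl stop ceph-osd@51

  # remove only the affected object from that OSD's local object store
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-51 \
      --pgid 1.35 rbd_data.746f3c94fb3a42.000000000001e48d remove

  # bring the OSD back up and let repair re-replicate the shard from the good copies
  systemctl start ceph-osd@51
  ceph pg repair 1.35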

There was nothing useful in the 'ceph pg 1.35 query' output that I could 
decipher. I then ran 'ceph pg deep-scrub 1.35', after which 
'rados list-inconsistent-obj 1.35' indicates a read error on one of the copies:
{"epoch":25776,"inconsistents":[{"object":{"name":"rbd_data.746f3c94fb3a42.000000000001e48d","nspace":"","locator":"","snap":"head","version":34866184},"errors":[],"union_shard_errors":["read_error"],"selected_object_info":{"oid":{"oid":"rbd_data.746f3c94fb3a42.000000000001e48d","key":"","snapid":-2,"hash":3814100149,"max":0,"pool":1,"namespace":""},"version":"22845'1781037","prior_version":"22641'1771494","last_reqid":"client.136837683.0:124047","user_version":34866184,"size":4194304,"mtime":"2020-03-08
 17:59:00.159846","local_mtime":"2020-03-08 
17:59:00.159670","lost":0,"flags":["dirty","data_digest","omap_digest"],"truncate_seq":0,"truncate_size":0,"data_digest":"0x031cb17c","omap_digest":"0xffffffff","expected_object_size":4194304,"expected_write_size":4194304,"alloc_hint_flags":0,"manifest":{"type":0},"watchers":{}},"shards":[{"osd":51,"primary":false,"errors":["read_error"],"size":4194304},{"osd":60,"primary":false,"errors":[],"size":4194304,"omap_digest":"0xffffffff","data_dig
 
est":"0x031cb17c"},{"osd":82,"primary":true,"errors":[],"size":4194304,"omap_digest":"0xffffffff","data_digest":"0x031cb17c"}]}]}

/var/log/syslog:
Mar 30 08:40:40 kvm1e kernel: [74792.229021] ata2.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x0
Mar 30 08:40:40 kvm1e kernel: [74792.230416] ata2.00: irq_stat 0x40000008
Mar 30 08:40:40 kvm1e kernel: [74792.231715] ata2.00: failed command: READ FPDMA QUEUED
Mar 30 08:40:40 kvm1e kernel: [74792.233071] ata2.00: cmd 60/00:08:00:7a:50/04:00:c9:00:00/40 tag 1 ncq dma 524288 in
Mar 30 08:40:40 kvm1e kernel: [74792.233071]          res 43/40:00:10:7b:50/00:04:c9:00:00/00 Emask 0x409 (media error) <F>
Mar 30 08:40:40 kvm1e kernel: [74792.235736] ata2.00: status: { DRDY SENSE ERR }
Mar 30 08:40:40 kvm1e kernel: [74792.237045] ata2.00: error: { UNC }
Mar 30 08:40:40 kvm1e ceph-osd[450777]: 2020-03-30 08:40:40.240 7f48a41f3700 -1 bluestore(/var/lib/ceph/osd/ceph-51) _do_read bdev-read failed: (5) Input/output error
Mar 30 08:40:40 kvm1e kernel: [74792.244914] ata2.00: configured for UDMA/133
Mar 30 08:40:40 kvm1e kernel: [74792.244938] sd 1:0:0:0: [sdb] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mar 30 08:40:40 kvm1e kernel: [74792.244942] sd 1:0:0:0: [sdb] tag#1 Sense Key : Medium Error [current]
Mar 30 08:40:40 kvm1e kernel: [74792.244945] sd 1:0:0:0: [sdb] tag#1 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 30 08:40:40 kvm1e kernel: [74792.244949] sd 1:0:0:0: [sdb] tag#1 CDB: Read(16) 88 00 00 00 00 00 c9 50 7a 00 00 00 04 00 00 00
Mar 30 08:40:40 kvm1e kernel: [74792.244953] blk_update_request: I/O error, dev sdb, sector 3377494800 op 0x0:(READ) flags 0x0 phys_seg 94 prio class 0
Mar 30 08:40:40 kvm1e kernel: [74792.246238] ata2: EH complete
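
To confirm this really is an unrecovered sector on the physical disk, I was 
planning to cross-check with smartmontools and hdparm, e.g.:

  # SMART attributes - look for pending / uncorrectable sector counts
  smartctl -a /dev/sdb | grep -i -e pending -e uncorrect

  # try reading the exact sector reported by blk_update_request above
  hdparm --read-sector 3377494800 /dev/sdb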


Regards
David Herselman