Hi Frank

> I'm not sure if my hypothesis can be correct. Ceph sends an acknowledgement of a 
> write only after all copies are on disk. In other words, if PGs end up on 
> different versions after a power outage, one always needs to roll back. Since 
> you have two healthy OSDs in the PG and the PG is active (successfully 
> peered), it might just be a broken disk and read/write errors. I would focus 
> on that.

I tried to revert the PG as follows:
# ceph pg 3.b query | grep version
        "last_user_version": 2263481,
        "version": "4825'2264303",
        "last_user_version": 2263481,
        "version": "4825'2264301",
        "last_user_version": 2263481,
        "version": "4825'2264301",
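
To make the difference between the three entries explicit, here is a minimal Python sketch that compares the version strings from the output above. It assumes the strings have the form epoch'version, and the helper name parse_pg_version is only for illustration. The grep output does not show which OSD each entry belongs to, so the sketch only reports how far each entry lags behind the newest one.

```python
def parse_pg_version(text):
    """Split an "epoch'version" string such as "4825'2264303" into two ints."""
    epoch, version = text.split("'")
    return int(epoch), int(version)


# Values copied from the `ceph pg 3.b query | grep version` output above.
reported = ["4825'2264303", "4825'2264301", "4825'2264301"]

parsed = [parse_pg_version(v) for v in reported]

# Tuples compare by epoch first, then by version counter.
newest = max(parsed)

# Lag in version counter relative to the newest entry (None if the epoch differs).
lag = [newest[1] - v if e == newest[0] else None for e, v in parsed]
print(lag)  # [0, 2, 2]
```

All three entries share epoch 4825, and two of them are two versions behind the third.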

# ceph pg 3.b list_unfound
{
    "num_missing": 0,
    "num_unfound": 0,
    "objects": [],
    "more": false
}

# ceph pg 3.b mark_unfound_lost revert
pg has no unfound objects

# ceph pg 3.b revert
Invalid command: revert not in query
pg <pgid> query :  show details of a specific pg
Error EINVAL: invalid command

How do I revert/roll back a PG?

> Another question, do you have write caches enabled (disk cache and controller 
> cache)? This is known to cause problems on power outages and also degraded 
> performance with ceph. You should check and disable any caches if necessary.

No. The HDD is connected directly to the motherboard.
Thank you
Sagara

  
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
