Re: [ceph-users] Question about mark_unfound_lost on RGW metadata.

Craig Lewis Mon, 07 Apr 2014 18:49:23 -0700

The PG with the unfound object has been in the active+recovering+degraded state for much longer than usual. Most PGs spend about 20 minutes in that state, then complete. This one has been in active+recovering+degraded for about 4 hours now.

11.483 8851 1 8852 1 7974255906 3082 3082 active+recovering+degraded 2014-04-07 10:31:53.146930 13421'1242575 13855:1647415 [3,13] [3,13] 7936'1019031 2014-03-24 00:53:42.265828 7936'1019031 2014-03-24 00:53:42.265828
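For anyone reading that row without the header, here is a rough Python sketch that labels its fields. The column names are an assumption taken from the `ceph pg dump` header as I remember it for releases of this period (they are not shown in this message), and `parse_pg_dump_row` is a hypothetical helper, not part of Ceph.

```python
# Assumed column order of a `ceph pg dump` row from this era (not shown in the email).
COLUMNS = [
    "pg_stat", "objects", "mip", "degr", "unf", "bytes", "log", "disklog",
    "state", "state_stamp", "v", "reported", "up", "acting",
    "last_scrub", "scrub_stamp", "last_deep_scrub", "deep_scrub_stamp",
]
# Timestamp columns occupy two whitespace-separated tokens (date and time).
STAMP_COLUMNS = {"state_stamp", "scrub_stamp", "deep_scrub_stamp"}

ROW = (
    "11.483 8851 1 8852 1 7974255906 3082 3082 active+recovering+degraded "
    "2014-04-07 10:31:53.146930 13421'1242575 13855:1647415 [3,13] [3,13] "
    "7936'1019031 2014-03-24 00:53:42.265828 "
    "7936'1019031 2014-03-24 00:53:42.265828"
)


def parse_pg_dump_row(row):
    """Map the whitespace-separated tokens of one row onto the assumed columns."""
    tokens = row.split()
    fields = {}
    i = 0
    for name in COLUMNS:
        if name in STAMP_COLUMNS:
            fields[name] = tokens[i] + " " + tokens[i + 1]
            i += 2
        else:
            fields[name] = tokens[i]
            i += 1
    return fields


pg = parse_pg_dump_row(ROW)
print(pg["pg_stat"], pg["state"], "unfound:", pg["unf"], "acting:", pg["acting"])
```

Read this way, the row shows one unfound object (`unf`) and an up/acting set of [3,13] for PG 11.483.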

Is this because it can't find the unfound object? Or is this because I set the osd flags noout and nodown?

So far it's not a big deal. There's plenty of other backfilling and recovery that needs to happen. It just seems strange to me.



*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com <mailto:cle...@centraldesktop.com>

*Central Desktop. Work together in ways you never thought possible.*

Connect with us Website <http://www.centraldesktop.com/> | Twitter <http://www.twitter.com/centraldesktop> | Facebook <http://www.facebook.com/CentralDesktop> | LinkedIn <http://www.linkedin.com/groups?gid=147417> | Blog <http://cdblog.centraldesktop.com/>


On 4/7/14 14:38, Craig Lewis wrote:

Ceph is telling me that it can't find some data:
2014-04-07 11:15:09.901992 mon.0 [INF] pgmap v5436846: 2592 pgs: 2164 active+clean, 142 active+remapped+wait_backfill, 150 active+degraded+wait_backfill, 1 active+recovering+degraded, 2 active+degraded+backfilling, 133 active+degraded+remapped+wait_backfill; 15094 GB data, 28749 GB used, 30839 GB / 59588 GB avail; 3496837/37879443 objects degraded (9.231%); *1/18361235 unfound (0.000%)*; 25900 kB/s, 26 objects/s recovering
Querying all the PGs tells me that 11.483 has 1 missing object, named .dir.us-west-1.51941060.1.
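A pgmap line like the one above is dense, so a small Python sketch for pulling the numbers out of it may help. It is only an illustration: `summarize_pgmap` is a hypothetical helper, the regular expressions assume the exact wording of this one log line, and the emphasis asterisks have been dropped from the sample string.

```python
import re

PGMAP_LINE = (
    "pgmap v5436846: 2592 pgs: 2164 active+clean, "
    "142 active+remapped+wait_backfill, 150 active+degraded+wait_backfill, "
    "1 active+recovering+degraded, 2 active+degraded+backfilling, "
    "133 active+degraded+remapped+wait_backfill; 15094 GB data, "
    "28749 GB used, 30839 GB / 59588 GB avail; "
    "3496837/37879443 objects degraded (9.231%); "
    "1/18361235 unfound (0.000%); 25900 kB/s, 26 objects/s recovering"
)


def summarize_pgmap(line):
    """Return total PGs, per-state counts, and degraded/unfound figures."""
    # The state list sits between "pgs:" and the first semicolon.
    head = re.search(r"(\d+) pgs: ([^;]+);", line)
    states = {}
    for item in head.group(2).split(","):
        count, state = item.split()
        states[state] = int(count)
    degraded = re.search(r"(\d+)/(\d+) objects degraded", line)
    unfound = re.search(r"(\d+)/(\d+) unfound", line)
    return {
        "total_pgs": int(head.group(1)),
        "states": states,
        "degraded_pct": 100.0 * int(degraded.group(1)) / int(degraded.group(2)),
        "unfound": int(unfound.group(1)),
    }


summary = summarize_pgmap(PGMAP_LINE)
print("states add up:", sum(summary["states"].values()) == summary["total_pgs"])
print("degraded %:", round(summary["degraded_pct"], 3))
print("unfound objects:", summary["unfound"])
```

The six per-state counts add up to the 2592 PGs reported, and the single unfound object rounds to the 0.000% shown in the log.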
pg query says the recovery state is:
          "might_have_unfound": [
                { "osd": 11,
                  "status": "querying"},
                { "osd": 13,
                  "status": "already probed"}],
Active OSDs for this PG are [3,13], so osd.13 is the secondary for this PG. osd.11 does not have the data. I recently replaced osd.11, and this data was unfound before the drive swap. So it looks like I have no choice but to use mark_unfound_lost.
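As a cross-check before giving up on the object, the `might_have_unfound` list can be scanned for OSDs that have not finished being probed. The Python sketch below runs on a hand-built excerpt rather than live output: the surrounding `recovery_state` and `name` keys are my assumption about how `ceph pg 11.483 query` nests this list, and `unprobed_osds` is a hypothetical helper.

```python
import json

# Hand-built excerpt; only the might_have_unfound entries come from the email.
# The "recovery_state" / "name" wrapper is an assumed layout of `pg query` output.
PG_QUERY_EXCERPT = """
{ "recovery_state": [
    { "name": "Started/Primary/Active",
      "might_have_unfound": [
        { "osd": 11, "status": "querying"},
        { "osd": 13, "status": "already probed"}]}]}
"""


def unprobed_osds(pg_query):
    """List (osd, status) pairs whose status is anything but 'already probed'."""
    pending = []
    for state in pg_query.get("recovery_state", []):
        for peer in state.get("might_have_unfound", []):
            if peer["status"] != "already probed":
                pending.append((peer["osd"], peer["status"]))
    return pending


print(unprobed_osds(json.loads(PG_QUERY_EXCERPT)))
```

For the excerpt above this reports osd.11, still listed as "querying", as the only candidate location that has not been fully probed.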
I have some concerns though. Pool 11 is .rgw.buckets. I assume from the object's name that .dir.us-west-1 is related to replication. us-west-1 is the master zone, and these errors are occurring in the slave zone (us-central-1).
What are the risks of using ceph pg {pgid} mark_unfound_lost revert on that particular object? I'm comfortable losing objects in the slave; I can re-upload them to the master zone. I just want to make sure I'm not going to render the slave zone unusable.
Thanks for the help.





_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
