On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk <n...@fisk.me.uk> wrote:
>
> > That doesn't look like an RBD object -- any idea who is
> > "client.34720596.1:212637720"?
>
> So I think these might be proxy ops from the cache tier, as there are
> also blocked ops on one of the cache tier OSDs, but this time it
> actually lists the object name. Blocked op on a cache tier OSD:
>
>     "description": "osd_op(client.34720596.1:212637720 17.ae78c1cf
> 17:f3831e75:::rbd_data.15a5e20238e1f29.00000000000388ad:head
> [set-alloc-hint object_size 4194304 write_size 4194304,write
> 2584576~16384] snapc 0=[] RETRY=2
> ondisk+retry+write+known_if_redirected e104841)",
>     "initiated_at": "2017-12-12 16:25:32.435718",
>     "age": 13996.681147,
>     "duration": 13996.681203,
>     "type_data": {
>         "flag_point": "reached pg",
>         "client_info": {
>             "client": "client.34720596",
>             "client_addr": "10.3.31.41:0/2600619462",
>             "tid": 212637720
>
> I'm a bit baffled at the moment about what's going on. The pg query
> (attached) is not showing in the main status that it has been blocked
> from peering or that there are any missing objects. I've tried
> restarting all OSDs I can see relating to the PG in case they needed a
> bit of a nudge.
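For reference, op dumps like the one above come from the OSD admin
socket, and the object can be mapped back to its PG from any node with
client access. A sketch of the relevant commands (osd.21 is a
placeholder for one of the cache-tier OSDs, and the cache pool name
will vary per cluster):

    # Currently blocked/in-flight ops on a given OSD.
    ceph daemon osd.21 dump_ops_in_flight

    # Recently completed slow ops, with their full event timelines.
    ceph daemon osd.21 dump_historic_ops

    # Map the named object to its PG and up/acting OSD sets.
    ceph osd map <cache-pool> rbd_data.15a5e20238e1f29.00000000000388ad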
Did that fix anything? I don't see anything immediately obvious, but I'm
not practiced in quickly reading that pg state output. What's the output
of "ceph -s"?

> > On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk <n...@fisk.me.uk> wrote:
> > > Does anyone know what this object (0.ae78c1cf) might be? It's not
> > > your normal run-of-the-mill RBD object and I can't seem to find it
> > > in the pool using rados --all ls. It seems to be leaving the 0.1cf
> > > PG stuck in an activating+remapped state and blocking IO. Pool 0 is
> > > just a pure RBD pool with a cache tier above it. There is no
> > > current mention of unfound objects or any other obvious issues.
> > >
> > > There is some backfilling going on, on another OSD which was
> > > upgraded to bluestore, which was when the issue started. But I
> > > can't see any link in the PG dump with the upgraded OSD. My only
> > > thought so far is to wait for this backfilling to finish and then
> > > deep-scrub this PG and see if that reveals anything?
> > >
> > > Thanks,
> > > Nick
> > >
> > >     "description": "osd_op(client.34720596.1:212637720 0.1cf
> > > 0.ae78c1cf (undecoded)
> > > ondisk+retry+write+ignore_cache+ignore_overlay+known_if_redirected
> > > e105014)",
> > >     "initiated_at": "2017-12-12 17:10:50.030660",
> > >     "age": 335.948290,
> > >     "duration": 335.948383,
> > >     "type_data": {
> > >         "flag_point": "delayed",
> > >         "events": [
> > >             {
> > >                 "time": "2017-12-12 17:10:50.030660",
> > >                 "event": "initiated"
> > >             },
> > >             {
> > >                 "time": "2017-12-12 17:10:50.030692",
> > >                 "event": "queued_for_pg"
> > >             },
> > >             {
> > >                 "time": "2017-12-12 17:10:50.030719",
> > >                 "event": "reached_pg"
> > >             },
> > >             {
> > >                 "time": "2017-12-12 17:10:50.030727",
> > >                 "event": "waiting for peered"
> > >             },
> > >             {
> > >                 "time": "2017-12-12 17:10:50.197353",
> > >                 "event": "reached_pg"
> > >             },
> > >             {
> > >                 "time": "2017-12-12 17:10:50.197355",
> > >                 "event": "waiting for peered"
> >
> > --
> > Jason
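For completeness, the checks discussed in the thread map onto these
commands (pool and PG ids taken from the messages above; a sketch of
the diagnostic steps, not a prescribed fix):

    # Overall cluster state, as requested above.
    ceph -s
    ceph health detail

    # Full peering/recovery detail for the stuck PG.
    ceph pg 0.1cf query

    # Once backfilling has finished, deep-scrub the PG as Nick proposes.
    ceph pg deep-scrub 0.1cf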