Roeland,

We're seeing the same problems in our cluster. I can't offer you a solution that gets the OSD back, but I can tell you what I did to work around it.
We're running five 0.94.6 clusters with 9 nodes / 648 HDD OSDs and a k=7, m=2 erasure-coded .rgw.buckets pool. During the backfilling after a recent disk replacement, we had four OSDs get into a very similar state:

2016-08-09 07:40:12.475699 7f025b06b700 -1 osd/ECBackend.cc: In function 'void ECBackend::handle_recovery_push(PushOp&, RecoveryMessages*)' thread 7f025b06b700 time 2016-08-09 07:40:12.472819
osd/ECBackend.cc: 281: FAILED assert(op.attrset.count(string("_")))
 ceph version 0.94.6-2 (f870be457b16e4ff56ced74ed3a3c9a4c781f281)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xba997b]
 2: (ECBackend::handle_recovery_push(PushOp&, RecoveryMessages*)+0xd7f) [0xa239ff]
 3: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x1de) [0xa2600e]
 4: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x167) [0x8305e7]
 5: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3bd) [0x6a157d]
 6: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x338) [0x6a1aa8]
 7: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x85f) [0xb994cf]
 8: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xb9b5f0]
 9: (()+0x8184) [0x7f0284e35184]
 10: (clone()+0x6d) [0x7f028324c37d]

To allow the cluster to recover, we ended up reweighting the OSDs that got into this state to 0 (ceph osd crush reweight <osd-id> 0). That of course kicks off a long round of backfilling, but the cluster does eventually recover. We've never found a way to get the OSD healthy again that doesn't involve nuking the underlying disk and starting over; a rough sketch of the sequence we run is at the bottom of this mail.

We've had 10 OSDs get into this state across 2 clusters in the last few months, and the failure/crash message is always the same. If someone does know of a way to recover the OSD, that would be great.

I hope this helps.

Brian Felton

On Wed, Aug 10, 2016 at 10:17 AM, Roeland Mertens <roeland.mert...@genomicsplc.com> wrote:
> Hi,
>
> We run a Ceph 10.2.1 cluster across 35 nodes with a total of 595 OSDs. We have a mixture of normally replicated volumes and EC volumes using the following erasure-code-profile:
>
> # ceph osd erasure-code-profile get rsk8m5
> jerasure-per-chunk-alignment=false
> k=8
> m=5
> plugin=jerasure
> ruleset-failure-domain=host
> ruleset-root=default
> technique=reed_sol_van
> w=8
>
> We had a disk failure, and during the swap-out we seem to have hit a bug where OSDs crash during recovery while trying to fix certain pgs that may have been corrupted. For example:
>
>     -3> 2016-08-10 12:38:21.302938 7f893e2d7700  5 -- op tracker -- seq: 3434, time: 2016-08-10 12:38:21.302938, event: queued_for_pg, op: MOSDECSubOpReadReply(63.1a18s0 47661 ECSubReadReply(tid=1, attrs_read=0))
>     -2> 2016-08-10 12:38:21.302981 7f89bef50700  1 -- 10.93.105.11:6831/2674119 --> 10.93.105.22:6802/357033 -- osd_map(47662..47663 src has 32224..47663) v3 -- ?+0 0x559c1057f3c0 con 0x559c0664a700
>     -1> 2016-08-10 12:38:21.302996 7f89bef50700  5 -- op tracker -- seq: 3434, time: 2016-08-10 12:38:21.302996, event: reached_pg, op: MOSDECSubOpReadReply(63.1a18s0 47661 ECSubReadReply(tid=1, attrs_read=0))
>      0> 2016-08-10 12:38:21.306193 7f89bef50700 -1 osd/ECBackend.cc: In function 'virtual void OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)' thread 7f89bef50700 time 2016-08-10 12:38:21.303012
> osd/ECBackend.cc: 203: FAILED assert(res.errors.empty())
>
> then the ceph-osd daemon goes splat.
> I've attached an extract of a logfile showing a bit more.
>
> Does anyone have any ideas? I'm now stuck with a pg in down+remapped+peering. ceph pg query tells me that peering is blocked due to the loss of an OSD, though restarting that OSD just results in another crash of the ceph-osd daemon. We tried to force a rebuild by using ceph-objectstore-tool to delete the pg segment on some of the OSDs that are crashing, but that didn't help one iota.
>
> Any help would be greatly appreciated.
>
> regards,
>
> Roeland
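For completeness, here is roughly the sequence we run when an OSD gets into this state. It is only a sketch, not a recipe: osd.123 is a placeholder id, the service command depends on your init system, and you want backfilling to finish before removing anything.

# stop ceph-osd id=123
  (upstart syntax on our hammer nodes; "systemctl stop ceph-osd@123" on systemd hosts)
# ceph osd crush reweight osd.123 0
# ceph -w
  (watch the backfilling until the cluster reports healthy again)

Once backfilling has finished, we remove the OSD entirely with the standard manual-removal steps and re-create it on a wiped disk:

# ceph osd out 123
# ceph osd crush remove osd.123
# ceph auth del osd.123
# ceph osd rm 123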
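Roeland, on your down+remapped+peering pg: I don't have a fix, but before deleting any more shards it may be worth confirming exactly which OSD peering is blocked on, and taking an export of the shard with ceph-objectstore-tool so you can import it back later if needed. Again only a sketch: pgid 63.1a18s0, osd id 42 and the filestore paths below are placeholders, and the OSD daemon has to be stopped before ceph-objectstore-tool will open its store.

# ceph pg 63.1a18 query | grep -A10 blocked
  (the recovery_state section shows which OSD peering is blocked on)
# systemctl stop ceph-osd@42
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-42 \
      --journal-path /var/lib/ceph/osd/ceph-42/journal \
      --op export --pgid 63.1a18s0 --file /root/63.1a18s0.export

The same tool has --op import (alongside the --op remove it sounds like you already used), so an exported shard can be pushed back into an OSD if it turns out you still need it.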