Hi Greg,

Thanks for your findings! We've updated the issue with the log files of osd.93 and osd.69 that correspond to the period of the log we posted. We've also generated a fresh set of logs for that pair of OSDs. As we explain in the issue, the OSDs now fail on that other assert you mentioned, and sadly we can't reproduce the initial crash anymore.
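In case it is useful to anyone following along, this is roughly how we captured the new set of logs (a sketch; the OSD ids are ours and the log paths assume a default Ubuntu install):

    # raise logging at runtime on a running OSD
    ceph tell osd.93 injectargs '--debug_osd 20'
    ceph tell osd.69 injectargs '--debug_osd 20'
    # (for an OSD that does not stay up, putting "debug osd = 20" under [osd]
    #  in ceph.conf before starting it has the same effect)

    # ... reproduce the crash, then restore the default level ...
    ceph tell osd.93 injectargs '--debug_osd 1/5'
    ceph tell osd.69 injectargs '--debug_osd 1/5'

    # upload the logs; ceph-post-file prints the reference id that goes in the ticket
    ceph-post-file /var/log/ceph/ceph-osd.93.log
    ceph-post-file /var/log/ceph/ceph-osd.69.log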
Greetings,
Ana

On 19/10/17 01:14, Gregory Farnum wrote:
> I updated the ticket with some findings. It appears that osd.93 has
> that snapshot object in its missing set that gets sent to osd.78, and
> then osd.69 claims to have the object. Can you upload debug logs of
> those OSDs that go along with this log? (Or just generate a new set of
> them together.)
> -Greg
>
> On Wed, Oct 18, 2017 at 9:16 AM Mart van Santen <m...@greenhost.nl> wrote:
>
> Dear all,
>
> We are still struggling with this issue. By now, one OSD crashes
> all the time (a different one than yesterday), but now on a different
> assert, namely this one:
>
> #0  0x00007ffff5464428 in __GI_raise (sig=sig@entry=6)
>     at ../sysdeps/unix/sysv/linux/raise.c:54
> #1  0x00007ffff546602a in __GI_abort () at abort.c:89
> #2  0x0000555555ff157e in ceph::__ceph_assert_fail (assertion=assertion@entry=0x555556564e97 "head_obc",
>     file=file@entry=0x555556566bd8 "/build/ceph-12.2.1/src/osd/PrimaryLogPG.cc", line=line@entry=10369,
>     func=func@entry=0x55555656d940 <PrimaryLogPG::recover_missing(hobject_t const&, eversion_t, int,
>     PGBackend::RecoveryHandle*)::__PRETTY_FUNCTION__> "int PrimaryLogPG::recover_missing(const hobject_t&,
>     eversion_t, int, PGBackend::RecoveryHandle*)") at /build/ceph-12.2.1/src/common/assert.cc:66
> #3  0x0000555555b833e9 in PrimaryLogPG::recover_missing (this=this@entry=0x555562aca000, soid=..., v=...,
>     priority=<optimized out>, h=h@entry=0x555567a3a080) at /build/ceph-12.2.1/src/osd/PrimaryLogPG.cc:10369
> #4  0x0000555555bc3fd0 in PrimaryLogPG::recover_primary (this=this@entry=0x555562aca000, max=max@entry=1,
>     handle=...) at /build/ceph-12.2.1/src/osd/PrimaryLogPG.cc:11588
> #5  0x0000555555bcc81e in PrimaryLogPG::start_recovery_ops (this=0x555562aca000, max=1, handle=...,
>     ops_started=0x7fffd8b1ac68) at /build/ceph-12.2.1/src/osd/PrimaryLogPG.cc:11339
> #6  0x0000555555a20a59 in OSD::do_recovery (this=0x55555f95a000, pg=0x555562aca000, queued=384619,
>     reserved_pushes=1, handle=...) at /build/ceph-12.2.1/src/osd/OSD.cc:9381
> #7  0x0000555555c94be9 in PGQueueable::RunVis::operator() (this=this@entry=0x7fffd8b1af00, op=...)
>     at /build/ceph-12.2.1/src/osd/PGQueueable.cc:34
> #8  0x0000555555a226c4 in boost::detail::variant::invoke_visitor<PGQueueable::RunVis>::internal_visit<PGRecovery>
>     (operand=..., this=<synthetic pointer>)
>     at /build/ceph-12.2.1/obj-x86_64-linux-gnu/boost/include/boost/variant/variant.hpp:1046
> #9  boost::detail::variant::visitation_impl_invoke_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>,
>     void*, PGRecovery> (storage=0x7fffd8b1af50, visitor=<synthetic pointer>)
>     at /build/ceph-12.2.1/obj-x86_64-linux-gnu/boost/include/boost/variant/detail/visitation_impl.hpp:114
> #10 boost::detail::variant::visitation_impl_invoke<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>,
>     void*, PGRecovery, boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub,
>     PGRecovery>::has_fallback_type_> (t=0x0, storage=0x7fffd8b1af50, visitor=<synthetic pointer>,
>     internal_which=<optimized out>)
>     at /build/ceph-12.2.1/obj-x86_64-linux-gnu/boost/include/boost/variant/detail/visitation_impl.hpp:157
> #11 boost::detail::variant::visitation_impl<mpl_::int_<0>,
>     boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<4l>,
>     boost::intrusive_ptr<OpRequest>, boost::mpl::l_item<mpl_::long_<3l>, PGSnapTrim,
>     boost::mpl::l_item<mpl_::long_<2l>, PGScrub, boost::mpl::l_item<mpl_::long_<1l>, PGRecovery,
>     boost::mpl::l_end> > > > >, boost::mpl::l_iter<boost::mpl::l_end> >,
>     boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*,
>     boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery>::has_fallback_type_>
>     (no_backup_flag=..., storage=0x7fffd8b1af50, visitor=<synthetic pointer>, logical_which=<optimized out>,
>     internal_which=<optimized out>)
>     at /build/ceph-12.2.1/obj-x86_64-linux-gnu/boost/include/boost/variant/detail/visitation_impl.hpp:238
> #12 boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub,
>     PGRecovery>::internal_apply_visitor_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis>, void*>
>     (storage=0x7fffd8b1af50, visitor=<synthetic pointer>, logical_which=<optimized out>,
>     internal_which=<optimized out>)
>     at /build/ceph-12.2.1/obj-x86_64-linux-gnu/boost/include/boost/variant/variant.hpp:2389
> #13 boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub,
>     PGRecovery>::internal_apply_visitor<boost::detail::variant::invoke_visitor<PGQueueable::RunVis> >
>     (visitor=<synthetic pointer>, this=0x7fffd8b1af48)
>     at /build/ceph-12.2.1/obj-x86_64-linux-gnu/boost/include/boost/variant/variant.hpp:2400
> #14 boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub,
>     PGRecovery>::apply_visitor<PGQueueable::RunVis> (visitor=..., this=0x7fffd8b1af48)
>     at /build/ceph-12.2.1/obj-x86_64-linux-gnu/boost/include/boost/variant/variant.hpp:2423
> #15 boost::apply_visitor<PGQueueable::RunVis, boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim,
>     PGScrub, PGRecovery> > (visitable=..., visitor=...)
>     at /build/ceph-12.2.1/obj-x86_64-linux-gnu/boost/include/boost/variant/detail/apply_visitor_unary.hpp:70
> #16 PGQueueable::run (handle=..., pg=..., osd=<optimized out>, this=0x7fffd8b1af48)
>     at /build/ceph-12.2.1/src/osd/PGQueueable.h:140
> #17 OSD::ShardedOpWQ::_process (this=0x55555f95b5d8, thread_index=<optimized out>, hb=0x555567a1e1e0)
>     at /build/ceph-12.2.1/src/osd/OSD.cc:10327
> #18 0x0000555555ff61e4 in ShardedThreadPool::shardedthreadpool_worker (this=0x55555f95ab88, thread_index=4)
>     at /build/ceph-12.2.1/src/common/WorkQueue.cc:339
> #19 0x0000555555ff9220 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>)
>     at /build/ceph-12.2.1/src/common/WorkQueue.h:689
> #20 0x00007ffff64bf6ba in start_thread (arg=0x7fffd8b1d700) at pthread_create.c:333
> #21 0x00007ffff55363dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>
> We are not gdb experts, but when we request the arguments of the third frame, it tells us:
>
> #3  0x0000555555b833e9 in PrimaryLogPG::recover_missing (this=this@entry=0x555562aca000, soid=..., v=...,
>     priority=<optimized out>, h=h@entry=0x555567a3a080) at /build/ceph-12.2.1/src/osd/PrimaryLogPG.cc:10369
> 10369   /build/ceph-12.2.1/src/osd/PrimaryLogPG.cc: No such file or directory.
> (gdb) info args
> this = 0x555562aca000
> soid = @0x7fffd8b1a6c0: {oid = {name = "rbd_data.30732d3238f3535.", '0' <repeats 12 times>, "f7e3"},
>   snap = {val = 244802}, hash = 3610026748, max = false, nibblewise_key_cache = 3475686013,
>   hash_reverse_bits = 1061500139, static POOL_META = -1, static POOL_TEMP_START = -2, pool = 2,
>   nspace = "", key = ""}
> v = <optimized out>
> priority = <optimized out>
> h = 0x555567a3a080
> (gdb) Quit
>
> So it points to rbd_data.30732d3238f3535.000000000000f7e3.
>
> This is weird, as we cannot find any reference to
> "rbd_data.30732d3238f3535.000000000000f7e3": this rbd prefix does not exist on our system
> (when running rbd ls and requesting info on the rbds, 30732d3238f3535 is not found), nor can
> we find any object on the filesystem that looks to belong to this rbd image. It could of
> course be an old rbd image that has been deleted by now, but we do not understand why the
> OSD is still trying to access or repair it. It seems this repair is stuck in some queue.
>
> This is happening in the same PG as the problem we found yesterday.
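>
> For reference, this is roughly how we searched for the prefix (a sketch; the pool name "rbd"
> and the osd.93 filestore path below are placeholders for our actual pools and OSD data
> directories):
>
>     # does any existing image still use this block_name_prefix?
>     for img in $(rbd -p rbd ls); do
>         rbd -p rbd info "$img" | grep -q '30732d3238f3535' && echo "$img"
>     done
>
>     # are there any rados objects left with that prefix?
>     rados -p rbd ls | grep 'rbd_data.30732d3238f3535'
>
>     # any leftover files for PG 2.2fc on the filestore OSD itself?
>     find /var/lib/ceph/osd/ceph-93/current/2.2fc_head -name '*30732d3238f3535*'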
>
> All advice is welcome, as we still have one PG down.
>
> best,
>
> mart
>
> On 10/18/2017 12:39 PM, Ana Aviles wrote:
>>
>> Hello,
>>
>> We created BUG #21827 and updated it with the log file of the OSD at debug 20. The reference
>> is 6e4dba6f-2c15-4920-b591-fe380bbca200
>>
>> Thanks,
>> Ana
>>
>> On 18/10/17 00:46, Mart van Santen wrote:
>>>
>>> Hi Greg,
>>>
>>> (I'm a colleague of Ana.) Thank you for your reply.
>>>
>>> On 10/17/2017 11:57 PM, Gregory Farnum wrote:
>>>>
>>>> On Tue, Oct 17, 2017 at 9:51 AM Ana Aviles <a...@greenhost.nl> wrote:
>>>>
>>>> Hello all,
>>>>
>>>> We had an inconsistent PG on our cluster. While performing a PG repair
>>>> operation the OSD crashed. The OSD was not able to start again,
>>>> and there was no hardware failure on the disk itself. This is the log output:
>>>>
>>>> 2017-10-17 17:48:55.771384 7f234930d700 -1 log_channel(cluster) log
>>>> [ERR] : 2.2fc repair 1 missing, 0 inconsistent objects
>>>> 2017-10-17 17:48:55.771417 7f234930d700 -1 log_channel(cluster) log
>>>> [ERR] : 2.2fc repair 3 errors, 1 fixed
>>>> 2017-10-17 17:48:56.047896 7f234930d700 -1
>>>> /build/ceph-12.2.1/src/osd/PrimaryLogPG.cc: In function 'virtual void
>>>> PrimaryLogPG::on_local_recover(const hobject_t&, const ObjectRecoveryInfo&,
>>>> ObjectContextRef, bool, ObjectStore::Transaction*)'
>>>> thread 7f234930d700 time 2017-10-17 17:48:55.924115
>>>> /build/ceph-12.2.1/src/osd/PrimaryLogPG.cc: 358: FAILED assert(p !=
>>>> recovery_info.ss.clone_snaps.end())
>>>>
>>>> Hmm. The OSD got a push op containing a snapshot it doesn't think should exist. I also
>>>> see that there's a comment "// hmm, should we warn?" on that assert.
>>>
>>> We also caught these log entries, which indeed point to a clone/snapshot problem:
>>>
>>>  -9877> 2017-10-17 17:46:16.044077 7f234db16700 10 log_client will send 2017-10-17 17:46:13.367842
>>>        osd.78 osd.78 [XXXX:XXXX:XXXX:XXXX::203]:6880/9116 483 : cluster [ERR] 2.2fc shard 78 missing
>>>        2:3f72b543:::rbd_data.332d5a836bcc485.000000000000fcf6:466a7
>>>  -9876> 2017-10-17 17:46:16.044105 7f234db16700 10 log_client will send 2017-10-17 17:46:13.368026
>>>        osd.78 osd.78 [XXXX:XXXX:XXXX:XXXX::203]:6880/9116 484 : cluster [ERR] repair 2.2fc
>>>        2:3f72b543:::rbd_data.332d5a836bcc485.000000000000fcf6:466a7 is an unexpected clone
>>>  -9868> 2017-10-17 17:46:16.324112 7f2354b24700 10 log_client logged 2017-10-17 17:46:13.367842
>>>        osd.78 osd.78 [XXXX:XXXX:XXXX:XXXX::203]:6880/9116 483 : cluster [ERR] 2.2fc shard 78 missing
>>>        2:3f72b543:::rbd_data.332d5a836bcc485.000000000000fcf6:466a7
>>>  -9867> 2017-10-17 17:46:16.324128 7f2354b24700 10 log_client logged 2017-10-17 17:46:13.368026
>>>        osd.78 osd.78 [XXXX:XXXX:XXXX:XXXX::203]:6880/9116 484 : cluster [ERR] repair 2.2fc
>>>        2:3f72b543:::rbd_data.332d5a836bcc485.000000000000fcf6:466a7 is an unexpected clone
>>>    -36> 2017-10-17 17:48:55.771384 7f234930d700 -1 log_channel(cluster) log [ERR] : 2.2fc repair
>>>        1 missing, 0 inconsistent objects
>>>    -35> 2017-10-17 17:48:55.771417 7f234930d700 -1 log_channel(cluster) log [ERR] : 2.2fc repair
>>>        3 errors, 1 fixed
>>>     -4> 2017-10-17 17:48:56.046071 7f234db16700 10 log_client will send 2017-10-17 17:48:55.771390
>>>        osd.78 osd.78 [XXXX:XXXX:XXXX:XXXX::203]:6880/9116 485 : cluster [ERR] 2.2fc repair 1 missing,
>>>        0 inconsistent objects
>>>     -3> 2017-10-17 17:48:56.046088 7f234db16700 10 log_client will send 2017-10-17 17:48:55.771419
>>>        osd.78 osd.78 [XXXX:XXXX:XXXX:XXXX::203]:6880/9116 486 : cluster [ERR] 2.2fc repair 3 errors,
>>>        1 fixed
>>>
>>>> Can you take a full log with "debug osd = 20" set, post it with ceph-post-file, and create a
>>>> ticket on tracker.ceph.com?
>>>
>>> We will submit the ticket tomorrow (we are in CEST); we want more pairs of eyes on it when we
>>> start the OSD again.
>>>
>>> After this crash we marked the OSD as out. The cluster rebalanced itself, but unfortunately the
>>> same issue appeared on another OSD (same PG). After several crashes that OSD came back up, but
>>> now with one PG down. I assume the cluster decided it had 'finished' the ceph pg repair command
>>> and removed the 'repair' state, leaving a broken PG behind.
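>>>
>>> If it helps, we can also attach the output of the usual read-only inspection commands for the
>>> ticket (a sketch of what we would run, assuming the stock luminous ceph/rados CLI; nothing here
>>> modifies data):
>>>
>>>     # why is the PG down, and what does the primary think is missing?
>>>     ceph pg 2.2fc query
>>>     ceph pg 2.2fc list_missing
>>>
>>>     # what exactly did scrub flag as inconsistent?
>>>     rados list-inconsistent-obj 2.2fc --format=json-pretty
>>>
>>>     # overall cluster state
>>>     ceph health detail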
>>>
>>> If you have any hints on how we can get the PG online again, we would be very grateful, so we
>>> can work on that tomorrow.
>>>
>>> Thanks,
>>>
>>> Mart
>>>
>>> Some general info about this cluster:
>>>
>>> - all OSDs run the same version; the monitors are also on 12.2.1 (Ubuntu xenial)
>>> - the cluster is a backup cluster with size 2 and min_size 1, so only two copies
>>> - the cluster was recently upgraded from jewel to luminous (3 weeks ago)
>>> - the cluster was recently switched from straw to straw2 (1 week ago)
>>> - it was in HEALTH_OK until this happened
>>> - we use filestore only
>>> - the cluster was originally installed with hammer and upgraded to infernalis, jewel and now luminous
>>>
>>> Health (noup/noout set on purpose while we try to recover):
>>>
>>> $ ceph -s
>>>   cluster:
>>>     id:     xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx
>>>     health: HEALTH_WARN
>>>             noup,noout flag(s) set
>>>             Reduced data availability: 1 pg inactive, 1 pg down
>>>             Degraded data redundancy: 2892/31621143 objects degraded (0.009%), 2 pgs unclean,
>>>             1 pg degraded, 1 pg undersized
>>>
>>>   services:
>>>     mon: 3 daemons, quorum ds2-mon1,ds2-mon2,ds2-mon3
>>>     mgr: ds2-mon1(active)
>>>     osd: 93 osds: 92 up, 92 in; 1 remapped pgs
>>>          flags noup,noout
>>>     rgw: 1 daemon active
>>>
>>>   data:
>>>     pools:   13 pools, 1488 pgs
>>>     objects: 15255k objects, 43485 GB
>>>     usage:   119 TB used, 126 TB / 245 TB avail
>>>     pgs:     0.067% pgs not active
>>>              2892/31621143 objects degraded (0.009%)
>>>              1483 active+clean
>>>              2    active+clean+scrubbing+deep
>>>              1    active+undersized+degraded+remapped+backfilling
>>>              1    active+clean+scrubbing
>>>              1    down
>>>
>>>   io:
>>>     client:   340 B/s rd, 14995 B/s wr, 1 op/s rd, 2 op/s wr
>>>     recovery: 9567 kB/s, 2 objects/s
>>>
>>> $ ceph health detail
>>> HEALTH_WARN noup,noout flag(s) set; Reduced data availability: 1 pg inactive, 1 pg down;
>>>     Degraded data redundancy: 2774/31621143 objects degraded (0.009%), 2 pgs unclean,
>>>     1 pg degraded, 1 pg undersized
>>> OSDMAP_FLAGS noup,noout flag(s) set
>>> PG_AVAILABILITY Reduced data availability: 1 pg inactive, 1 pg down
>>>     pg 2.2fc is down, acting [69,93]
>>> PG_DEGRADED Degraded data redundancy: 2774/31621143 objects degraded (0.009%), 2 pgs unclean,
>>>     1 pg degraded, 1 pg undersized
>>>     pg 2.1e9 is stuck undersized for 23741.295159, current state
>>>     active+undersized+degraded+remapped+backfilling, last acting [41]
>>>     pg 2.2fc is stuck unclean since forever, current state down, last acting [69,93]
>>>
>>>> Are all your OSDs running that same version?
>>>> -Greg
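>>>
>>> (All of ours report 12.2.1, as noted above. For anyone wanting to verify this cluster-wide,
>>> something like the following works; `ceph versions` needs luminous or newer:)
>>>
>>>     ceph versions            # summary of daemon versions across the cluster
>>>     ceph tell osd.* version  # per-OSD check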
>>>>
>>>> ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>>>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x56236c8ff3f2]
>>>>  2: (PrimaryLogPG::on_local_recover(hobject_t const&, ObjectRecoveryInfo const&,
>>>>     std::shared_ptr<ObjectContext>, bool, ObjectStore::Transaction*)+0xd63) [0x56236c476213]
>>>>  3: (ReplicatedBackend::handle_pull_response(pg_shard_t, PushOp const&, PullOp*,
>>>>     std::__cxx11::list<ReplicatedBackend::pull_complete_info,
>>>>     std::allocator<ReplicatedBackend::pull_complete_info> >*, ObjectStore::Transaction*)+0x693)
>>>>     [0x56236c60d4d3]
>>>>  4: (ReplicatedBackend::_do_pull_response(boost::intrusive_ptr<OpRequest>)+0x2b5) [0x56236c60dd75]
>>>>  5: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x20c) [0x56236c61196c]
>>>>  6: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x56236c521aa0]
>>>>  7: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x55d)
>>>>     [0x56236c48662d]
>>>>  8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>,
>>>>     ThreadPool::TPHandle&)+0x3a9) [0x56236c3091a9]
>>>>  9: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x56236c5a2ae7]
>>>>  10: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x130e) [0x56236c3307de]
>>>>  11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) [0x56236c9041e4]
>>>>  12: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x56236c907220]
>>>>  13: (()+0x76ba) [0x7f2366be96ba]
>>>>  14: (clone()+0x6d) [0x7f2365c603dd]
>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>>
>>>> Thanks!
>>>>
>>>> Ana
>
> --
> Mart van Santen
> Greenhost
> E: m...@greenhost.nl
> T: +31 20 4890444
> W: https://greenhost.nl
>
> A PGP signature can be attached to this e-mail,
> you need PGP software to verify it.
> My public key is available in keyserver(s)
>
> PGP Fingerprint: CA85 EB11 2B70 042D AF66 B29A 6437 01A1 10A3 D3A5
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com