Hello,
I've had a Luminous -> Mimic upgrade go very poorly, and my cluster is
stuck with almost all PGs down. One problem is that the mons have
started to re-elect a new quorum leader almost every minute. This makes
it difficult to monitor the cluster or even run commands against it,
since at least half the time a ceph command times out or takes over a
minute to return results. I've looked at the debug logs and there
appears to be a paxos-related timeout of about a minute. The
msg_dispatch thread of the mons is often running a core at 100% for
about a minute (user time, no iowait). Running strace on it shows the
process going through all of the mon DB files (about 6 GB in
store.db/*.sst). Does anyone have an idea of what this timeout is or why
my mons keep re-electing? One theory I have is that msg_dispatch can't
process the SSTs fast enough, hits some health-check timeout, and the
mon drops itself from the quorum because it thinks it isn't healthy.
I've been thinking of introducing a new mon to the cluster on hardware
with a faster CPU to see if it can get through the SSTs within this
timeout.
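In the meantime, two stopgaps I've been considering (I'm not certain
these are the right knobs, so corrections welcome): compacting the mon
stores to shrink that 6 GB of SSTs, and stretching the mon lease/accept
timeouts so the mons stop calling elections while msg_dispatch is still
churning through the store. Roughly:

    # compact a mon's store.db (I believe "mon compact on start = true"
    # does the same thing at startup)
    ceph tell mon.sephmon2 compact

    # ceph.conf [mon] section - the values are guesses on my part,
    # applied with a mon restart
    [mon]
    mon lease = 30
    mon lease ack timeout factor = 4
    mon accept timeout factor = 4

Does anyone know if those factors are the timeout behind the "accept
timeout, calling fresh election" lines in the log below?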
My cluster has the mons, MDSs, and MGRs plus 30 of 41 OSD servers on
Mimic, with the remaining 11 OSD servers still on Luminous. The original
problem started when I restarted the OSDs on one of the hosts. The
cluster reacted poorly to them going down and went into a frenzy of
taking down other OSDs and remapping. I eventually got that stable, and
the PGs were 90% good with the finish line in sight, when the mons
started their issue of re-electing every minute. Now I can't keep any
decent number of PGs up for more than a few hours. This started on
Wednesday...
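To try to keep the cluster from digging itself in deeper while the mons
flap, I've been thinking of pinning things with OSD flags along these
lines (again, not sure this is the right set):

    # stop the mons from marking OSDs out/down and pause data movement
    # while debugging
    ceph osd set noout
    ceph osd set nodown
    ceph osd set norebalance
    ceph osd set nobackfill
    ceph osd set norecover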
Any help would be greatly appreciated.
Thanks,
Kevin
-- Debug snippet from a mon at re-election time
2018-09-07 20:08:08.655 7f57b92cd700 20 mon.sephmon2@1(leader).mds e14242 maybe_resize_cluster in 1 max 1
2018-09-07 20:08:08.655 7f57b92cd700 4 mon.sephmon2@1(leader).mds e14242 tick: resetting beacon timeouts due to mon delay (slow election?) of 59.8106s seconds
2018-09-07 20:08:08.655 7f57b92cd700 10 mon.sephmon2@1(leader).paxosservice(mdsmap 13504..14242) maybe_trim trim_to 13742 would only trim 238 < paxos_service_trim_min 250
2018-09-07 20:08:08.655 7f57b92cd700 10 mon.sephmon2@1(leader).auth v120657 auth
2018-09-07 20:08:08.655 7f57b92cd700 10 mon.sephmon2@1(leader).auth v120657 check_rotate updated rotating
2018-09-07 20:08:08.655 7f57b92cd700 10 mon.sephmon2@1(leader).paxosservice(auth 120594..120657) propose_pending
2018-09-07 20:08:08.655 7f57b92cd700 10 mon.sephmon2@1(leader).auth v120657 encode_pending v 120658
2018-09-07 20:08:08.655 7f57b92cd700 5 mon.sephmon2@1(leader).paxos(paxos updating c 132917556..132918214) queue_pending_finisher 0x55dce8e5b370
2018-09-07 20:08:08.655 7f57b92cd700 10 mon.sephmon2@1(leader).paxos(paxos updating c 132917556..132918214) trigger_propose not active, will propose later
2018-09-07 20:08:08.655 7f57b92cd700 4 mon.sephmon2@1(leader).mgr e2234 tick: resetting beacon timeouts due to mon delay (slow election?) of 59.8844s seconds
2018-09-07 20:08:08.655 7f57b92cd700 10 mon.sephmon2@1(leader).paxosservice(mgr 1513..2234) maybe_trim trim_to 1734 would only trim 221 < paxos_service_trim_min 250
2018-09-07 20:08:08.655 7f57b92cd700 10 mon.sephmon2@1(leader).health tick
2018-09-07 20:08:08.655 7f57b92cd700 20 mon.sephmon2@1(leader).health check_member_health
2018-09-07 20:08:08.657 7f57bcdd0700 1 -- 10.1.9.202:6789/0 >> - conn(0x55dcee55be00 :6789 s=STATE_ACCEPTING pgs=0 cs=0 l=0)._process_connection sd=447 -
2018-09-07 20:08:08.657 7f57bcdd0700 10 mon.sephmon2@1(leader) e17 ms_verify_authorizer 10.1.9.32:6823/4007 osd protocol 0
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(leader).health check_member_health avail 79% total 40 GiB, used 8.4 GiB, avail 32 GiB
2018-09-07 20:08:08.662 7f57b92cd700 20 mon.sephmon2@1(leader).health check_leader_health
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(leader).paxosservice(health 1534..1720) maybe_trim trim_to 1715 would only trim 181 < paxos_service_trim_min 250
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(leader).config tick
2018-09-07 20:08:08.662 7f57b92cd700 20 mon.sephmon2@1(leader) e17 sync_trim_providers
2018-09-07 20:08:08.662 7f57b92cd700 -1 mon.sephmon2@1(leader) e17 get_health_metrics reporting 1940 slow ops, oldest is osd_failure(failed timeout osd.72 10.1.9.9:6800/68904 for 317sec e987498 v987498)
2018-09-07 20:08:08.662 7f57b92cd700 1 mon.sephmon2@1(leader).paxos(paxos updating c 132917556..132918214) accept timeout, calling fresh election
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(leader) e17 bootstrap
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(leader) e17 sync_reset_requester
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(leader) e17 unregister_cluster_logger
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(leader) e17 cancel_probe_timeout (none scheduled)
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(probing) e17 _reset
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(probing) e17 cancel_probe_timeout (none scheduled)
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(probing) e17 timecheck_finish
2018-09-07 20:08:08.662 7f57b92cd700 15 mon.sephmon2@1(probing) e17 health_tick_stop
2018-09-07 20:08:08.662 7f57b92cd700 15 mon.sephmon2@1(probing) e17 health_interval_stop
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(probing) e17 scrub_event_cancel
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(probing) e17 scrub_reset
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(probing).paxos(paxos updating c 132917556..132918214) restart -- canceling timeouts
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(probing).paxosservice(mdsmap 13504..14242) restart
2018-09-07 20:08:08.662 7f57b92cd700 10 mon.sephmon2@1(probing).paxosservice(osdmap 917471..987503) restart