[ceph-users] Re: ceph mons stuck in electing state

2019-09-03 Thread huang jun
Can you set debug_mon=20, debug_paxos=20, and debug_ms=1 on all mons and get the logs? Ashley Merrick wrote on Tue, Sep 3, 2019 at 9:35 PM: > > What change did you make in ceph.conf? > > I'd check that hasn't caused an issue first. > > > On Tue, 27 Aug 2019 04:37:15 +0800 nkern...@gmail.com wrote > > Hello,
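
For reference, one way to raise those debug levels is through each mon's local admin socket, which keeps working while the mons are out of quorum; the mon.$(hostname -s) daemon name is only a typical default, so adjust it to your deployment (setting the options in ceph.conf and restarting the mons works as well):

    # run on each mon host; the admin socket answers even without quorum
    ceph daemon mon.$(hostname -s) config set debug_mon 20
    ceph daemon mon.$(hostname -s) config set debug_paxos 20
    ceph daemon mon.$(hostname -s) config set debug_ms 1
    # alternatively, add to the [mon] section of ceph.conf and restart the mons:
    #   debug mon = 20
    #   debug paxos = 20
    #   debug ms = 1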

[ceph-users] Re: PG is stuck in remapped and degraded

2019-10-08 Thread huang jun
Seems like you hit this: https://tracker.ceph.com/issues/41190 展荣臻(信泰) wrote on Tue, Oct 8, 2019 at 10:26 AM: > > >If the journal is no longer readable: the safe variant is to > >completely re-create the OSDs after replacing the journal disk. (The > >unsafe way to go is to just skip the --flush-journal part, not >
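
For context, the --flush-journal step being quoted belongs to the FileStore journal-replacement procedure; a rough sketch of that sequence (assuming the old journal is still readable; with an unreadable journal the quoted advice is to re-create the OSD instead) is:

    systemctl stop ceph-osd@<id>
    ceph-osd -i <id> --flush-journal   # write any pending journal entries to the data store
    # ... replace the journal device and update the OSD's journal symlink/partition ...
    ceph-osd -i <id> --mkjournal       # initialize the new journal
    systemctl start ceph-osd@<id>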

[ceph-users] Re: Fwd: HeartbeatMap FAILED assert(0 == "hit suicide timeout")

2019-10-09 Thread huang jun
If you have a coredump file, you should check why the thread took so long to finish its work. 潘东元 wrote on Thu, Oct 10, 2019 at 10:51 AM: > > hi all, > my osd hit the suicide timeout. > some log: > 2019-10-10 03:53:13.017760 7f1ab886e700 0 -- 192.168.1.5:6810/1028846 > >> 192.168.1.25:6802/24020795 p
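
For what it's worth, a common way to inspect such a coredump is with gdb, assuming the matching ceph-osd binary and its debug symbols are installed (the paths here are illustrative):

    gdb /usr/bin/ceph-osd /path/to/core
    (gdb) info threads           # find the thread that blew past the heartbeat grace
    (gdb) thread apply all bt    # backtraces of every thread, to see where it was stuck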

[ceph-users] Re: ceph mon failed to start

2019-10-22 Thread huang jun
Try this: https://docs.ceph.com/docs/mimic/man/8/ceph-kvstore-tool/ and use the 'repair' operation. 徐蕴 wrote on Tue, Oct 22, 2019 at 3:51 PM: > > Hi, > > Our cluster had an unexpected power outage. The Ceph mon cannot start after that. > The log shows: > > Running command: '/usr/bin/ceph-mon -f -i 10.10.198.11 --pu
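
As a rough sketch, the invocation pattern for that tool looks like the following; the exact repair subcommand name ('repair' vs. 'destructive-repair'), the backend (rocksdb vs. leveldb), and the mon store path all depend on the release and deployment, so confirm them against the linked man page first:

    systemctl stop ceph-mon@<id>
    ceph-kvstore-tool rocksdb /var/lib/ceph/mon/ceph-<id>/store.db repair
    systemctl start ceph-mon@<id>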

[ceph-users] Re: EC PGs stuck activating, 2^31-1 as OSD ID, automatic recovery not kicking in

2019-11-24 Thread huang jun
How many PGs are in the pools? It may be that CRUSH cannot find enough proper OSDs. You can check the choose_total_tries parameter in the CRUSH tunables and try increasing it like this: ceph osd getcrushmap -o crush crushtool -d crush -o crush.txt sed -i 's/tunable choose_total_tries 50/tunable choose_total_tries
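
The end of that recipe is truncated in the archive; the usual round-trip it is sketching looks like the following, where the new value of 100 is only an illustrative target (pick what fits your cluster):

    ceph osd getcrushmap -o crush
    crushtool -d crush -o crush.txt
    # raise the tunable; 100 here is an example value
    sed -i 's/tunable choose_total_tries 50/tunable choose_total_tries 100/' crush.txt
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new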