Oh, and 2.1.4??? Unless you're on SLES10, please update to a recent Pacemaker version. Not that this will solve this particular problem, but you'll be happier with the result.
On Thu, Dec 9, 2010 at 3:16 PM, Bart Pousson <[email protected]> wrote:
> Hi,
>
> I have a system with two nodes that had been running heartbeat for a
> while -- Linux HA 2.1.4. One of the heartbeat processes went to 100%
> CPU usage and stayed there, with the following logs seen:
>
> heartbeat[17464]: 2010/11/21_03:04:07 info: Gmain_timeout_dispatch:
> started at 3846010832 should have started at 3845570140
> heartbeat[17464]: 2010/11/21_03:04:08 WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request took too long to execute: 400
> ms (> 10 ms) (GSource: 0x18254030)
>
> I tried to shutdown using /etc/init.d/heartbeat stop -- the shutdown
> hung and ever since then the only way to stop the heartbeat processes is
> by doing a kill (or killall).
>
> When the heartbeat processes are started, only the first few processes
> come up -- heartbeat never fully initializes. The following processes
> never come up:
>
> /usr/lib/heartbeat/ccm
> /usr/lib/heartbeat/cib
> /usr/lib/heartbeat/lrmd -r
> /usr/lib/heartbeat/stonithd
> /usr/lib/heartbeat/attrd
> /usr/lib/heartbeat/crmd
> /usr/lib/heartbeat/mgmtd -v
> /usr/lib/heartbeat/cibmon -d
>
> These logs are now seen every time a start is attempted:
>
> heartbeat[12339]: 2010/12/08_16:20:23 ERROR: Message hist queue is
> filling up (500 messages in queue)
> heartbeat[12339]: 2010/12/08_16:20:23 ERROR: Message hist queue is
> filling up (500 messages in queue)
> heartbeat[12339]: 2010/12/08_16:20:23 ERROR: Message hist queue is
> filling up (500 messages in queue)
>
> So, I've gotten heartbeat into a state where it will not start up all
> the processes, and when trying to stop it hangs. I'm not sure what else
> to look at. Has anyone seen this kind of behavior before?
>
> Thanks,
> Bart
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
