Oh, and 2.1.4???
Unless you're on SLES10, please update to a recent Pacemaker version.
Not that this will solve this particular problem, but you'll be
happier with the result.
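
In the meantime, to see after each start attempt exactly which of the
children you listed are missing, something like the rough sketch below
might help. The paths are copied from your list; the function names are
made up for illustration, and it assumes a Linux procps "ps" and
Python 3.7 or later.

```python
import subprocess

# Child processes that never came up, copied from the list in the quoted message.
EXPECTED = [
    "/usr/lib/heartbeat/ccm",
    "/usr/lib/heartbeat/cib",
    "/usr/lib/heartbeat/lrmd",
    "/usr/lib/heartbeat/stonithd",
    "/usr/lib/heartbeat/attrd",
    "/usr/lib/heartbeat/crmd",
    "/usr/lib/heartbeat/mgmtd",
    "/usr/lib/heartbeat/cibmon",
]


def running_command_lines():
    """Return one command line per running process (assumes procps ps)."""
    result = subprocess.run(
        ["ps", "-eo", "args="],  # "args=" prints the command line with no header
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def missing_processes(command_lines, expected=EXPECTED):
    """Return the expected executables that appear in none of the command lines."""
    running = set()
    for line in command_lines:
        parts = line.split()
        if parts:
            # Compare only the executable path, ignoring flags such as "-r".
            running.add(parts[0])
    return [exe for exe in expected if exe not in running]
```

Calling missing_processes(running_command_lines()) on each node gives
the list of children that did not start.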

On Thu, Dec 9, 2010 at 3:16 PM, Bart Pousson
<[email protected]> wrote:
> Hi,
>
> I have a system with two nodes that had been running heartbeat for a
> while -- Linux HA 2.1.4.  One of the heartbeat processes went to 100%
> CPU usage and stayed there, with the following logs seen:
>
> heartbeat[17464]: 2010/11/21_03:04:07 info: Gmain_timeout_dispatch:
> started at 3846010832 should have started at 3845570140
> heartbeat[17464]: 2010/11/21_03:04:08 WARN: Gmain_timeout_dispatch:
> Dispatch function for retransmit request took too long to execute: 400
> ms (> 10 ms) (GSource: 0x18254030)
>
> I tried to shut down using /etc/init.d/heartbeat stop -- the shutdown
> hung, and ever since then the only way to stop the heartbeat processes
> is to kill them (kill or killall).
>
> When the heartbeat processes are started, only the first few processes
> come up -- heartbeat never fully initializes. The following processes
> never come up:
>
>    /usr/lib/heartbeat/ccm
>    /usr/lib/heartbeat/cib
>    /usr/lib/heartbeat/lrmd -r
>    /usr/lib/heartbeat/stonithd
>    /usr/lib/heartbeat/attrd
>    /usr/lib/heartbeat/crmd
>    /usr/lib/heartbeat/mgmtd -v
>    /usr/lib/heartbeat/cibmon -d
>
> These logs are now seen every time a start is attempted:
>
> heartbeat[12339]: 2010/12/08_16:20:23 ERROR: Message hist queue is
> filling up (500 messages in queue)
> heartbeat[12339]: 2010/12/08_16:20:23 ERROR: Message hist queue is
> filling up (500 messages in queue)
> heartbeat[12339]: 2010/12/08_16:20:23 ERROR: Message hist queue is
> filling up (500 messages in queue)
>
> So, I've gotten heartbeat into a state where it will not start all of
> its processes, and it hangs when I try to stop it.  I'm not sure what
> else to look at.  Has anyone seen this kind of behavior before?
>
> Thanks,
> Bart
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
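
On the Gmain_timeout_dispatch "started at ... should have started at ..."
line in the first log excerpt: the gap between the two numbers shows how
late the timer fired. Below is a small sketch for pulling that gap out
of a log. The function name is hypothetical, and since the message does
not state the unit of the two values, the result is left in whatever
units heartbeat logged.

```python
import re

# Matches the two counters in a Gmain_timeout_dispatch "started at" entry.
_PATTERN = re.compile(r"started at (\d+) should have started at (\d+)")


def dispatch_delays(log_text):
    """Return (actual start - scheduled start) for each matching log entry."""
    # Log entries may be wrapped across lines (as in the email), so collapse
    # all whitespace before matching.
    flat = " ".join(log_text.split())
    return [int(actual) - int(scheduled)
            for actual, scheduled in _PATTERN.findall(flat)]


SAMPLE = """\
heartbeat[17464]: 2010/11/21_03:04:07 info: Gmain_timeout_dispatch:
started at 3846010832 should have started at 3845570140
"""

print(dispatch_delays(SAMPLE))  # [440692]
```

A gap that keeps growing across successive entries would suggest the
main loop is falling further behind rather than recovering.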