Just a quick update: running kill -9 on the ocfs2_controld.pcmk process
silenced the issue. Corosync is quiet again. The logs reported that
syslogd was rate-limiting incoming log messages from ocfs2_controld.
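
For anyone who hits the same symptom, here is a rough sketch of how one
might list the busiest processes before resorting to kill -9. It is only
an illustration, not something from the cluster tools: the parse_ps
helper and the 25% threshold are my own assumptions. It reads the output
of "ps -eo pid,pcpu,comm". Note that the %CPU figure ps reports is
averaged over each process's lifetime, so top is better for
instantaneous load.

```python
import subprocess


def parse_ps(ps_output, threshold=25.0):
    """Return (pid, pcpu, command) tuples at or above threshold, busiest first.

    Expects text in the layout of `ps -eo pid,pcpu,comm`: one header line,
    then one process per line. The threshold is an arbitrary assumption.
    """
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # blank or malformed line
        pid, pcpu, comm = parts
        try:
            rows.append((int(pid), float(pcpu), comm.strip()))
        except ValueError:
            continue  # not a process row
    busy = [row for row in rows if row[1] >= threshold]
    return sorted(busy, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # ps may truncate long command names in the comm column.
    out = subprocess.check_output(["ps", "-eo", "pid,pcpu,comm"], text=True)
    for pid, pcpu, comm in parse_ps(out):
        print(f"{pid:>7} {pcpu:5.1f}% {comm}")
```

Run on an affected node, this should surface corosync and
ocfs2_controld.pcmk near the top if they are the processes spinning.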
On 05/11/2012 10:19 AM, Matthew O'Connor wrote:
This might be the wrong place to ask this question, but between
corosync, OCFS2 and Pacemaker, under what circumstances would corosync
start consuming mass quantities of CPU? It started last night around
30%, and this morning is up around 110%. 40% is user, 47% system.
ocfs2_controld.pcmk is running at 50% CPU utilization. I've seen this
before on one of my other test-lab clusters, as though something is
spinlocking when there is a failure with some communication between
the (at present) two nodes. Indeed, last night I was testing some
failover cases, and the problem actually seemed to start after
hard-rebooting one of the members.
Using Pacemaker 1.1.5, Corosync 1.3.0, and ocfs2_controld (Aug 19
2011) on Ubuntu Server 11.10 amd64.
Thanks!!
-- Matt
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org