Hi All,

I deployed Pacemaker 1.1.12-rc2 on our platform to test the CIB changes.
This was needed on our setup because it contains 6 nodes and 150 resources, and the cib process was using a lot of CPU.

With a limited set of resources (6 nodes, 30 resources) everything worked as expected, including crm_mon. After loading the complete set of resources, we lost crm_mon functionality on all nodes. The cluster itself is running as expected (all resources are running), but we have no visibility into its state.

I noticed that operations that make changes do still work (for example, crm resource stop <resourcename>),
but crm resource status does not (using crmsh-2.0+git46-1.1.x86_64).

I noticed that /dev/shm/qb-cib_ro* files are created, and lsof shows that they are both opened by crm_mon and cib.


When executing "crm_mon -1" I get the following messages in /var/log/messages (and /var/log/pacemaker.log):

Jun 12 12:47:38 [8062] SRV-5-1 cib: notice: crm_ipcs_sendv: Response 2 to 0x1810370[17836] (1091618 bytes) failed: Resource temporarily unavailable (-11)
Jun 12 12:47:38 [8062] SRV-5-1 cib: warning: do_local_notify: Sync reply to crm_mon failed: No message of desired type
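
For reference, the first log message carries both the size of the reply that could not be delivered and an error code. Below is a minimal Python sketch of how those two values can be pulled out of such a line. The helper name parse_ipc_failure and the regular expression are my own illustration, not anything shipped with Pacemaker, and the pattern assumes the message layout is exactly as quoted above.

```python
import re

# Hypothetical helper (not part of Pacemaker): extracts the response size and
# error code from a crm_ipcs_sendv failure line like the one quoted above.
# The pattern assumes the message layout shown in this post.
IPC_FAILURE = re.compile(
    r"Response \d+ to \S+ \((?P<size>\d+) bytes\) failed: "
    r"(?P<reason>.+?) \((?P<code>-?\d+)\)"
)


def parse_ipc_failure(line):
    """Return (size_in_bytes, reason, error_code), or None if the line does not match."""
    match = IPC_FAILURE.search(line)
    if match is None:
        return None
    return (
        int(match.group("size")),
        match.group("reason"),
        int(match.group("code")),
    )


line = (
    "Jun 12 12:47:38 [8062] SRV-5-1 cib: notice: crm_ipcs_sendv: "
    "Response 2 to 0x1810370[17836] (1091618 bytes) failed: "
    "Resource temporarily unavailable (-11)"
)

size, reason, code = parse_ipc_failure(line)
# Report the size in MiB as well, since the raw byte count is hard to read.
print(f"{size} bytes ({size / 1024 / 1024:.2f} MiB), error {code}: {reason}")
```

On this line it reports a reply of 1091618 bytes, a little over 1 MiB, with error code -11.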


Restarting the pacemaker and cman services on one node didn't solve it.


What is causing this problem, and how can I resolve it?


Thx,
Johan Huysmans

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
