On 12 Jun 2014, at 10:53 pm, Johan Huysmans <johan.huysm...@inuits.be> wrote:

> Hi All,
> 
> I deployed Pacemaker 1.1.12-rc2 on our platform to test the cib changes.
> This was needed on our setup because it contains 6 nodes and 150 resources, 
> and the cib process was using a lot of CPU.
> 
> With a limited set of resources (6 nodes, 30 resources) everything worked as 
> expected, including crm_mon.
> When loading the complete set of resources we lost the crm_mon functionality 
> on all nodes.
> The cluster is running as expected (running all resources); however, we 
> don't have any visibility.
> 
> I noticed that operations that make changes (such as crm resource stop 
> <resourcename>) did actually work,
> but crm resource status didn't work (using crmsh-2.0+git46-1.1.x86_64).
> 
> I noticed that /dev/shm/qb-cib_ro* files are created, and lsof shows that 
> they are both opened by crm_mon and cib.
> 
> 
> When executing "crm_mon -1" I get the following messages in 
> /var/log/messages (and /var/log/pacemaker.log):
> Jun 12 12:47:38 [8062] SRV-5-1        cib:   notice: crm_ipcs_sendv: Response 2 to 0x1810370[17836] (1091618 bytes) failed: Resource temporarily unavailable (-11)
> Jun 12 12:47:38 [8062] SRV-5-1        cib:  warning: do_local_notify: Sync reply to crm_mon failed: No message of desired type
> 
> 
> Restarting the pacemaker and cman services on one node didn't solve it.
> 
> 
> What is causing this problem and how can I resolve it?

Almost certainly you're hitting IPC limits associated with large clusters.

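You should be able to tune the following setting in /etc/sysconfig/pacemaker
(uncomment the line and raise the value; the failed response in your log was
1091618 bytes, well above the 20480 shown here), and then restart the cluster:

# PCMK_ipc_buffer=20480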

Note also:

# For non-systemd based systems, prefix 'export' to each enabled line
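
As a rough way to pick a value, one could read the size of the failed response out of the cib log quoted above and add some headroom. The Python sketch below does that. The helper names are hypothetical, and the 2x headroom and 4 KiB rounding are assumptions for illustration, not Pacemaker recommendations. Treating the logged response size as the size the buffer must hold is itself an assumption.

```python
import re

# Matches the size in cib log lines such as:
#   "... Response 2 to 0x1810370[17836] (1091618 bytes) failed: ..."
FAILED_RESPONSE = re.compile(r"\((\d+) bytes\) failed")


def largest_failed_response(log_text):
    """Return the largest failed IPC response size in bytes, or None."""
    sizes = [int(m.group(1)) for m in FAILED_RESPONSE.finditer(log_text)]
    return max(sizes) if sizes else None


def suggest_ipc_buffer(size_bytes, headroom=2.0, granularity=4096):
    """Scale by a headroom factor and round up to a multiple of granularity.

    Both defaults are illustrative assumptions, not values from Pacemaker.
    """
    padded = int(size_bytes * headroom)
    # Ceiling division, then back to bytes.
    return -(-padded // granularity) * granularity


# Log line taken from the original report (whitespace condensed).
log = (
    "Jun 12 12:47:38 [8062] SRV-5-1 cib: notice: crm_ipcs_sendv: "
    "Response 2 to 0x1810370[17836] (1091618 bytes) failed: "
    "Resource temporarily unavailable (-11)"
)

size = largest_failed_response(log)
if size is not None:
    # Prints a candidate line for /etc/sysconfig/pacemaker.
    print(f"PCMK_ipc_buffer={suggest_ipc_buffer(size)}")
```

The printed line would go into /etc/sysconfig/pacemaker in place of the commented-out default, with 'export' prefixed on non-systemd systems as noted above. If crm_mon still fails after a restart, the value may need to be raised further.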


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
