So last night I was supposed to get a cluster running. Everything had worked fine in a virtual environment using the same software, and in my experience I only had to install pacemaker and corosync (from the Ubuntu 10.04 PPA) and get it rolling. What actually happened: I could use crm configure to set cluster properties such as resource stickiness and the quorum policy, and to disable stonith. But when I tried to add primitives, crm just hung there, without returning an error or completing. I noticed these two entries in the log every time crm tries to configure something for the first time:

Nov 30 05:33:26 server lrmd: [18102]: debug: on_msg_register:client lrmadmin [18159] registered
Nov 30 05:33:26 server lrmd: [18102]: debug: on_receive_cmd: the IPC to client [pid:18159] disconnected.

Also, when I stop corosync it sends a TERM signal to lrmd, but lrmd doesn't exit, even after several minutes; I have to kill -9 it. I tried to strace lrmd, but it's stuck on a futex, which doesn't really help much:

Process 32764 attached - interrupt to quit
futex(0xe070d8, FUTEX_WAIT_PRIVATE, 2, NULL^C <unfinished ...>

Does anyone have any idea what would make lrmd just hang?

[]s
core

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org