Seems like bug http://bugs.clusterlabs.org/show_bug.cgi?id=5040 and an earlier thread: http://thread.gmane.org/gmane.linux.highavailability.pacemaker/13185/focus=13321

According to that bug, 1.4.3 may have solved it, but the bug is still open and has a comment from Andrew Beekhof saying he'd reproduced it again on 4/18. From the thread, pacemaker 1.1.7 with a commit from Andrew might help, but he still sees some of the behavior.

OS: CentOS 5.7 x86_64
pacemaker 1.1.6
glue: 1.0.9
corosync 1.4.2
 - all RPMs were built from source and stored locally for deployment.

Nodes: omc1 and omc2, both virtual machines running CentOS 5.7.

Resources: mainly a floating IP, mysql, and httpd, along with a few custom services; it seemed like a simple setup. No shared storage.
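
To give a sense of the shape of the configuration (names and parameters simplified from memory here, not the exact production CIB), it's roughly:

    crm configure primitive p_vip ocf:heartbeat:IPaddr2 \
        params ip=192.168.1.100 cidr_netmask=24 op monitor interval=30s
    crm configure primitive p_mysql ocf:heartbeat:mysql op monitor interval=30s
    crm configure primitive p_httpd ocf:heartbeat:apache \
        params configfile=/etc/httpd/conf/httpd.conf op monitor interval=30s
    # grouped so the IP, database, and web server fail over together
    crm configure group g_services p_vip p_mysql p_httpd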

This seems like a pretty critical bug. I've not been able to reproduce it in the lab (of course not), but my production cluster is running on a single cylinder. I do have logs from the event that seemed to trigger it, if they'd help (preference: pastebin, or here on the list?). I've tried to collect them with crm_report, but the archives it creates always come out empty. I'm currently building and testing 1.4.3, but since I can't reproduce the problem, I'm not feeling confident about the prospects.
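
For reference, this is roughly the crm_report invocation I've been trying (the time window here is just an example bracketing the event, and the last argument is the base name of the tarball it should create):

    crm_report -f "2012-04-18 00:00:00" -t "2012-04-19 00:00:00" /tmp/omc-failover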

Secondly, is there a recommended process to bring the messed-up node back into the cluster? I've probably horked it beyond recognition with shutdowns, crm commands, removing the crm configs, and editing/replacing the CIB with cibadmin based on other threads and advice. I currently have the pacemaker and corosync services shut off on that node; it's too terrifying to contemplate it killing my active node by interacting with it.
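
Unless there's a better-recommended procedure, what I was planning to try on the broken node is the usual wipe-the-local-CIB-and-let-it-resync approach (assuming the local CIB lives under /var/lib/heartbeat/crm on this 1.1.6 build):

    # on the broken node only; leave the active node alone
    service pacemaker stop
    service corosync stop
    # remove the stale local CIB so the node pulls a fresh copy from the DC on rejoin
    rm -f /var/lib/heartbeat/crm/cib*
    service corosync start
    service pacemaker start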

Let me know what info would help...

Thanks,

Brent

