Are you using:

> service {
>     # Load the Pacemaker Cluster Resource Manager
>     name: pacemaker
>     ver: 1
> }

for all of the nodes?

On Wed, Aug 17, 2011 at 8:27 AM, Gabriel Gomiz <ggo...@cooperativaobrera.com.ar> wrote:
> Hi to all... :)
>
> We are experiencing some difficulties with a 4-node pacemaker cluster. 3
> nodes are ok, but the 4th node, after some corosync failures (with core
> dumps) and several pacemaker restarts, does not return to the cluster.
>
> On the other 3 nodes the 4th appears online, but on the 4th node itself
> there is an empty cib when I display crm.
>
> Something weird in the logs is this kind of message:
>
> Aug 16 19:07:15 lorien.cooperativaobrera.com.ar cib: [28120]: WARN:
> cib_peer_callback: Discarding cib_modify message (421) from
> mordor.cooperativaobrera.com.ar: not in our membership
>
> It seems as if the 4th node does not consider itself a member of the
> cluster. How can I make it rejoin?
>
> Any help you can give me will be highly appreciated.
>
> Many thanks in advance
>
> PS: If you need any additional logs or tests, I'm willing to provide
> them.
>
> -----
>
> DATA:
>
> OS is CENTOS 6.0 64 bits
> PACEMAKER version 1.1.5
> COROSYNC 1.2.3-21
>
> NODE 1:
>
> [DB1] gandalf # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:05 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar
> mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
>  Resource Group: dashboard
>      fs_dashboard (ocf::heartbeat:Filesystem): Started isildur.cooperativaobrera.com.ar
>      ip_dashboard (ocf::heartbeat:IPaddr): Started isildur.cooperativaobrera.com.ar
>      srv_httpd_dashboard (lsb:httpd.dashboard): Started isildur.cooperativaobrera.com.ar
>      srv_dashjobs (lsb:dashjobs): Started isildur.cooperativaobrera.com.ar
>
> NODE 2:
>
> [DB2] isildur # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:28 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar
> mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
>  Resource Group: dashboard
>      fs_dashboard (ocf::heartbeat:Filesystem): Started isildur.cooperativaobrera.com.ar
>      ip_dashboard (ocf::heartbeat:IPaddr): Started isildur.cooperativaobrera.com.ar
>      srv_httpd_dashboard (lsb:httpd.dashboard): Started isildur.cooperativaobrera.com.ar
>      srv_dashjobs (lsb:dashjobs): Started isildur.cooperativaobrera.com.ar
>
> NODE 3:
>
> [VM1] mordor # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:40 2011
> Stack: openais
> Current DC: gandalf.cooperativaobrera.com.ar - partition with quorum
> Version: 1.1.5-1.1.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 4 Nodes configured, 4 expected votes
> 1 Resources configured.
> ============
>
> Online: [ isildur.cooperativaobrera.com.ar gandalf.cooperativaobrera.com.ar
> mordor.cooperativaobrera.com.ar lorien.cooperativaobrera.com.ar ]
>
>  Resource Group: dashboard
>      fs_dashboard (ocf::heartbeat:Filesystem): Started isildur.cooperativaobrera.com.ar
>      ip_dashboard (ocf::heartbeat:IPaddr): Started isildur.cooperativaobrera.com.ar
>      srv_httpd_dashboard (lsb:httpd.dashboard): Started isildur.cooperativaobrera.com.ar
>      srv_dashjobs (lsb:dashjobs): Started isildur.cooperativaobrera.com.ar
>
> NODE 4:
>
> [VM2] lorien # crm_mon -1
> ============
> Last updated: Tue Aug 16 19:21:54 2011
> Current DC: NONE
> 0 Nodes configured, unknown expected votes
> 0 Resources configured.
> ============
>
> LOGS ON NODE 4:
>
> <attached>
>
> CONFIG COROSYNC (NODE 4, other nodes are the same but changing bindnetaddr):
>
> compatibility: whitetank
>
> totem {
>     version: 2
>     secauth: off
>     threads: 0
>     interface {
>         ringnumber: 0
>         bindnetaddr: 192.168.238.43
>         mcastaddr: 226.94.2.1
>         mcastport: 5405
>     }
> }
>
> logging {
>     fileline: off
>     to_stderr: no
>     to_logfile: yes
>     to_syslog: yes
>     logfile: /var/log/cluster/corosync.log
>     debug: off
>     timestamp: on
>     logger_subsys {
>         subsys: AMF
>         debug: off
>     }
> }
>
> amf {
>     mode: disabled
> }
>
> service {
>     # Load the Pacemaker Cluster Resource Manager
>     name: pacemaker
>     ver: 1
> }
>
> --
>   .^.    Lic. Gabriel Gomiz - Red Hat Certified Engineer (RHCE)
>   /V\    Jefe de Sistemas - Administrador Red y Servidores
>  // \\   Gerencia de Sistemas - Cooperativa Obrera Ltda.
> /(   )\  Tel (0291) 456-0084
>  ^^-^^   s/Window[$s]/LINUX!!/g or die;
>
> PGP: http://admin.cooperativaobrera.com.ar/pgp/ggomiz.txt
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
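
If it helps to answer the question at the top of this reply on every node, here is a minimal Python sketch that extracts the top-level `service { ... }` blocks from a corosync.conf and reports whether one of them loads pacemaker. It assumes the simple `key: value` / `section { ... }` syntax shown in the quoted config and nothing more. The function names (`find_service_blocks`, `loads_pacemaker`) are made up for this example and are not part of corosync or pacemaker.

```python
# Hypothetical helper, not part of corosync or pacemaker.
# Assumes the "key: value" / "section { ... }" syntax of the quoted config.

SAMPLE = """
compatibility: whitetank

totem {
    version: 2
    interface {
        ringnumber: 0
        mcastport: 5405
    }
}

service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 1
}
"""


def find_service_blocks(conf_text):
    """Return one dict of key/value strings per top-level `service` block."""
    blocks = []
    stack = []       # names of the sections that are currently open
    current = None   # dict being filled while inside a top-level service block
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            stack.append(line[:-1].strip())
            if stack == ["service"]:
                current = {}
        elif line == "}":
            if stack == ["service"]:
                blocks.append(current)
                current = None
            if stack:
                stack.pop()
        elif ":" in line and stack == ["service"]:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return blocks


def loads_pacemaker(conf_text):
    """True if any top-level service block has `name: pacemaker`."""
    return any(b.get("name") == "pacemaker"
               for b in find_service_blocks(conf_text))
```

To use it, read each node's config file into a string (the usual location is /etc/corosync/corosync.conf, but that path is an assumption here, it is not stated in the thread) and compare the result of `find_service_blocks` across the four nodes. A node that returns an empty list, or a different `ver`, would be the first thing to look at.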