On 28/10/2010 18:30, Guillaume Chanaud wrote:
 On 28/10/2010 17:55, Pavlos Parissis wrote:
On 28 October 2010 16:09, Guillaume Chanaud
<guillaume.chan...@connecting-nature.com>  wrote:
  Hello,

I have a cluster of two master/slave DRBD servers (filer1 and filer2)
running in a VLAN (the machines are dedicated servers).
I added a third node (server1, a "blank node" for the moment) to the
cluster, and it joined correctly.
When I add a 4th node (server2, which is a "mirror" of server1) to the
cluster, this node starts as standalone. Here is the message.log:

Oct 28 15:59:27 ns209045 corosync[16543]: [TOTEM ] A processor joined or
left the membership and a new membership was formed.
Oct 28 15:59:28 ns209045 corosync[16543]:   [pcmk  ] notice:
pcmk_peer_update: Transitional membership event on ring 945392: memb=1,
new=0, lost=0
Oct 28 15:59:28 ns209045 corosync[16543]: [pcmk ] info: pcmk_peer_update:
memb: server2 16820416
Oct 28 15:59:28 ns209045 corosync[16543]:   [pcmk  ] notice:
pcmk_peer_update: Stable membership event on ring 945392: memb=1, new=0,
lost=0
Oct 28 15:59:28 ns209045 corosync[16543]: [pcmk ] info: pcmk_peer_update:
MEMB: server2 16820416
Oct 28 15:59:28 ns209045 corosync[16543]: [TOTEM ] A processor joined or
left the membership and a new membership was formed.
Oct 28 15:59:29 ns209045 corosync[16543]:   [pcmk  ] notice:
pcmk_peer_update: Transitional membership event on ring 945416: memb=1,
new=0, lost=0
Oct 28 15:59:29 ns209045 corosync[16543]: [pcmk ] info: pcmk_peer_update:
memb: server2 16820416
Oct 28 15:59:29 ns209045 corosync[16543]:   [pcmk  ] notice:
pcmk_peer_update: Stable membership event on ring 945416: memb=1, new=0,
lost=0
Oct 28 15:59:29 ns209045 corosync[16543]: [pcmk ] info: pcmk_peer_update:
MEMB: server2 16820416

[...] These messages repeat many, many times

Now, if I stop server1 and start server2, server2 starts correctly
and is added to the cluster. But when I then try to start server1,
the same thing happens. (So the roles are inverted but the result is the same: when one of the serverX nodes is running, the other can't join.)

My corosync.conf is configured for broadcast, not multicast. I have had lots of
problems with multicast, because many of the bridged VMs on the VLAN
don't see the multicast packets or don't join the multicast group
correctly.

Any hints on this?
Are the corosync and auth files the same on server2?


Yes, of course :D (copied with scp). As I said, server1 can join when server2 is offline, and server2 can join when server1 is offline, but if one of them is online, the other can't join and logs the messages above in a loop.

In fact, I have lots and lots of problems with corosync/pacemaker: multicast/broadcast between physical and virtual servers, all sorts of different failures everywhere, and the error logs are always different depending on what I try.

The strange thing is that filer1, filer2, server1 and server2 all run the same distro (Gentoo) with the same tools and are on the same VLAN (which works for lots of services, like NFS).
Another thing I've just noticed:
when either server1 or server2 connects to the cluster, the logs start to fill with these messages on all nodes:

Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 28 18:46:01 filer2 corosync[10928]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 513 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 513 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 513 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 513 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 572 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 572 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 573 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 1162480: memb=3, new=0, lost=0
Oct 28 18:46:01 filer2 corosync[10928]: [pcmk ] info: pcmk_peer_update: memb: server1 16820416
Oct 28 18:46:01 filer2 corosync[10928]: [pcmk ] info: pcmk_peer_update: memb: filer1 83929280
Oct 28 18:46:01 filer2 corosync[10928]: [pcmk ] info: pcmk_peer_update: memb: filer2 100706496
Oct 28 18:46:01 filer2 corosync[10928]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 1162480: memb=3, new=0, lost=0
Oct 28 18:46:01 filer2 corosync[10928]: [pcmk ] info: pcmk_peer_update: MEMB: server1 16820416
Oct 28 18:46:01 filer2 corosync[10928]: [pcmk ] info: pcmk_peer_update: MEMB: filer1 83929280
Oct 28 18:46:01 filer2 corosync[10928]: [pcmk ] info: pcmk_peer_update: MEMB: filer2 100706496
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 28 18:46:01 filer2 corosync[10928]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 632 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 632 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 632 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 692 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 692 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 751 ms, flushing membership messages.
Oct 28 18:46:01 filer2 corosync[10928]: [TOTEM ] Process pause detected for 751 ms, flushing membership messages.

This does not happen when filer1 and filer2 are the only nodes in the cluster.
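
A side note on the numbers in the logs above: server1 and server2 are both reported with the same ID, 16820416, while filer1 and filer2 each have a distinct one. The small Python sketch below decodes those IDs under the assumption that each one is the node's IPv4 address stored as a little-endian 32-bit integer (corosync can derive the node ID from the ring 0 address when no nodeid is set, but whether that is what happened on these machines is not confirmed by the thread). The function name nodeid_to_ipv4 is made up for this illustration.

```python
import ipaddress
import struct


def nodeid_to_ipv4(nodeid: int) -> str:
    """Decode a node ID as an IPv4 address.

    Assumption: the ID is the IPv4 address stored as a little-endian
    32-bit integer. This is an illustration, not a corosync API.
    """
    packed = struct.pack("<I", nodeid)  # 4 bytes, least significant first
    return str(ipaddress.IPv4Address(packed))


# Node names and IDs copied from the pcmk_peer_update lines in the logs.
node_ids = {
    "server1": 16820416,
    "server2": 16820416,
    "filer1": 83929280,
    "filer2": 100706496,
}

for name, nodeid in node_ids.items():
    print(f"{name}: {nodeid} -> {nodeid_to_ipv4(nodeid)}")

# Group names by ID to spot any ID that is used by more than one node.
by_id = {}
for name, nodeid in node_ids.items():
    by_id.setdefault(nodeid, []).append(name)
duplicates = {nid: names for nid, names in by_id.items() if len(names) > 1}
print("duplicate IDs:", duplicates)
```

If that assumption holds, the two serverX nodes decode to the same address while the filers decode to two different ones, which would be worth checking against the actual interface configuration on server1 and server2 (or ruling out by setting a distinct nodeid in the totem section of corosync.conf on each node).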

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker

