Hi, this is my first post on this list. I hope this is the right mailing list for my question.
I have installed Pacemaker/Corosync on two Ubuntu Lucid servers to build a two-node cluster. This cluster is going to be a router for a datacenter. I installed the packages provided by the distribution, which I believe are version 1.0.8.

The cluster is set up so far and it seems to work. I say "seems" because sometimes one of the resources does not start, and the logs report this as an "unknown error". The error is also quite random, like rolling dice.

But first of all, here is my crm config:

node bgwnode1 \
        attributes standby="off"
node bgwnode2 \
        attributes standby="off"
primitive resIPdatacenter ocf:heartbeat:IPaddr2 \
        meta migration-threshold="3" \
        op monitor interval="10s" timeout="20s" \
        params ip="10.0.0.1" nic="eth3" cidr_netmask="8"
primitive resIPoffice ocf:heartbeat:IPaddr2 \
        meta migration-threshold="3" \
        op monitor interval="10s" timeout="20s" \
        params ip="192.168.20.1" nic="eth3" cidr_netmask="24"
primitive resIPsubnet1 ocf:heartbeat:IPaddr2 \
        meta migration-threshold="3" \
        op monitor interval="10s" timeout="20s" \
        params ip="213.252.188.1" nic="eth3" cidr_netmask="25"
primitive resIPtransfer ocf:heartbeat:IPaddr2 \
        meta migration-threshold="3" \
        op monitor interval="10s" timeout="20s" \
        params ip="212.68.95.210" nic="eth2" cidr_netmask="30"
primitive resPing ocf:heartbeat:pingd \
        params host_list="172.16.1.1 172.16.1.2" dampen="5s" multiplier="100"
primitive resRouteWANbcc ocf:heartbeat:Route \
        meta migration-threshold="3" \
        op monitor interval="10s" timeout="20s" \
        params destination="0.0.0.0/0" device="eth2" gateway="212.68.95.209"
primitive resSysInfo ocf:heartbeat:SysInfo \
        op monitor interval="10s"
clone clonePing resPing
clone cloneSysInfo resSysInfo
location locNetServices resIPtransfer \
        rule $id="locNetServices-rule" pingd: defined pingd
xml <rsc_colocation id="totalColoc" score="INFINITY"> \
        <resource_set id="orderSetup-30bacef5" sequential="true"> \
                <resource_ref id="resIPtransfer"/> \
                <resource_ref id="resIPsubnet1"/> \
                <resource_ref id="resIPoffice"/> \
                <resource_ref id="resIPdatacenter"/> \
                <resource_ref id="resRouteWANbcc"/> \
        </resource_set> \
</rsc_colocation>
property $id="cib-bootstrap-options" \
        dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        stonith-enabled="false"
rsc_defaults $id="rsc-options" \
        resource-stickiness="INFINITY"

The resource "resRouteWANbcc" sometimes does not start and I really don't know why. I thought that the resource_set would start each resource one by one, and would only start later resources if the earlier ones had started successfully. The route belongs to "resIPtransfer", which should have come up as the first resource.

I also tried adding an ocf:heartbeat:Delay resource, but this did not work. I also thought that the interface might take too long to come up because of autonegotiation (media detection), so I configured the interfaces appropriately. This did not fix the problem either.

Unfortunately, if the default route is not HA, then the whole setup isn't.

A second problem is detecting an unplugged cable. I realized that crm reacts to the ifconfig up/down state of an interface, so I simply installed ifplugd to monitor the ports and automatically bring the interfaces up and down:

ARGS="-q -p -f -u0 -d0 -w -I -m ethtool"

But this also works only sometimes.

So currently I am a little bit stuck :-)

If some of you have some beginner's tips for me, I would appreciate that very much.

Thanks in advance
Christian Roessner

-- 
Roessner-Network-Solutions
Bachelor of Science Informatik
50°34.725'N, 08°40.904'O, Nahrungsberg 81, 35390 Giessen
F: +49 641 5879091, M: +49 176 93118939
USt-IdNr.: DE225643613
http://www.roessner-network-solutions.com
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker