Hi,

I am trying to set up a simple cluster of 2 machines on CentOS 6.4:

pacemaker-cli-1.1.10-1.el6_4.4.x86_64
pacemaker-1.1.10-1.el6_4.4.x86_64
pacemaker-libs-1.1.10-1.el6_4.4.x86_64
pacemaker-cluster-libs-1.1.10-1.el6_4.4.x86_64
corosync-1.4.1-15.el6_4.1.x86_64
corosynclib-1.4.1-15.el6_4.1.x86_64
pcs-0.9.90-1.el6_4.noarch
cman-3.0.12.1-49.el6_4.2.x86_64
resource-agents-3.9.2-21.el6_4.8.x86_64

I am using the following script to configure the cluster:

--------------------------------------------------
#!/bin/bash

CLUSTER_NAME=test
CONFIG_FILE=/etc/cluster/cluster.conf
NODE1_EM1=node1
NODE2_EM1=node2
NODE1_EM2=node1-priv
NODE2_EM2=node2-priv
VIP=192.168.0.6
MONITOR_INTERVAL=60s

# Make sure that pacemaker is stopped on both nodes
# NOT INCLUDED HERE

# Delete existing configuration
rm -rf /var/log/cluster/*
ssh root@$NODE2_EM2 'rm -rf /var/log/cluster/*'
rm -rf /var/lib/pacemaker/cib/* /var/lib/pacemaker/cores/* /var/lib/pacemaker/pengine/* /var/lib/corosync/* /var/lib/cluster/*
ssh root@$NODE2_EM2 'rm -rf /var/lib/pacemaker/cib/* /var/lib/pacemaker/cores/* /var/lib/pacemaker/pengine/* /var/lib/corosync/* /var/lib/cluster/*'

# Create the cluster
ccs -f $CONFIG_FILE --createcluster $CLUSTER_NAME

# Add nodes to the cluster
ccs -f $CONFIG_FILE --addnode $NODE1_EM1
ccs -f $CONFIG_FILE --addnode $NODE2_EM1
ccs -f $CONFIG_FILE --setcman two_node="1" expected_votes="1"

# Add alternative node names so that both network interfaces are used
ccs -f $CONFIG_FILE --addalt $NODE1_EM1 $NODE1_EM2
ccs -f $CONFIG_FILE --addalt $NODE2_EM1 $NODE2_EM2
ccs -f $CONFIG_FILE --setdlm protocol="sctp"

# Teach CMAN how to send its fencing requests to Pacemaker
ccs -f $CONFIG_FILE --addfencedev pcmk agent=fence_pcmk
ccs -f $CONFIG_FILE --addmethod pcmk-redirect $NODE1_EM1
ccs -f $CONFIG_FILE --addmethod pcmk-redirect $NODE2_EM1
ccs -f $CONFIG_FILE --addfenceinst pcmk $NODE1_EM1 pcmk-redirect port=$NODE1_EM1
ccs -f $CONFIG_FILE --addfenceinst pcmk $NODE2_EM1 pcmk-redirect port=$NODE2_EM1

# Deploy configuration to node2
scp /etc/cluster/cluster.conf root@$NODE2_EM2:/etc/cluster/cluster.conf

# Start pacemaker on main node
/etc/init.d/pacemaker start
sleep 30

# Disable stonith
pcs property set stonith-enabled=false

# Disable quorum
pcs property set no-quorum-policy=ignore

# Define resources
pcs resource create VIP_EM1 ocf:heartbeat:IPaddr params nic=em1 ip=$VIP cidr_netmask=24 op monitor interval=$MONITOR_INTERVAL
pcs resource create PREFERRED_SRC_IP ocf:heartbeat:IPsrcaddr params ipaddress=$VIP op monitor interval=$MONITOR_INTERVAL

# Define initial location and prevent resources from going back to the initial server after a failure
pcs resource defaults resource-stickiness=100
pcs constraint location VIP_EM1 prefers $NODE1_EM1=50
--------------------------------------------------

After running this script from node1:

root@node1# pcs status
Cluster name:
Last updated: Wed Nov 6 17:17:30 2013
Last change: Wed Nov 6 17:06:20 2013 via crm_attribute on node1
Stack: cman
Current DC: node1 - partition with quorum
Version: 1.1.10-1.el6_4.4-368c726
2 Nodes configured
2 Resources configured

Online: [ node1 ]
OFFLINE: [ node2 ]

Full list of resources:

 VIP_EM1 (ocf::heartbeat:IPaddr): Stopped
 PREFERRED_SRC_IP (ocf::heartbeat:IPsrcaddr): Stopped

Failed actions:
    PREFERRED_SRC_IP_start_0 on node1 'unknown error' (1): call=19, status=complete, last-rc-change='Wed Nov 6 17:06:20 2013', queued=67ms, exec=0ms

PCSD Status:
Error: no nodes found in corosync.conf

root@node1# ip route show
192.168.8.0/24 dev em2 proto kernel scope link src 192.168.8.1
default via 192.168.0.1 dev em1

Error in /var/log/cluster/corosync.log:

...
IPsrcaddr(PREFERRED_SRC_IP)[638]: 2013/11/06_16:50:32 ERROR: command 'ip route change to default via 192.168.0.1 dev em1 src 192.168.0.6' failed
Nov 06 16:50:32 [32461] node1.domain.org lrmd: notice: operation_finished: PREFERRED_SRC_IP_start_0:638:stderr [ RTNETLINK answers: No such process ]
...
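
To compare the default route before and after each attempt, its src field can be pulled out of the `ip route show` output. This is only a minimal sketch: the function name default_route_src is made up for illustration, and it assumes the one-route-per-line format shown in this message.

```shell
#!/bin/bash
# Hypothetical helper (not part of pacemaker or resource-agents).
# Reads `ip route show` output on stdin and prints the "src" address
# of the default route; prints nothing if the default route has no src.
default_route_src() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "src") { print $(i + 1); exit }
    }'
}

# Usage on a live system:
#   ip route show | default_route_src
```

An empty result means the default route carries no preferred source address, which is the state IPsrcaddr tries to change when it starts.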

If I run the command manually when pacemaker is not started (after rebooting the machine for example), the default route is modified as expected (I use 192.168.0.106 because the alias 192.168.0.6 is not started):

# ip route show
192.168.0.0/24 dev em1 proto kernel scope link src 192.168.0.106
192.168.8.0/24 dev em3 proto kernel scope link src 192.168.8.1
default via 192.168.0.1 dev em1

# ip route change to default via 192.168.0.1 dev em1 src 192.168.0.106

# ip route show
192.168.0.0/24 dev em1 proto kernel scope link src 192.168.0.106
192.168.8.0/24 dev em3 proto kernel scope link src 192.168.8.1
default via 192.168.0.1 dev em1 src 192.168.0.106

If I run the same configure script without defining the PREFERRED_SRC_IP resource, I can check that the resource is started as expected:

# pcs status
...
Online: [ node1 ]
OFFLINE: [ node2 ]

Full list of resources:

 VIP_EM1 (ocf::heartbeat:IPaddr): Started node1
...

# ip addr show em1
6: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.106/24 brd 192.168.0.255 scope global em1
    inet 192.168.0.6/24 brd 192.168.0.255 scope global secondary em1

But when I create the PREFERRED_SRC_IP resource, I get the same error:

# pcs resource create PREFERRED_SRC_IP ocf:heartbeat:IPsrcaddr params ipaddress=192.168.0.6 op monitor interval=60s

# pcs status
...
Online: [ node1 ]
OFFLINE: [ node2 ]

Full list of resources:

 VIP_EM1 (ocf::heartbeat:IPaddr): Started node1
 PREFERRED_SRC_IP (ocf::heartbeat:IPsrcaddr): Stopped

Failed actions:
    PREFERRED_SRC_IP_start_0 on node1 'unknown error' (1): call=24, status=complete, last-rc-change='Wed Nov 6 18:00:09 2013', queued=47ms, exec=0ms

Error in corosync.log:

IPsrcaddr(PREFERRED_SRC_IP)[10035]: 2013/11/06_18:00:09 ERROR: command 'ip route change to default via 192.168.0.1 dev em1 src 192.168.0.6' failed
Nov 06 18:00:09 [9172] node1.domain.org lrmd: notice: operation_finished: PREFERRED_SRC_IP_start_0:10035:stderr [ RTNETLINK answers: No such process ]

Thanks in advance,
Mathieu

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org