Michael Schwartzkopff wrote:
On Saturday, 26 December 2009 11:27:54, Eric Renfro wrote:
Michael Schwartzkopff wrote:
On Saturday, 26 December 2009 10:52:38, Eric Renfro wrote:
Michael Schwartzkopff wrote:
On Saturday, 26 December 2009 08:12:49, Eric Renfro wrote:
Hello,

I'm trying to set up two nodes running pacemaker with openais as the
communication layer. Ideally I want router1 to be the preferred node,
taking back over from router2 once it comes back up fully functional.
Both routers are internet-facing servers: the external internet IP
moves to whichever node is active at the time, and that node also
carries the internal gateway IP that internal systems route through.

My problem so far is with the Route resource, and later with getting
shorewall to start/stop on whichever node is active.

In the setup I will show below, Route fails to start initially; I
presume the internet IP address is not fully initialized at the time
the route is being added. If I do a crm resource cleanup failover-gw,
it comes up just fine. If I try to move the router_cluster resource
from router1 to router2 after everything is fully up, it fails because
of failover-gw on router2.
Very unlikely. If the IPaddr2 script has finished, the IP address is up.
Please look for other reasons and grep "lrm.*failover-gw" in the
logs.
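
A quick way to pull those entries (assuming syslog goes to
/var/log/messages, as it does by default on openSUSE):

# Show everything the local resource manager logged about failover-gw;
# -i ignores case, adjust the path if syslog is configured differently.
grep -i "lrm.*failover-gw" /var/log/messages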

Here's my setup at present. For the moment, until I figure out how to
do it, shorewall is started manually; I want to automate that once the
setup is working, and perhaps you could help me with that as well (see
the rough sketch after the configuration).

primitive failover-int-ip ocf:heartbeat:IPaddr2 \
        params ip="192.168.0.1" \
        op monitor interval="2s"
primitive failover-ext-ip ocf:heartbeat:IPaddr2 \
        params ip="24.227.124.158" cidr_netmask="30"
broadcast="24.227.124.159" nic="net0" \
        op monitor interval="2s" \
        meta target-role="Started"
primitive failover-gw ocf:heartbeat:Route \
        params destination="0.0.0.0/0" gateway="24.227.124.157"
device="net0" \
        meta target-role="Started" \
        op monitor interval="2s"
group router_cluster failover-int-ip failover-ext-ip failover-gw
location router-master router_cluster \
        rule $id="router-master-rule" $role="master" 100: #uname eq
router1
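
For the shorewall part, something along these lines is what I had in
mind once the rest works, though it is untested and assumes shorewall
provides an LSB init script called "shorewall" on both nodes:

# Untested sketch: manage shorewall as an LSB resource and tie it to
# the router_cluster group so it runs on whichever node is active.
primitive failover-fw lsb:shorewall \
        op monitor interval="10s"
colocation fw-with-router inf: failover-fw router_cluster
order fw-after-router inf: router_cluster failover-fw

If I understand the constraints correctly, that should keep the
firewall on the same node as the group and only start it after the
addresses and route are up.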

I would appreciate as much help as possible. I am fairly new to
pacemaker, but so far all but the Route part of this works well.
Please give us a chance to help you by providing the relevant logs!
Sure..
Here's a big clip of the log, grepped for just failover-gw. Hopefully
this helps; if not, I can pinpoint more of what's happening. The logs
fill up pretty quickly as the cluster comes up.

messages:Dec 26 02:00:21 router1 pengine: [4724]: info: unpack_rsc_op: failover-gw_monitor_0 on router2 returned 5 (not installed) instead of the expected value: 7 (not running)
(...)

The rest of the logs is not needed. The first line already tells you
that something is not installed correctly. Please read the lines just
above this one. Normally they tell you what is missing.

You should also read through the Route resource agent in
/usr/lib/ocf/resource.d/heartbeat/Route
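
For example, to see which checks make it return "not installed", a
plain grep over the agent should point you at the relevant lines:

# List every place the Route agent writes to the log; the error
# entries show why it refuses to run.
grep -n "ocf_log" /usr/lib/ocf/resource.d/heartbeat/Route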

Greetings,
Hmmm..
I'm not seeing anything about it. Here's a clip of the lines above,
plus one line below the one saying (not installed).

Dec 26 05:00:21 router1 pengine: [4724]: info: determine_online_status: Node router1 is online
Dec 26 05:00:21 router1 pengine: [4724]: info: unpack_rsc_op: failover-gw_monitor_0 on router1 returned 0 (ok) instead of the expected value: 7 (not running)
Dec 26 05:00:21 router1 pengine: [4724]: WARN: unpack_rsc_op: Operation failover-gw_monitor_0 found resource failover-gw active on router1
Dec 26 05:00:21 router1 pengine: [4724]: info: determine_online_status: Node router2 is online
Dec 26 05:00:21 router1 pengine: [4724]: info: unpack_rsc_op: failover-gw_monitor_0 on router2 returned 5 (not installed) instead of the expected value: 7 (not running)
Dec 26 05:00:21 router1 pengine: [4724]: ERROR: unpack_rsc_op: Hard error - failover-gw_monitor_0 failed with rc=5: Preventing failover-gw from re-starting on router2

Hi,

there must be other log entries. In the Route RA I have here, the agent writes the reason into ocf_log() before erroring out. What versions of pacemaker and
cluster-glue do you have? What distribution are you running on?
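
If nothing useful shows up in syslog, you can also run the agent by
hand on router2 and look at the error directly. A rough sketch,
untested, with the parameters taken from your configuration and the
paths of a standard install:

# Call the Route agent's monitor action outside the cluster and print
# the return code (5 = not installed). OCF_ROOT and the OCF_RESKEY_*
# variables are how the cluster normally hands parameters to the agent.
OCF_ROOT=/usr/lib/ocf \
OCF_RESKEY_destination="0.0.0.0/0" \
OCF_RESKEY_gateway="24.227.124.157" \
OCF_RESKEY_device="net0" \
/usr/lib/ocf/resource.d/heartbeat/Route monitor
echo $?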

Greetings,

I've checked all my logs. Syslog logs everything to my messages logfile, so it should be there if anywhere.

I'm running openSUSE 11.2, which comes with heartbeat 2.99.3, pacemaker 1.0.1, and openais 0.80.3; that's what's running in this setup.

--
Eric Renfro

