On Saturday, 26 December 2009 11:55:57, Eric Renfro wrote:
> Michael Schwartzkopff wrote:
> > On Saturday, 26 December 2009 11:27:54, Eric Renfro wrote:
> >> Michael Schwartzkopff wrote:
> >>> On Saturday, 26 December 2009 10:52:38, Eric Renfro wrote:
> >>>> Michael Schwartzkopff wrote:
> >>>>> On Saturday, 26 December 2009 08:12:49, Eric Renfro wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> I'm trying to set up 2 nodes that'll run pacemaker with openais as
> >>>>>> the communication layer. Ideally what I want is for router1 to be
> >>>>>> the master node and take over from router2 if it comes back up fully
> >>>>>> functional again. In my setup, the routers are both internet-facing
> >>>>>> servers that toggle the external internet IP to whichever controls
> >>>>>> it at the time, and also handle the internal IP for the gateway for
> >>>>>> internal systems to route via.
> >>>>>>
> >>>>>> My problem is with Route in my setup, so far, and later getting
> >>>>>> shorewall to start/stop per whichever node is active.
> >>>>>>
> >>>>>> Route, in my case in the setup I will show below, is failing to
> >>>>>> start initially because I presume the internet IP address is not
> >>>>>> fully initialized at the time it's trying to enable the route. If I
> >>>>>> do a crm resource cleanup failover-gw, it brings it up just fine. If
> >>>>>> I try to move the router_cluster resource to router2 from router1
> >>>>>> after it's fully up, it fails because of failover-gw on router2.
> >>>>>
> >>>>> Very unlikely. If the IPaddr2 script finishes, the IP address is up.
> >>>>> Please search for other reasons and grep "lrm.*failover-gw" in the
> >>>>> logs.
> >>>>>
> >>>>>> Here's my setup at present. For the moment, until I figure out how
> >>>>>> to do it, shorewall is started manually. I want to automate this
> >>>>>> once the setup is working, though; perhaps you guys could help me
> >>>>>> with that as well.
> >>>>>>
> >>>>>> primitive failover-int-ip ocf:heartbeat:IPaddr2 \
> >>>>>>     params ip="192.168.0.1" \
> >>>>>>     op monitor interval="2s"
> >>>>>> primitive failover-ext-ip ocf:heartbeat:IPaddr2 \
> >>>>>>     params ip="24.227.124.158" cidr_netmask="30" broadcast="24.227.124.159" nic="net0" \
> >>>>>>     op monitor interval="2s" \
> >>>>>>     meta target-role="Started"
> >>>>>> primitive failover-gw ocf:heartbeat:Route \
> >>>>>>     params destination="0.0.0.0/0" gateway="24.227.124.157" device="net0" \
> >>>>>>     meta target-role="Started" \
> >>>>>>     op monitor interval="2s"
> >>>>>> group router_cluster failover-int-ip failover-ext-ip failover-gw
> >>>>>> location router-master router_cluster \
> >>>>>>     rule $id="router-master-rule" $role="master" 100: #uname eq router1
> >>>>>>
> >>>>>> I would appreciate as much help as possible. I am fairly new to
> >>>>>> pacemaker, but so far all but the Route part of this works well.
> >>>>>
> >>>>> Please give us a chance to help you by providing the interesting logs!
> >>>>
> >>>> Sure.
> >>>> Here's a big clip of a log grepped for just failover-gw. Hopefully
> >>>> this helps; otherwise I can pinpoint more around what's happening.
> >>>> The logs fill up pretty quickly as it's coming alive.
> >>>>
> >>>> messages:Dec 26 02:00:21 router1 pengine: [4724]: info: unpack_rsc_op: failover-gw_monitor_0 on router2 returned 5 (not installed) instead of the expected value: 7 (not running)
> >>>
> >>> (...)
> >>>
> >>> The rest of the logs is not needed. Just the first line tells you
> >>> that something is not installed correctly. Please read the lines just
> >>> above this line. Normally they tell you what is missing.
> >>>
> >>> You should also read through the Route resource agent in
> >>> /usr/lib/ocf/resource.d/heartbeat/Route
> >>>
> >>> Greetings,
> >>
> >> Hmmm..
> >> I'm not seeing anything about it. Here's a clip of the lines above, and
> >> one line below the one saying (not installed).
> >>
> >> Dec 26 05:00:21 router1 pengine: [4724]: info: determine_online_status: Node router1 is online
> >> Dec 26 05:00:21 router1 pengine: [4724]: info: unpack_rsc_op: failover-gw_monitor_0 on router1 returned 0 (ok) instead of the expected value: 7 (not running)
> >> Dec 26 05:00:21 router1 pengine: [4724]: WARN: unpack_rsc_op: Operation failover-gw_monitor_0 found resource failover-gw active on router1
> >> Dec 26 05:00:21 router1 pengine: [4724]: info: determine_online_status: Node router2 is online
> >> Dec 26 05:00:21 router1 pengine: [4724]: info: unpack_rsc_op: failover-gw_monitor_0 on router2 returned 5 (not installed) instead of the expected value: 7 (not running)
> >> Dec 26 05:00:21 router1 pengine: [4724]: ERROR: unpack_rsc_op: Hard error - failover-gw_monitor_0 failed with rc=5: Preventing failover-gw from re-starting on router2
> >
> > Hi,
> >
> > there must be other log entries. In the Route RA that I have, the agent
> > writes its reasons to ocf_log() before it errors out. What version of
> > pacemaker and cluster-glue do you have? What distribution are you
> > running on?
> >
> > Greetings,
>
> I've checked all my logs. Syslog logs everything to my messages logfile,
> so it should be there if anywhere.
>
> I'm running OpenSUSE 11.2 which comes with heartbeat 2.99.3, pacemaker
> 1.0.1, openais 0.80.3, as to what all's running in this setup.

Hm. This is already quite an old version of pacemaker, but it should run anyway.

Could you please check the resource manually on router1:

export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_destination="0.0.0.0/0"
export OCF_RESKEY_gateway="24.227.124.157"

/usr/lib/ocf/resource.d/heartbeat/Route monitor; echo $?
should result in 0 (started) or 7 (not started)

/usr/lib/ocf/resource.d/heartbeat/Route start; echo $?
should add the default route and result in 0

/usr/lib/ocf/resource.d/heartbeat/Route monitor; echo $?
should result in 0 (started)

/usr/lib/ocf/resource.d/heartbeat/Route stop; echo $?
should delete the default route and result in 0

/usr/lib/ocf/resource.d/heartbeat/Route monitor; echo $?
should result in 7 (not started)

If this does not work as expected, are there any error messages? Please see if you can debug the Route script.

Greetings,

--
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75
mail: mi...@multinet.de
web: www.multinet.de

Registered office: 85630 Grasbrunn
Commercial register: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
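
P.S. The manual check above (monitor, start, monitor, stop, monitor) could be wrapped in a small script along the following lines. This is only a sketch under assumptions: the function names ocf_rc_name, run_action and check_route_agent, and the AGENT variable, are made up for illustration and are not part of pacemaker or the Route agent. The numeric codes 0 to 7 are the standard OCF return codes, which is where "5 (not installed)" in the log comes from. Be aware that the start and stop actions really add and remove the default route, so this should not be run on a router that is carrying traffic.

```shell
#!/bin/sh
# Sketch only: run an OCF resource agent by hand and name its exit code.
# AGENT is a hypothetical variable; its default is the path from this thread.
AGENT="${AGENT:-/usr/lib/ocf/resource.d/heartbeat/Route}"

# Translate a standard OCF exit code into its symbolic name.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run one agent action, print "<action>: rc=<n> (<name>)" and
# hand the agent's exit code back to the caller.
run_action() {
    "$AGENT" "$1"
    rc=$?
    echo "$1: rc=$rc ($(ocf_rc_name "$rc"))"
    return "$rc"
}

# Walk through the same sequence as the manual test.
# Parameter values are the ones from the posted configuration.
check_route_agent() {
    export OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"
    export OCF_RESKEY_destination="0.0.0.0/0"
    export OCF_RESKEY_gateway="24.227.124.157"
    for action in monitor start monitor stop monitor; do
        # Keep going after a non-zero code so every step is reported.
        run_action "$action" || true
    done
}
```

Calling check_route_agent as root on each node in turn would print one line per step, so a node that answers 5 on the very first monitor stands out immediately. If the agent on your system also needs the device parameter from the crm configuration, an extra export of OCF_RESKEY_device="net0" would presumably be required; I have left it out because the manual test above does not set it.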