Re: [Pacemaker] Error starting Apache on 2 nodes cluster

Luke Bigum Wed, 18 Nov 2009 16:41:44 -0800

Angie,

I can't tell exactly what you've provided. Can you post your CRM 
configuration (the output of 'crm configure show')? While you're at it, also 
provide the output of 'crm_verify -LV' and 'crm_mon -fo1'.


This looks suspicious though:

Nov 19 01:25:08 test2 crmd: [24251]: info: process_lrm_event: LRM operation 
WebServer_monitor_60000 (call=483, rc=-2, cib-update=0, confirmed=true) 
Cancelled unknown exec error

Personally I'd start with the OCF RA and leave lsb:httpd alone. From the above 
error message, something inside lsb:httpd is returning -2, which is not a 
supported return code.
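
For reference, the exit codes defined for OCF resource agents are small 
non-negative integers, so a negative value never maps to one of them. Here is 
a rough sketch of a decoder you could paste into a shell while debugging. The 
function name ocf_rc_name is made up for illustration and is not part of 
Pacemaker or the resource-agents package:

```shell
#!/bin/sh
# Sketch: translate a numeric exit code into its OCF name.
# ocf_rc_name is a hypothetical helper, not a Pacemaker command.
ocf_rc_name() {
    case "$1" in
        0) printf '%s\n' "OCF_SUCCESS" ;;
        1) printf '%s\n' "OCF_ERR_GENERIC" ;;
        2) printf '%s\n' "OCF_ERR_ARGS" ;;
        3) printf '%s\n' "OCF_ERR_UNIMPLEMENTED" ;;
        4) printf '%s\n' "OCF_ERR_PERM" ;;
        5) printf '%s\n' "OCF_ERR_INSTALLED" ;;
        6) printf '%s\n' "OCF_ERR_CONFIGURED" ;;
        7) printf '%s\n' "OCF_NOT_RUNNING" ;;
        8) printf '%s\n' "OCF_RUNNING_MASTER" ;;
        9) printf '%s\n' "OCF_FAILED_MASTER" ;;
        # Anything else, including negative values, is outside the OCF set.
        *) printf 'not an OCF return code (%s)\n' "$1" ;;
    esac
}
```

For example, 'ocf_rc_name 1' prints OCF_ERR_GENERIC, which is what crm_mon 
reports as "unknown error".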

Depending on how confident you are with shell scripts, you might find it 
helpful to eliminate Pacemaker from the equation and call the Resource Agent 
script yourself to debug problems manually, like so...

Disable your resource so Pacemaker doesn't interfere:

crm_resource -r WebSite -m -p target-role -v stopped

Then move into the RA directory and set a necessary environment variable:

cd /usr/lib/ocf/resource.d/heartbeat
export OCF_ROOT=/usr/lib/ocf

Start testing the apache RA, setting the only mandatory environment variable 
for ocf:heartbeat:apache :

export OCF_RESKEY_configfile=/path/to/your/main/apache/config
./apache start
echo $?

That should echo "0" for success. Judging by your logs, you can start Apache 
but the monitor is failing:

./apache monitor
echo $?

If that doesn't echo "0", you might get a helpful error message explaining 
what's wrong. You might have to read through the apache script itself to figure 
out why it's failing. Finally test the 'stop' operation:

./apache stop
echo $?

Should echo "0" as well. If this all works for you, but the resource in 
Pacemaker is still not working, then it's probably something in your CIB (like 
a bad attribute), as you've just done pretty much exactly what Pacemaker will 
do.
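
If you end up repeating these steps a lot, they could be wrapped in a small 
shell function. This is only a sketch: run_ra_ops is a hypothetical helper 
name (not something shipped with Pacemaker), and it assumes you have already 
exported OCF_ROOT and any OCF_RESKEY_* variables the agent needs, as in the 
steps above.

```shell
#!/bin/sh
# Sketch: run an OCF resource agent's start, monitor and stop actions by
# hand and report each exit code. run_ra_ops is a hypothetical helper.
#
# Usage: run_ra_ops <ra-directory> <ra-name>
run_ra_ops() {
    ra_dir=$1      # e.g. /usr/lib/ocf/resource.d/heartbeat
    ra_name=$2     # e.g. apache
    failed=0
    for op in start monitor stop; do
        # Run in a subshell so the caller's working directory is untouched.
        ( cd "$ra_dir" && "./$ra_name" "$op" )
        rc=$?
        echo "$ra_name $op: rc=$rc"
        # In this sequence every action is expected to return 0.
        [ "$rc" -eq 0 ] || failed=1
    done
    return "$failed"
}

# Example invocation (paths taken from the steps above; adjust as needed):
#   export OCF_ROOT=/usr/lib/ocf
#   export OCF_RESKEY_configfile=/path/to/your/main/apache/config
#   run_ra_ops /usr/lib/ocf/resource.d/heartbeat apache
```

The function returns non-zero if any of the three actions failed, so the 
first non-zero "rc=" line in its output tells you which operation to dig into.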

Let us know how you go.

Luke Bigum
Systems Administrator
 (p) 1300 661 668
 (f)  1300 661 540
 (e)  lbi...@iseek.com.au
http://www.iseek.com.au
Level 1, 100 Ipswich Road Woolloongabba QLD 4102


From: Angie T. Muhammad [mailto:angie.taw...@gmail.com]
Sent: Thursday 19 November 2009 9:57 AM
To: pacemaker@oss.clusterlabs.org
Subject: [Pacemaker] Error starting Apache on 2 nodes cluster

Hello
I'm a pacemaker and openais beginner.
I followed the document 'cluster from scratch' and I successfully managed to 
create and monitor a 'ClusterIP' and 'LoadBalancer' resources.

But whenever I try to start Apache:
# crm configure primitive WebSite ocf:heartbeat:apache params 
configfile=/etc/httpd/conf/httpd.conf op monitor interval=1min

whether using (ocf:heartbeat:apache) or (lsb:httpd), I get the following errors 
when watching crm_mon:

============
Last updated: Thu Nov 19 01:38:33 2009
Stack: openais
Current DC: test1.localdomain - partition with quorum
Version: 1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7
2 Nodes configured, 2 expected votes
3 Resources configured.
============

Online: [ test1.localdomain test2.localdomain ]

ClusterIP       (ocf::heartbeat:IPaddr2):       Started test1.localdomain
LoadBalancer    (lsb:haproxy):  Started test1.localdomain

Failed actions:
    WebSite_start_0 (node=test1.localdomain, call=9, rc=1, status=complete): 
unknown error
    WebSite_start_0 (node=test2.localdomain, call=5, rc=1, status=complete): 
unknown error
/************************************************************************************************************/

Knowing that I am using:
CentOS 5.4..
openais-0.80.5-15.1
pacemaker-1.0.5-4.1
# chkconfig httpd off
server-status is not enabled in my httpd.conf ...

I always check apache processes before configuring my crm using:

# ps aux | grep httpd
/* to make sure there are no zombie processes */

# /etc/init.d/httpd status
/* to guarantee it's stopped and nothing is locked */

Last but not least, I am attaching the last 100 lines of /var/log/messages from 
the 2nd node to help you help me.
I have been stuck in this loop for four days now and I have no idea why the crm 
can't start Apache, even though everything runs smoothly when I start it manually!

Thank you in advance
--
All the best,
Angie


_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
