----- Original Message -----
> From: "Julien Cornuwel" <cornu...@gmail.com>
> To: pacemaker@oss.clusterlabs.org
> Sent: Wednesday, July 25, 2012 5:51:28 AM
> Subject: Re: [Pacemaker] Trouble with ocf:Squid resource agent
>
> Oops! Spoke too fast. The fix below allows squid to start. But the
> script also has problems in the 'stop' part. It is stuck in an
> infinite loop and here are the logs (repeats every second) :
>
> Jul 25 11:38:47 corsen-a lrmd: [24099]: info: RA output:
> (Proxy:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/Squid: line
> 320: kill: -: arguments must be process or job IDs
> Jul 25 11:38:47 corsen-a lrmd: [24099]: info: RA output:
> (Proxy:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/Squid: line
> 320: kill: -: arguments must be process or job IDs
> Jul 25 11:38:48 corsen-a Squid(Proxy)[24659]: [25682]: INFO:
> squid:stop_squid:318: try to stop by SIGKILL: -
> Jul 25 11:38:48 corsen-a Squid(Proxy)[24659]: [25682]: INFO:
> squid:stop_squid:318: try to stop by SIGKILL: -
>
> Being on a deadline, I'll use the lsb script for the moment. If
> someone figures out how to use this ocf script, I'm very interested.
>
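
The log lines quoted above show kill being handed a bare "-" instead of a PID, which would be consistent with the PID list being empty when the SIGKILL branch ran. A minimal sketch of one way to guard against that, assuming a hypothetical helper named safe_kill (it is not part of the Squid resource agent) that only signals arguments that look like numeric PIDs:

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the Squid resource agent:
# send SIGKILL only to arguments that look like numeric PIDs.
safe_kill() {
    local pid rc=1
    for pid in "$@"; do
        case "$pid" in
            # Skip empty strings, "-", and anything else non-numeric.
            ''|*[!0-9]*) continue ;;
        esac
        # Remember success if at least one signal was delivered.
        if kill -KILL "$pid" 2>/dev/null; then
            rc=0
        fi
    done
    return $rc
}
```

A caller could then treat a non-zero return from something like `safe_kill "${SQUID_PIDS[0]}" "${SQUID_PIDS[2]}"` as "nothing to kill" and leave the loop, rather than retrying every second with the same empty arguments.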
I took a quick look at the OCF... here's the stop section with inline
comments from me (###):

stop_squid()
{
    typeset lapse_sec

    if ocf_run $SQUID_EXE -f $SQUID_CONF -k shutdown; then
        lapse_sec=0
        while true; do
            get_pids
            if is_squid_dead; then
                rm -f $SQUID_PIDFILE
                return $OCF_SUCCESS
            fi
            (( lapse_sec = lapse_sec + 1 ))
            if (( lapse_sec > SQUID_STOP_TIMEOUT )); then
                ### It looks to me like you're hitting the line above, which
                ### then breaks out and drops down to the second "while true"
                ### loop further down. I would time a manual stop of squid (I
                ### know it takes quite a while) and make sure your
                ### primitive's op stop interval="0" timeout="120s" is set
                ### high enough (definitely more than 120s, I would assume)
                ### that the elapsed time to stop squid doesn't normally
                ### exceed the timeout value.
                break
            fi
            sleep 1
            ocf_log info "$SQUID_NAME:$FUNCNAME:$LINENO: " \
                "stop NORM $lapse_sec/$SQUID_STOP_TIMEOUT"
        done
    fi

    while true; do
        get_pids
        ocf_log info "$SQUID_NAME:$FUNCNAME:$LINENO: " \
            "try to stop by SIGKILL:${SQUID_PIDS[0]} ${SQUID_PIDS[2]}"
        kill -KILL ${SQUID_PIDS[0]} ${SQUID_PIDS[2]}
        ### Have you tried running the above line manually (inserting the
        ### correct PIDs, of course) to see what you get? Maybe the
        ### kill -KILL syntax is invalid for your flavor of Linux and the
        ### OCF needs to be updated to take that into account when running
        ### the kill command? Even if you increase the timeout above to a
        ### normally reasonable value, you still want it to be able to kill
        ### squid if it is unresponsive!
        sleep 1
        if is_squid_dead; then
            rm -f $SQUID_PIDFILE
            return $OCF_SUCCESS
        fi
    done

    return $OCF_ERR_GENERIC
}

> Regards
>
>
> 2012/7/24 Julien Cornuwel <cornu...@gmail.com>:
> > Hi,
> >
> > Fixed! The problem comes from the squid ocf script
> > (/usr/lib/ocf/resource.d/heartbeat/Squid) that doesn't handle IPv6
> > addresses correctly.
> > All you have to do is modify the line 198 as such :
> > awk '/(tcp.*[0-9]+\.[0-9]+\.+[0-9]+\.[0-9]+:'$SQUID_PORT'
> > |tcp.*:::'$SQUID_PORT' )/{
> >
> > Source:
> > http://www.n3oxid.fr/index.php?post/2012/04/07/Installation-et-configuration-d-un-cluster-Pacemaker/CoroSync-sous-GNU/Linux-Debian-6-%28Squeeze%29
> >

Not sure if the above fully patches the OCF for squid ipv4 and ipv6 but I
would recommend submitting a patch against the resource agent so in the
future it just works ;-)

HTH

Jake

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
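
P.S. One way to check how far the quoted awk pattern goes before submitting a patch is to run it against a few listener lines. The sketch below is only an illustration: the function name count_listeners, the port value 3128, and the sample lines (written in the style of `netstat -tln` output) are all made up for the test and are not taken from the resource agent.

```shell
#!/bin/sh
# Hypothetical check, NOT part of the resource agent: count how many
# sample listener lines the modified pattern matches.
SQUID_PORT=3128   # assumed port, for illustration only

count_listeners() {
    # Same pattern as the quoted line-198 change: dotted-quad IPv4
    # listeners, or the IPv6 wildcard ":::", followed by the port
    # and a space.
    awk '/(tcp.*[0-9]+\.[0-9]+\.+[0-9]+\.[0-9]+:'"$SQUID_PORT"' |tcp.*:::'"$SQUID_PORT"' )/ { n++ } END { print n + 0 }'
}

# Made-up sample lines: IPv4 wildcard, IPv6 wildcard, a different
# port that merely starts with 3128, and an IPv6 loopback listener.
sample='tcp 0 0 0.0.0.0:3128 0.0.0.0:* LISTEN
tcp6 0 0 :::3128 :::* LISTEN
tcp 0 0 0.0.0.0:31280 0.0.0.0:* LISTEN
tcp6 0 0 ::1:3128 :::* LISTEN'

printf '%s\n' "$sample" | count_listeners
```

With these sample lines the count is 2: the two wildcard listeners match, port 31280 is correctly ignored, and the `::1:3128` line is not matched. That last case suggests a squid bound to a specific IPv6 address might still go undetected, so it may be worth covering in any patch.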