----- Original Message -----
> From: "Parshvi" <parshvi...@gmail.com>
> To: pacema...@clusterlabs.org
> Sent: Thursday, September 6, 2012 2:39:10 AM
> Subject: [Pacemaker] Upgrading to Pacemaker 1.1.7. Issue: sticky resources failing back after reboot
> 
> Hi,
> We have upgraded Pacemaker from version 1.0.12 to 1.1.7.
> The upgrade was done because resources failed to recover after an
> operation timeout (monitor|stop[unmanaged]); the logs observed are:
> 
> WARN: print_graph: Synapse 6 is pending (priority: 0)
> Sep 03 16:55:18 CSS-FU-2 crmd: [25200]: WARN: print_elem: [Action 103]: Pending (id: SnmpAgent_monitor_5000, loc: CSS-FU-2, priority: 0)
> Sep 03 16:55:18 CSS-FU-2 crmd: [25200]: WARN: print_elem: * [Input 102]: Pending (id: SnmpAgent_start_0, loc: CSS-FU-2, priority: 0)
> Sep 03 16:55:18 CSS-FU-2 crmd: [25200]: WARN: print_graph: Synapse 7 is pending (priority: 0)
> 
> Reading through the forum mails, it was inferred that this issue is
> fixed in 1.1.7.
> 
> Platform OS: OEL 5.8
> Pacemaker Version: 1.1.7
> Corosync version: 1.4.3
> 
> Pacemaker and all its dependent packages were built from source
> (tarball:github).
> glib version used for build: 2.32.2
> 
> The following issue is observed in Pacemaker 1.1.7:
> 1) There is a two-node cluster.
> 2) When the primary node is rebooted (or Pacemaker is restarted), the
> resources fail over to the secondary.
> 3) There are 4 groups of services:
>    2 groups are not sticky,
>    1 group is a master/slave (multi-state) resource,
>    1 group is STICKY.
> 4) When the primary node comes back online, even the sticky resources
> fail back to the primary node (the issue).
> 5) Now, if the secondary node is rebooted, the resources fail over to
> the primary node.
> 6) Once the secondary node is up, only the non-sticky resources fail
> back; the sticky resources remain on the primary node.
> 
> 7) Even if a location preference for the sticky resources is set for
> Node-2 (the secondary node), the sticky resources still fail back to
> Node-1.
> 
> We're using Pacemaker 1.0.12 in production. There we're facing issues
> where the monitor operation of IPaddr and other resources times out and
> Pacemaker does not recover from it (logs shared above).
> 
> Any help is welcome.
> 
> PS: Please mention if any logs or configuration need to be shared.

My guess is that this is an issue with node scores for the resources in 
question.  Stickiness and location constraints work in a similar way.  You 
could really think of resource stickiness as a temporary location constraint on 
a resource that changes depending on what node it is on.
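
For example, a rough sketch in crm shell syntax (the resource name, agent and
scores below are made up for illustration, not taken from your setup):

    # cluster-wide default stickiness for every resource
    crm configure rsc_defaults resource-stickiness=100

    # or per resource, via a meta attribute
    crm configure primitive dummy1 ocf:pacemaker:Dummy \
        meta resource-stickiness=200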

If you have a resource with stickiness enabled and you want the resource to 
stay put, the stickiness score has to outweigh all the location constraints 
for that resource on other nodes.  If you are using colocation constraints, 
this becomes increasingly complicated, as a resource's per-node location score 
can change based on the location of another resource.
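
To illustrate with made-up names and scores (again only a sketch):

    # dummy1 prefers node2, but only with score 50
    crm configure location loc-dummy1-node2 dummy1 50: node2

    # with resource-stickiness=100 the stickiness outweighs the location
    # score of 50, so dummy1 stays put when node2 comes back; an INFINITY
    # location score would always win and force a failback

    # colocating dummy1 with dummy2 means dummy1's effective score on each
    # node now also depends on where dummy2 prefers (or is allowed) to run
    crm configure colocation dummy1-with-dummy2 inf: dummy1 dummy2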

For specific advice on your scenario, there is little we can offer without 
seeing your exact configuration.
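
If you do post it, something along these lines should capture what is needed
(assuming the crm shell and crm_report are available on your build):

    # human-readable configuration
    crm configure show > config.txt

    # raw CIB XML, if you prefer
    cibadmin --query > cib.xml

    # logs plus configuration around the time of the unwanted failback
    crm_report -f "2012-09-03 16:00" -t "2012-09-03 17:00" failback-report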

-- Vossel


> Thanks,
> Parshvi
> 
> 
> 

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
