[Pacemaker] Issues with constraints - working for start/stop, being ignored on "failures"

Cnut Jansen Sun, 30 May 2010 20:22:43 -0700

Hi,

I'm not sure whether this is really some kind of bug (maybe already widely known and even already fixed in more recent versions) or simply misconfiguration and a lack of knowledge and experience on my part (I'm still quite new to HA computing), but I have issues with the order constraints I defined in Pacemaker. I can't get rid of them and can only partially "work around" them, and such workarounds don't really seem "as intended/designed" to me...

The problem is this: upon starting / switching to online and stopping / switching to standby the nodes / cluster, all constraint chains work as they should. They even do so when I directly stop the troublesome fundamental resources, the DRBD and DLM resources, which are the bases of my constraint chains. But they appear to be ignored when one of those fundamental resources fails.

For example, when a failure occurs in the DRBD resource for MySQL's DataDir, the cluster should first stop the MySQL resource group (MySQL + IP address), then stop the MySQL mount resource, then demote and finally stop the DRBD resource. But when I try to test the cluster's behaviour upon such a failure via "crm_resource -F -r drbdMysql:0 -H nde28", the cluster first tries to demote the DRBD resource, then immediately also to stop it, then stops the MySQL IP, the MySQL mount and only finally MySQL.

The result of such a test isn't hard to guess, since both the demote and the stop of the DRBD resource fail: the DRBD resource is left in "started (unmanaged) failed", and the rest of the involved resources are stopped.

I'm running Pacemaker 1.0.6, delivered with and running on SLES 11 with HAE, both kept up to date from the official update repositories (due to company directives). SLES 11 SP1 is due to be released in a few days, and I hope it will bring more recent versions of Pacemaker, DRBD (I still have to run 8.2.7) and other HA-cluster-related stuff.

I have also already posted about these issues in Novell's support forum, with lots more details:

http://forums.novell.com/novell-product-support-forums/suse-linux-enterprise-server-sles/sles-configure-administer/411152-constraint-issues-upon-failure-drbd-resource-suse-linux-enterprise-hae-11-a.html

So I'm wondering:

1) Aren't constraint chains, once defined, also implicitly defined in exactly inverted order for stopping resources?

2) After my testing for workarounds: Why do (or seem to do), in the case of the "failing" fundamental resources, order constraints for the MS resources' stop action have an effect, but neither those for the MS resources' demote action nor those for the (primitives'/?) clones' stop action? Or is that just because the MS resources' stop action is only the second command anyway, and therefore merely happens to follow my additional constraint?!



Current constraints:
colocation TEST_colocO2cb inf: cloneO2cb cloneDlm
colocation colocGrpMysql inf: grpMysql cloneMountMysql
colocation colocMountMysql_drbd inf: cloneMountMysql msDrbdMysql:Master
colocation colocMountMysql_o2cb inf: cloneMountMysql cloneO2cb

colocation colocMountOpencms_drbd inf: cloneMountOpencms msDrbdOpencms:Master

colocation colocMountOpencms_o2cb inf: cloneMountOpencms cloneO2cb
colocation colocTomcat inf: cloneTomcat cloneMountOpencms:Started
order TEST_orderO2cb 0: cloneDlm cloneO2cb
order orderGrpMysql 0: cloneMountMysql:start grpMysql
order orderMountMysql_drbd 0: msDrbdMysql:promote cloneMountMysql:start
order orderMountMysql_o2cb 0: cloneO2cb cloneMountMysql

order orderMountOpencms_drbd 0: msDrbdOpencms:promote cloneMountOpencms:start

order orderMountOpencms_o2cb 0: cloneO2cb cloneMountOpencms
order orderTomcat 0: cloneMountOpencms:start cloneTomcat
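
To make the expected behaviour concrete, here is a small Python sketch of the stop sequence I would assume for the MySQL chain if the order constraints above (orderMountMysql_drbd and orderGrpMysql) were implicitly inverted. It is only an illustration of my expectation, not of how Pacemaker actually computes its transitions; the names START_CHAIN, INVERSE and expected_stop_sequence are made up for this example, and treating the action-less grpMysql entry as "start" is my assumption.

```python
# Illustration only: models the *expected* inversion of the MySQL order
# chain. This is not Pacemaker code and not how the policy engine works.

# Start-side chain taken from orderMountMysql_drbd and orderGrpMysql.
# Assumption: a resource listed without an action means "start".
START_CHAIN = [
    ("msDrbdMysql", "promote"),
    ("cloneMountMysql", "start"),
    ("grpMysql", "start"),
]

# Each start-side action and its stop-side counterpart.
INVERSE = {"start": "stop", "promote": "demote"}


def expected_stop_sequence(chain):
    """Reverse the chain and invert every action.

    A master/slave resource is additionally stopped right after its demote.
    """
    sequence = []
    for resource, action in reversed(chain):
        inverted = INVERSE[action]
        sequence.append((resource, inverted))
        if inverted == "demote":
            sequence.append((resource, "stop"))
    return sequence


if __name__ == "__main__":
    for resource, action in expected_stop_sequence(START_CHAIN):
        print(f"{action} {resource}")
```

Run as a script, this lists the stop of grpMysql first, then the stop of cloneMountMysql, and only then the demote and stop of msDrbdMysql, which is the order I expect on a failure but do not get.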

Constraints added to "work around" at least the DRBD resources being left in state "started (unmanaged) failed":

order GNAH_orderDrbdMysql_stop 0: cloneMountMysql:stop msDrbdMysql:stop
order GNAH_orderDrbdOpencms_stop 0: cloneMountOpencms:stop msDrbdOpencms:stop

(I also tried similar constraints for msDrbd*:demote and cloneDlm:stop, but neither seemed to have an effect.)



_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
