Hi,

I'm not sure whether this is really a bug (maybe already widely known and even fixed in more recent versions) or simply a misconfiguration caused by my lack of knowledge and experience (I'm still quite new to HA computing). Either way, I have issues with how Pacemaker handles the order constraints I defined. I can't get rid of these issues and can only partially work around them, and such workarounds don't really seem "as intended/designed" to me.

All constraint chains work as they should when starting the nodes / cluster or switching them online, and when stopping them or switching them to standby. They also work when I directly stop the fundamental resources that form the base of my constraint chains, the DRBD and DLM resources. The problem only shows up when one of these fundamental resources fails.

For example, when a failure occurs in the DRBD resource for MySQL's DataDir, the cluster should first stop the MySQL resource group (MySQL + IP address), then stop the MySQL mount resource, then demote and finally stop the DRBD resource. But when I test the cluster's behaviour for such a failure via "crm_resource -F -r drbdMysql:0 -H nde28", the cluster first tries to demote the DRBD resource, then immediately tries to stop it as well, and only afterwards stops the MySQL IP, the MySQL mount and finally MySQL. The result of such a test isn't hard to guess: because the demote and stop of the DRBD resource fail, the DRBD resource is left in "started (unmanaged) failed", while the rest of the involved resources are stopped.

I'm running Pacemaker 1.0.6, as shipped with SLES 11 with HAE, both kept up to date from the official update repositories (due to company directives). SLES 11 SP1 is due to be released in a few days, and I hope it brings more recent versions of Pacemaker, DRBD (I still have to run 8.2.7) and other HA-cluster-related packages.

I have also posted about these issues in Novell's support forum, with many more details:
http://forums.novell.com/novell-product-support-forums/suse-linux-enterprise-server-sles/sles-configure-administer/411152-constraint-issues-upon-failure-drbd-resource-suse-linux-enterprise-hae-11-a.html

So I'm wondering:
1) Don't order constraints, once defined, also implicitly apply in exactly inverted form when stopping resources?
2) After my testing of workarounds: when the fundamental resources "fail", why do order constraints on an MS resource's stop action (seem to) have an effect, but neither those on an MS resource's demote action nor those on a (primitive's?) clone's stop action? Or is that only because the MS resource's stop action is the second command anyway, and therefore merely happens to follow my additional constraint?


Current constraints:
colocation TEST_colocO2cb inf: cloneO2cb cloneDlm
colocation colocGrpMysql inf: grpMysql cloneMountMysql
colocation colocMountMysql_drbd inf: cloneMountMysql msDrbdMysql:Master
colocation colocMountMysql_o2cb inf: cloneMountMysql cloneO2cb
colocation colocMountOpencms_drbd inf: cloneMountOpencms msDrbdOpencms:Master
colocation colocMountOpencms_o2cb inf: cloneMountOpencms cloneO2cb
colocation colocTomcat inf: cloneTomcat cloneMountOpencms:Started
order TEST_orderO2cb 0: cloneDlm cloneO2cb
order orderGrpMysql 0: cloneMountMysql:start grpMysql
order orderMountMysql_drbd 0: msDrbdMysql:promote cloneMountMysql:start
order orderMountMysql_o2cb 0: cloneO2cb cloneMountMysql
order orderMountOpencms_drbd 0: msDrbdOpencms:promote cloneMountOpencms:start
order orderMountOpencms_o2cb 0: cloneO2cb cloneMountOpencms
order orderTomcat 0: cloneMountOpencms:start cloneTomcat
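
To make clear what I mean by "inverted" in question 1, here is a sketch of the stop-side ordering I would expect the constraints above to imply for the MySQL chain. The ids (EXPECT_*) are made up purely for illustration, and these lines are not part of my running configuration:

```
order EXPECT_stopGrpMysql 0: grpMysql:stop cloneMountMysql:stop
order EXPECT_demoteDrbdMysql 0: cloneMountMysql:stop msDrbdMysql:demote
```

In other words: the group stops before the mount, and the mount stops before the DRBD resource is demoted (and only then stopped).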

Constraints added to "work around" at least the DRBD resources being left in state "started (unmanaged) failed":
order GNAH_orderDrbdMysql_stop 0: cloneMountMysql:stop msDrbdMysql:stop
order GNAH_orderDrbdOpencms_stop 0: cloneMountOpencms:stop msDrbdOpencms:stop

(I also tried similar constraints for msDrbd*:demote and cloneDlm:stop, but neither seemed to have any effect.)

