On 01.10.11 04:53, Serge Dubrouski wrote:
> Technically, I don't want the cluster to control the service in the
> sense of starting and stopping. The cluster controls the IP addresses
> and moves them between nodes. The dns service resource is supposed to
> provide a check that the dns service is working on the node and migrate
> the service and, most importantly, the IP address if it becomes
> unresponsive.
>
> I didn't look at the concept of clones, yet. Maybe I took a completely
> wrong approach to what I am trying to do.
>
>
> I think that clones are a really good solution for this situation. You
> can configure BIND as a clone service with different configuration though.
> One node will be master, another slave. You can also have a floating VIP
> tied to any of the nodes but colocated with the running BIND. If BIND
> dies for some reason, pacemaker will move your IP to the surviving node.
> You can add sending additional alarms.

Thanks a lot! Just learned a couple of things.

I have removed my own script, installed yours and set it up. Then I
configured a clone:

  primitive bind ocf:heartbeat:named ...
  clone bind-clone bind

Now bind is kept running on all nodes and is only shut down if it
fails. If necessary, named is restarted. Great.

Then I colocate my IP resources with the clone:

  colocation ns1-ip-bind inf: nsi1-ip bind-clone
  colocation ns2-ip-bind inf: nsi2-ip bind-clone

Thus the service IP addresses only run on nodes where bind is active.
If bind fails on a node, the IP address is moved.

Two notes (regarding the latest version on github):

1. You expect rndc and host to be in $PATH, while the path to named can
   be configured. To be consistent, the same should apply to rndc and
   host, as they are bind utilities too. On our CentOS servers we run
   the latest version of bind, compiled from source and installed in a
   custom path which is added in /etc/profile. For some reason
   /etc/profile doesn't seem to apply to the ocf scripts, so the script
   doesn't find rndc or host unless I extend PATH manually at the
   beginning of the script.

2. In the stop function you call "rndc stop" to stop the daemon.
   However, if the daemon hangs, rndc will hang as well. Pacemaker then
   runs into a timeout and kills the ocf script, leading to a failed
   stop. I think the ocf script should have its own timeout, abort the
   rndc call if it takes too long, and then try to kill the server. To
   test, send a STOP signal to named and wait...

But otherwise, great script. Thanks!

Gerald

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
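
P.S. A rough sketch of what I mean in note 2, in plain POSIX shell. The
helper names (run_with_timeout, wait_gone, stop_daemon) are made up for
illustration and are not part of the named resource agent; the agent
would pass in its own pid and its own "rndc stop" call. Treat it as one
possible approach, not a finished patch.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the ocf:heartbeat:named agent.

# Run a command, but give up after $1 seconds.
# Returns the command's exit status, or 124 if it had to be killed.
run_with_timeout() {
    rwt_limit=$1; shift
    "$@" &
    rwt_pid=$!
    rwt_waited=0
    while kill -0 "$rwt_pid" 2>/dev/null; do
        if [ "$rwt_waited" -ge "$rwt_limit" ]; then
            kill -KILL "$rwt_pid" 2>/dev/null
            wait "$rwt_pid" 2>/dev/null
            return 124
        fi
        sleep 1
        rwt_waited=$((rwt_waited + 1))
    done
    wait "$rwt_pid"
}

# Wait up to $2 seconds for process $1 to disappear.
# Returns 0 if it is gone, 1 if it is still there.
wait_gone() {
    wg_pid=$1; wg_left=$2
    while kill -0 "$wg_pid" 2>/dev/null; do
        [ "$wg_left" -le 0 ] && return 1
        sleep 1
        wg_left=$((wg_left - 1))
    done
    return 0
}

# Try the graceful stop command first, then escalate to TERM, then KILL.
# $1 = daemon pid, $2 = per-step timeout in seconds, rest = stop command.
stop_daemon() {
    sd_pid=$1; sd_limit=$2; shift 2
    run_with_timeout "$sd_limit" "$@" &&
        wait_gone "$sd_pid" "$sd_limit" && return 0
    kill -TERM "$sd_pid" 2>/dev/null || true
    wait_gone "$sd_pid" "$sd_limit" && return 0
    kill -KILL "$sd_pid" 2>/dev/null || true
    wait_gone "$sd_pid" "$sd_limit"
}
```

In the agent this might be called as something like
stop_daemon "$pid" 10 rndc stop, with the timeout chosen well below the
stop timeout configured in pacemaker. The final KILL step matters for
the test I described: a process halted with SIGSTOP does not act on
TERM until it is continued, but KILL still takes effect.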