The pgsql OCF RA doesn't support a multi-state (master/slave) configuration, so I don't think creating a clone would be a good idea.
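As a rough sketch of the alternative (two separate pgsql resources, each bound to one node), the snippet below prints location constraints in what I understand to be the Heartbeat 2.x CIB syntax. The resource IDs pgsql_1 and pgsql_2 and the helper pin_resource are hypothetical names chosen for illustration; the node names are the ones from the "node" line of your ha.cf. Please verify the XML against your own CIB before loading anything.

```shell
#!/bin/sh
# Sketch only: print a location constraint that forbids a resource from
# running anywhere except one node. Resource ids are hypothetical.

pin_resource() {
    rsc="$1"     # resource id (hypothetical, e.g. pgsql_1)
    node="$2"    # node name, must match `uname -n` on that node
    cat <<END
<rsc_location id="loc_${rsc}" rsc="${rsc}">
  <rule id="loc_${rsc}_rule" score="-INFINITY">
    <expression id="loc_${rsc}_expr" attribute="#uname" operation="ne" value="${node}"/>
  </rule>
</rsc_location>
END
}

# One constraint per PostgreSQL instance, one instance per node.
pin_resource pgsql_1 ws231.ltsp
pin_resource pgsql_2 ws232.ltsp
```

The -INFINITY score on "any node other than this one" is what keeps each instance from ever being started on the peer, which should also avoid the "multirunning" situation a clone can get into. The printed XML would go into the constraints section of the CIB, for example with cibadmin; the exact invocation depends on your Heartbeat version.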
On Feb 8, 2008 2:43 PM, Zoltan Boszormenyi <[EMAIL PROTECTED]> wrote:
> Hi,
>
> what is the most bulletproof way to set up a two-node cluster
> so both run a PostgreSQL instance?
> Is it better to create two pgsql resources, each bound to
> a certain machine, or a cloned pgsql resource?
> I would need it for replication, either with a full active/active
> system with SkyTools' Londiste or an active/passive
> replication with SkyTools' WALMgr solution.
> Occasionally I see a "multirunning" state for my resources
> no matter which way I choose, although it is triggered
> more easily with the cloned resource.
>
> Also, I have set up a virtual IP with the IPaddr OCF RA.
> Takeover takes place instantly if I put the "master" machine
> into standby mode or e.g. stop PostgreSQL manually
> with its SysV script, as I have specified a monitor action
> for it with a 2-second interval and fencing on failure.
> But I noticed that somehow IP takeover doesn't take place
> if I pull the plug on the virtual ethernet card(s).
> I tried it with two Fedora 6 systems inside VMWare.
> I have set up pingd with my host machine as the ping node.
> Services stop on the machine separated from the network, but
> the virtual IP isn't started on the still connected machine.
>
> Attached are my ha.cf, cib.xml and the referenced extra scripts.
>
> Thanks in advance and best regards,
> Zoltán Böszörményi
>
> --
> ----------------------------------
> Zoltán Böszörményi
> Cybertec Schönig & Schönig GmbH
> http://www.postgresql.at/
>
>
> #
> # There are lots of options in this file. All you have to have is a set
> # of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
> # and a value for "auto_failback".
> #
> # ATTENTION: As the configuration file is read line by line,
> #   THE ORDER OF DIRECTIVE MATTERS!
> #
> # In particular, make sure that the udpport, serial baud rate
> # etc. are set before the heartbeat media are defined!
> # debug and log file directives go into effect when they
> # are encountered.
> #
> # All will be fine if you keep them ordered as in this example.
> #
> #
> # Note on logging:
> # If any of debugfile, logfile and logfacility are defined then they
> # will be used. If debugfile and/or logfile are not defined and
> # logfacility is defined then the respective logging and debug
> # messages will be logged to syslog. If logfacility is not defined
> # then debugfile and logfile will be used to log messages. If
> # logfacility is not defined and debugfile and/or logfile are not
> # defined then defaults will be used for debugfile and logfile as
> # required and messages will be sent there.
> #
> # File to write debug messages to
> #debugfile /var/log/ha-debug
> #
> #
> # File to write other messages to
> #
> logfile /var/log/ha-log
> #
> #
> # Facility to use for syslog()/logger
> #
> #logfacility local0
> #
> #
> # A note on specifying "how long" times below...
> #
> # The default time unit is seconds
> #   10 means ten seconds
> #
> # You can also specify them in milliseconds
> #   1500ms means 1.5 seconds
> #
> #
> # keepalive: how long between heartbeats?
> #
> #keepalive 2
> #
> # deadtime: how long-to-declare-host-dead?
> #
> # If you set this too low you will get the problematic
> # split-brain (or cluster partition) problem.
> # See the FAQ for how to use warntime to tune deadtime.
> #
> #deadtime 30
> #
> # warntime: how long before issuing "late heartbeat" warning?
> # See the FAQ for how to use warntime to tune deadtime.
> #
> #warntime 10
> #
> #
> # Very first dead time (initdead)
> #
> # On some machines/OSes, etc. the network takes a while to come up
> # and start working right after you've been rebooted. As a result
> # we have a separate dead time for when things first come up.
> # It should be at least twice the normal dead time.
> #
> #initdead 120
> #
> #
> # What UDP port to use for bcast/ucast communication?
> #
> #udpport 694
> #
> # Baud rate for serial ports...
> #
> #baud 19200
> #
> # serial serialportname ...
> #serial /dev/ttyS0 # Linux
> #serial /dev/cuaa0 # FreeBSD
> #serial /dev/cuad0 # FreeBSD 6.x
> #serial /dev/cua/a # Solaris
> #
> #
> # What interfaces to broadcast heartbeats over?
> #
> #bcast eth0 # Linux
> #bcast eth1 eth2 # Linux
> #bcast le0 # Solaris
> #bcast le1 le2 # Solaris
> #
> # Set up a multicast heartbeat medium
> # mcast [dev] [mcast group] [port] [ttl] [loop]
> #
> # [dev]         device to send/rcv heartbeats on
> # [mcast group] multicast group to join (class D multicast address
> #               224.0.0.0 - 239.255.255.255)
> # [port]        udp port to sendto/rcvfrom (set this value to the
> #               same value as "udpport" above)
> # [ttl]         the ttl value for outbound heartbeats. this affects
> #               how far the multicast packet will propagate. (0-255)
> #               Must be greater than zero.
> # [loop]        toggles loopback for outbound multicast heartbeats.
> #               if enabled, an outbound packet will be looped back and
> #               received by the interface it was sent on. (0 or 1)
> #               Set this value to zero.
> #
> #
> #mcast eth0 225.0.0.1 694 1 0
> #
> # Set up a unicast / udp heartbeat medium
> # ucast [dev] [peer-ip-addr]
> #
> # [dev]          device to send/rcv heartbeats on
> # [peer-ip-addr] IP address of peer to send packets to
> #
> #ucast eth0 192.168.1.2
> #
> #
> # About boolean values...
> #
> # Any of the following case-insensitive values will work for true:
> #   true, on, yes, y, 1
> # Any of the following case-insensitive values will work for false:
> #   false, off, no, n, 0
> #
> #
> #
> # auto_failback: determines whether a resource will
> # automatically fail back to its "primary" node, or remain
> # on whatever node is serving it until that node fails, or
> # an administrator intervenes.
> #
> # The possible values for auto_failback are:
> #   on     - enable automatic failbacks
> #   off    - disable automatic failbacks
> #   legacy - enable automatic failbacks in systems
> #            where all nodes do not yet support
> #            the auto_failback option.
> #
> # auto_failback "on" and "off" are backwards compatible with the old
> # "nice_failback on" setting.
> #
> # See the FAQ for information on how to convert
> # from "legacy" to "on" without a flash cut.
> # (i.e., using a "rolling upgrade" process)
> #
> # The default value for auto_failback is "legacy", which
> # will issue a warning at startup. So, make sure you put
> # an auto_failback directive in your ha.cf file.
> # (note: auto_failback can be any boolean or "legacy")
> #
> #auto_failback on
> #
> #
> # Basic STONITH support
> # Using this directive assumes that there is one stonith
> # device in the cluster. Parameters to this device are
> # read from a configuration file. The format of this line is:
> #
> #   stonith <stonith_type> <configfile>
> #
> # NOTE: it is up to you to maintain this file on each node in the
> # cluster!
> #
> #stonith baytech /etc/ha.d/conf/stonith.baytech
> #
> # STONITH support
> # You can configure multiple stonith devices using this directive.
> # The format of the line is:
> #   stonith_host <hostfrom> <stonith_type> <params...>
> #   <hostfrom> is the machine the stonith device is attached
> #     to or * to mean it is accessible from any host.
> #   <stonith_type> is the type of stonith device (a list of
> #     supported drivers is in /usr/lib/stonith.)
> #   <params...> are driver specific parameters.
> #   To see the format for a particular device, run:
> #     stonith -l -t <stonith_type>
> #
> #
> # Note that if you put your stonith device access information in
> # here, and you make this file publicly readable, you're asking
> # for a denial of service attack ;-)
> #
> # To get a list of supported stonith devices, run
> #   stonith -L
> # For detailed information on which stonith devices are supported
> # and their detailed configuration options, run this command:
> #   stonith -h
> #
> #stonith_host * baytech 10.0.0.3 mylogin mysecretpassword
> #stonith_host ken3 rps10 /dev/ttyS1 kathy 0
> #stonith_host kathy rps10 /dev/ttyS1 ken3 0
> #
> # Watchdog is the watchdog timer. If our own heart doesn't beat for
> # a minute, then our machine will reboot.
> # NOTE: If you are using the software watchdog, you very likely
> # wish to load the module with the parameter "nowayout=0" or
> # compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
> # an orderly shutdown of heartbeat will trigger a reboot, which is
> # very likely NOT what you want.
> #
> #watchdog /dev/watchdog
> #
> # Tell what machines are in the cluster
> # node nodename ... -- must match uname -n
> #node ken3
> #node kathy
> #
> # Less common options...
> #
> # Treats 10.10.10.254 as a pseudo-cluster-member
> # Used together with ipfail below...
> # note: don't use a cluster node as ping node
> #
> #ping 10.10.10.254
> #
> # Treats 10.10.10.254 and 10.10.10.253 as a pseudo-cluster-member
> # called group1. If either 10.10.10.254 or 10.10.10.253 are up
> # then group1 is up
> # Used together with ipfail below...
> #
> #ping_group group1 10.10.10.254 10.10.10.253
> #
> # HBA ping directive for Fiber Channel
> # Treats fc-card-name as pseudo-cluster-member
> # used with ipfail below ...
> #
> # You can obtain HBAAPI from http://hbaapi.sourceforge.net.
> # You need to get the library specific to your HBA directly from the vendor.
> # To install HBAAPI stuff, all you need to do is to compile the common
> # part you obtained from the sourceforge. This will produce libHBAAPI.so
> # which you need to copy to /usr/lib. You also need to copy hbaapi.h to
> # /usr/include.
> #
> # The fc-card-name is the name obtained from the hbaapitest program
> # that is part of the hbaapi package. Running hbaapitest will produce
> # a verbose output. One of the first lines is similar to:
> #   Apapter number 0 is named: qlogic-qla2200-0
> # Here fc-card-name is qlogic-qla2200-0.
> #
> #hbaping fc-card-name
> #
> #
> # Processes started and stopped with heartbeat. Restarted unless
> # they exit with rc=100
> #
> #respawn userid /path/name/to/run
> #respawn hacluster /usr/lib/heartbeat/ipfail
> #
> # Access control for client api
> # default is no access
> #
> #apiauth client-name gid=gidlist uid=uidlist
> #apiauth ipfail gid=haclient uid=hacluster
>
> ###########################
> #
> # Unusual options.
> #
> ###########################
> #
> # hopfudge maximum hop count minus number of nodes in config
> #hopfudge 1
> #
> # deadping - dead time for ping nodes
> #deadping 30
> #
> # hbgenmethod - Heartbeat generation number creation method
> # Normally these are stored on disk and incremented as needed.
> #hbgenmethod time
> #
> # realtime - enable/disable realtime execution (high priority, etc.)
> # defaults to on
> #realtime off
> #
> # debug - set debug level
> # defaults to zero
> debug 1
> #
> # API Authentication - replaces the fifo-permissions-based system of the past
> #
> #
> # You can put a uid list and/or a gid list.
> # If you put both, then a process is authorized if it qualifies under either
> # the uid list, or under the gid list.
> #
> # The groupname "default" has special meaning. If it is specified, then
> # this will be used for authorizing groupless clients, and any client groups
> # not otherwise specified.
> #
> # There is a subtle exception to this. "default" will never be used in the
> # following cases (actual default auth directives noted in brackets)
> #   ipfail    (uid=HA_CCMUSER)
> #   ccm       (uid=HA_CCMUSER)
> #   ping      (gid=HA_APIGROUP)
> #   cl_status (gid=HA_APIGROUP)
> #
> # This is done to avoid creating a gaping security hole and matches the most
> # likely desired configuration.
> #
> #apiauth ipfail uid=hacluster
> #apiauth ccm uid=hacluster
> #apiauth cms uid=hacluster
> #apiauth ping gid=haclient uid=alanr,root
> #apiauth default gid=haclient
>
> # message format in the wire, it can be classic or netstring,
> # default: classic
> #msgfmt classic/netstring
>
> # Do we use logging daemon?
> # If logging daemon is used, logfile/debugfile/logfacility in this file
> # are not meaningful any longer. You should check the config file for logging
> # daemon (the default is /etc/logd.cf)
> # more information can be found in http://www.linux-ha.org/ha_2ecf_2fUseLogdDirective
> # Setting use_logd to "yes" is recommended
> #
> # use_logd yes/no
> #
> # the interval we reconnect to logging daemon if the previous connection failed
> # default: 60 seconds
> #conn_logd_time 60
> #
> #
> # Configure compression module
> # It could be zlib or bz2, depending on whether you have the corresponding
> # library in the system.
> #compression bz2
> #
> # Configure compression threshold
> # This value determines the threshold to compress a message,
> # e.g. if the threshold is 1, then any message with size greater than 1 KB
> # will be compressed, the default is 2 (KB)
> #compression_threshold 2
>
> node ws232.ltsp ws231.ltsp
> #bcast bond0
> #ucast bond0 157.177.2.31
> crm on
> ping 192.168.0.1
> #ping 157.177.6.210
> respawn root /usr/lib64/heartbeat/pingd -m 100 -d 5s
> #mcast bond0 224.0.0.1 694 1 0
> bcast eth0
> bcast eth1
>
> #!/bin/sh
> #
> #
> # OCF RA for monitoring Londiste Replay process
> #
> # based on:
> #
> # Dummy OCF RA.
> # Does nothing but wait a few seconds, can be
> # configured to fail occasionally.
> #
> # Copyright (c) 2004 SUSE LINUX AG, Lars Marowsky-Brée
> # All Rights Reserved.
> #
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of version 2 of the GNU General Public License as
> # published by the Free Software Foundation.
> #
> # This program is distributed in the hope that it would be useful, but
> # WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> #
> # Further, this software is distributed without any warranty that it is
> # free of the rightful claim of any third person regarding infringement
> # or the like. Any license provided herein, whether implied or
> # otherwise, applies only to this software file. Patent licenses, if
> # any, provided herein do not apply to combinations of this program with
> # other software, or any other product whatsoever.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write the Free Software Foundation,
> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> #
>
> #######################################################################
> # Initialization:
>
> if [ -f ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs ]
> then
>     . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
> else
>     if [ -f /usr/lib/heartbeat/ocf-shellfuncs ]
>     then
>         . /usr/lib/heartbeat/ocf-shellfuncs
>     else
>         if [ -f /usr/lib64/heartbeat/ocf-shellfuncs ]
>         then
>             . /usr/lib64/heartbeat/ocf-shellfuncs
>         else
>             exit $OCF_ERR_CONFIGURED
>         fi
>     fi
> fi
>
> #######################################################################
>
> meta_data() {
> cat <<END
> <?xml version="1.0"?>
> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> <resource-agent name="LondisteReplay" version="1.0">
> <version>1.0</version>
>
> <longdesc lang="en">
> This is a LondisteReplay Resource Agent.
> It starts/stops/monitors londiste.py's replay process.
> </longdesc>
> <shortdesc lang="en">LondisteReplay resource agent</shortdesc>
>
> <parameters>
> <parameter name="configdir" unique="0">
> <longdesc lang="en">
> This is where londiste.ini is.
> </longdesc>
> <shortdesc lang="en">Configuration directory</shortdesc>
> <content type="string" default="/etc/cluster" />
> </parameter>
>
> <parameter name="pidfile" unique="0">
> <longdesc lang="en">
> This is the pidfile londiste.py uses to indicate its started state.
> </longdesc>
> <shortdesc lang="en">Pidfile</shortdesc>
> <content type="string" default="/etc/cluster/londiste.pid" />
> </parameter>
>
> </parameters>
>
>
> <actions>
> <action name="start" timeout="90" />
> <action name="stop" timeout="100" />
> <action name="monitor" timeout="20" interval="10" depth="0" start-delay="0" />
> <action name="meta-data" timeout="5" />
> <action name="validate-all" timeout="30" />
> </actions>
> </resource-agent>
> END
> }
>
> #######################################################################
>
> # don't exit on TERM, to test that lrmd makes sure that we do exit
> trap sigterm_handler TERM
> sigterm_handler() {
>     ocf_log info "They use TERM to bring us down. No such luck."
>     return
> }
>
> dummy_usage() {
> cat <<END
> usage: $0 {start|stop|monitor|validate-all|meta-data}
>
> Expects to have a fully populated OCF RA-compliant environment set.
> END
> }
>
> dummy_monitor() {
>     # Read the PID from the pidfile, if there is one
>     if [ -f "$PIDFILE" ]; then
>         PID="$((`cat $PIDFILE`))"
>     fi
>     if [ -z "$PID" ]; then
>         return $OCF_NOT_RUNNING
>     fi
>     # The process must exist and must really be londiste
>     PGQPROCFILE="/proc/$PID/cmdline"
>     if [ ! -f "$PGQPROCFILE" ]; then
>         return $OCF_ERR_GENERIC
>     fi
>     PGQADM=$((`grep -ia londiste $PGQPROCFILE 2>/dev/null | wc -l`))
>     if [ "x$PGQADM" = "x0" ]; then
>         return $OCF_ERR_GENERIC
>     fi
>     return $OCF_SUCCESS
> }
>
> dummy_start() {
>     dummy_monitor
>     MONRET=$?
>     if [ $MONRET = $OCF_SUCCESS ]; then
>         return $OCF_SUCCESS
>     fi
>     if [ $MONRET = $OCF_NOT_RUNNING ]; then
>         londiste.py $CONFIGDIR/londiste.ini replay &
>         return $OCF_SUCCESS
>     fi
>     return $OCF_ERR_GENERIC
> }
>
> dummy_stop() {
>     dummy_monitor
>     if [ $? = $OCF_SUCCESS ]; then
>         kill -TERM $PID
>     fi
>     # Wait until the pidfile disappears; only use usleep if it exists
>     USLEEP="`which usleep 2>/dev/null`"
>     while [ -f "$PIDFILE" ]; do
>         if [ -n "$USLEEP" ] && [ -x "$USLEEP" ]; then
>             $USLEEP 20
>             continue
>         fi
>         sleep 1
>     done
>     return $OCF_SUCCESS
> }
>
> dummy_validate() {
>     exit $OCF_ERR_UNIMPLEMENTED
> }
>
> CONFIGDIR=${OCF_RESKEY_configdir:-/etc/cluster}
> PIDFILE=${OCF_RESKEY_pidfile:-/etc/cluster/londiste.pid}
>
> case $__OCF_ACTION in
> meta-data)      meta_data
>                 exit $OCF_SUCCESS
>                 ;;
> start)          dummy_start
>                 ;;
> stop)           dummy_stop
>                 ;;
> monitor)        dummy_monitor
>                 ;;
> validate-all)   dummy_validate;;
> usage|help)     dummy_usage
>                 exit $OCF_SUCCESS
>                 ;;
> *)              dummy_usage
>                 exit $OCF_ERR_UNIMPLEMENTED
>                 ;;
> esac
> rc=$?
> ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
> exit $rc
>
>
> #!/bin/sh
> #
> #
> # OCF RA for monitoring Londiste Ticker process
> #
> # based on:
> #
> # Dummy OCF RA. Does nothing but wait a few seconds, can be
> # configured to fail occasionally.
> #
> # Copyright (c) 2004 SUSE LINUX AG, Lars Marowsky-Brée
> # All Rights Reserved.
> #
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of version 2 of the GNU General Public License as
> # published by the Free Software Foundation.
> #
> # This program is distributed in the hope that it would be useful, but
> # WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> #
> # Further, this software is distributed without any warranty that it is
> # free of the rightful claim of any third person regarding infringement
> # or the like. Any license provided herein, whether implied or
> # otherwise, applies only to this software file.
> # Patent licenses, if
> # any, provided herein do not apply to combinations of this program with
> # other software, or any other product whatsoever.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write the Free Software Foundation,
> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> #
>
> #######################################################################
> # Initialization:
>
> if [ -f ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs ]
> then
>     . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
> else
>     if [ -f /usr/lib/heartbeat/ocf-shellfuncs ]
>     then
>         . /usr/lib/heartbeat/ocf-shellfuncs
>     else
>         if [ -f /usr/lib64/heartbeat/ocf-shellfuncs ]
>         then
>             . /usr/lib64/heartbeat/ocf-shellfuncs
>         else
>             exit $OCF_ERR_CONFIGURED
>         fi
>     fi
> fi
>
> #######################################################################
>
> meta_data() {
> cat <<END
> <?xml version="1.0"?>
> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> <resource-agent name="LondisteTicker" version="0.9">
> <version>1.0</version>
>
> <longdesc lang="en">
> This is a LondisteTicker Resource Agent. It starts/stops/monitors
> pgqadm.py's ticker process.
> </longdesc>
> <shortdesc lang="en">LondisteTicker resource agent</shortdesc>
>
> <parameters>
> <parameter name="configdir" unique="0">
> <longdesc lang="en">
> This is where pgqadm.ini is.
> </longdesc>
> <shortdesc lang="en">Configuration directory</shortdesc>
> <content type="string" default="/etc/cluster" />
> </parameter>
>
> <parameter name="pidfile" unique="0">
> <longdesc lang="en">
> This is the pidfile PGQADM uses to indicate its started state.
> </longdesc>
> <shortdesc lang="en">Pidfile</shortdesc>
> <content type="string" default="/etc/cluster/pgqadm.pid" />
> </parameter>
>
> </parameters>
>
> <actions>
> <action name="start" timeout="90" />
> <action name="stop" timeout="100" />
> <action name="monitor" timeout="20" interval="10" depth="0" start-delay="5" />
> <action name="meta-data" timeout="5" />
> <action name="validate-all" timeout="30" />
> </actions>
> </resource-agent>
> END
> }
>
> #######################################################################
>
> # don't exit on TERM, to test that lrmd makes sure that we do exit
> trap sigterm_handler TERM
> sigterm_handler() {
>     ocf_log info "They use TERM to bring us down. No such luck."
>     return
> }
>
> dummy_usage() {
> cat <<END
> usage: $0 {start|stop|monitor|validate-all|meta-data}
>
> Expects to have a fully populated OCF RA-compliant environment set.
> END
> }
>
> dummy_validate() {
>     exit $OCF_ERR_UNIMPLEMENTED
> }
>
> dummy_monitor() {
>     # Read the PID from the pidfile, if there is one
>     if [ -f "$PIDFILE" ]; then
>         PID="$((`cat $PIDFILE`))"
>     fi
>     if [ -z "$PID" ]; then
>         return $OCF_NOT_RUNNING
>     fi
>     # The process must exist and must really be pgqadm
>     PGQPROCFILE="/proc/$PID/cmdline"
>     if [ ! -f "$PGQPROCFILE" ]; then
>         return $OCF_ERR_GENERIC
>     fi
>     PGQADM=$((`grep -ia pgqadm $PGQPROCFILE 2>/dev/null | wc -l`))
>     if [ "x$PGQADM" = "x0" ]; then
>         return $OCF_ERR_GENERIC
>     fi
>     return $OCF_SUCCESS
> }
>
> dummy_start() {
>     dummy_monitor
>     MONRET=$?
>     if [ $MONRET = $OCF_SUCCESS ]; then
>         return $OCF_SUCCESS
>     fi
>     if [ $MONRET = $OCF_NOT_RUNNING ]; then
>         pgqadm.py $CONFIGDIR/pgqadm.ini ticker &
>         return $OCF_SUCCESS
>     fi
>     return $OCF_ERR_GENERIC
> }
>
> dummy_stop() {
>     dummy_monitor
>     if [ $? = $OCF_SUCCESS ]; then
>         kill -TERM $PID
>     fi
>     # Wait until the pidfile disappears; only use usleep if it exists
>     USLEEP="`which usleep 2>/dev/null`"
>     SLEEP="`which sleep`"
>     while [ -f "$PIDFILE" ]; do
>         if [ -n "$USLEEP" ] && [ -x "$USLEEP" ]; then
>             $USLEEP 20
>             continue
>         fi
>         sleep 1
>     done
>     return $OCF_SUCCESS
> }
>
> CONFIGDIR=${OCF_RESKEY_configdir:-/etc/cluster}
> PIDFILE=${OCF_RESKEY_pidfile:-/etc/cluster/pgqadm.pid}
>
> case $__OCF_ACTION in
> meta-data)      meta_data
>                 ;;
> start)          dummy_start
>                 ;;
> stop)           dummy_stop
>                 ;;
> monitor)        dummy_monitor
>                 ;;
> validate-all)   dummy_validate;;
> usage|help)     dummy_usage
>                 exit $OCF_SUCCESS
>                 ;;
> *)              dummy_usage
>                 exit $OCF_ERR_UNIMPLEMENTED
>                 ;;
> esac
> rc=$?
> ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
> exit $rc
>
>
> #!/bin/sh
> #
> #
> # SlaveMigration OCF RA. Sets up the slave PostgreSQL pg_hba.conf
> # when migrating to/from lardb04
> #
> # based on
> #
> # Dummy OCF RA. Does nothing but wait a few seconds, can be
> # configured to fail occasionally.
> #
> # Copyright (c) 2004 SUSE LINUX AG, Lars Marowsky-Brée
> # All Rights Reserved.
> #
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of version 2 of the GNU General Public License as
> # published by the Free Software Foundation.
> #
> # This program is distributed in the hope that it would be useful, but
> # WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> #
> # Further, this software is distributed without any warranty that it is
> # free of the rightful claim of any third person regarding infringement
> # or the like. Any license provided herein, whether implied or
> # otherwise, applies only to this software file. Patent licenses, if
> # any, provided herein do not apply to combinations of this program with
> # other software, or any other product whatsoever.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write the Free Software Foundation,
> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> #
>
> #######################################################################
> # Initialization:
>
> if [ -f ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs ]
> then
>     . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
> else
>     if [ -f /usr/lib/heartbeat/ocf-shellfuncs ]
>     then
>         . /usr/lib/heartbeat/ocf-shellfuncs
>     else
>         if [ -f /usr/lib64/heartbeat/ocf-shellfuncs ]
>         then
>             . /usr/lib64/heartbeat/ocf-shellfuncs
>         else
>             exit $OCF_ERR_CONFIGURED
>         fi
>     fi
> fi
>
> #. ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
> #. /usr/lib64/heartbeat/ocf-shellfuncs
> #. /usr/share/ocf/resource.d/heartbeat/.ocf-shellfuncs
>
> #######################################################################
>
> meta_data() {
> cat <<END
> <?xml version="1.0"?>
> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> <resource-agent name="SlaveMigration" version="0.9">
> <version>1.0</version>
>
> <longdesc lang="en">
> This is the SlaveMigration Resource Agent. It sets up the slave PostgreSQL
> authentication so pgqadm/londiste cannot accidentally start replicating from
> the master PostgreSQL server.
> </longdesc>
> <shortdesc lang="en">SlaveMigration resource agent</shortdesc>
>
> <parameters>
>
> <parameter name="masterip" unique="0" required="1">
> <longdesc lang="en">
> This is the space-separated list of IPs the master server lives at.
> </longdesc>
> <shortdesc lang="en">IP addresses of the master</shortdesc>
> <content type="string" default="" />
> </parameter>
>
> <parameter name="masterhostname" unique="0" required="0">
> <longdesc lang="en">
> This is the short form of the hostname of the master server
> </longdesc>
> <shortdesc lang="en">Master short hostname</shortdesc>
> <content type="string" default="" />
> </parameter>
>
> <parameter name="slavehostname" unique="0" required="1">
> <longdesc lang="en">
> This is the short form of the hostname of the slave server
> </longdesc>
> <shortdesc lang="en">Slave short hostname</shortdesc>
> <content type="string" default="" />
> </parameter>
>
> <parameter name="psql" unique="0" required="0">
> <longdesc lang="en">
> Path to psql command.
> </longdesc>
> <shortdesc lang="en">psql</shortdesc>
> <content type="string" default="/usr/bin/psql" />
> </parameter>
>
> <parameter name="pgport" unique="0">
> <longdesc lang="en">
> This is the port PostgreSQL listens on.
> </longdesc>
> <shortdesc lang="en">PostgreSQL service port</shortdesc>
> <content type="string" default="" />
> </parameter>
>
> <parameter name="pgdata" unique="0" required="1">
> <longdesc lang="en">
> Path to PostgreSQL data directory.
> </longdesc>
> <shortdesc lang="en">pgdata</shortdesc>
> <content type="string" default="/var/lib/pgsql/data" />
> </parameter>
>
> <parameter name="pghba_ok" unique="0" required="1">
> <longdesc lang="en">
> Path to normal pg_hba.conf
> </longdesc>
> <shortdesc lang="en">pghba_ok</shortdesc>
> <content type="string" default="" />
> </parameter>
>
> <parameter name="pghba_failed" unique="0" required="1">
> <longdesc lang="en">
> Path to pg_hba.conf for fenced case to disallow connection of the master
> server.
> </longdesc>
> <shortdesc lang="en">pghba_failed</shortdesc>
> <content type="string" default="" />
> </parameter>
>
> </parameters>
>
> <actions>
> <action name="start" timeout="90" />
> <action name="stop" timeout="100" />
> <action name="monitor" timeout="20" interval="10" depth="0" start-delay="1" />
> <action name="reload" timeout="90" />
> <action name="migrate_to" timeout="100" />
> <action name="migrate_from" timeout="90" />
> <action name="meta-data" timeout="5" />
> <action name="validate-all" timeout="30" />
> </actions>
> </resource-agent>
> END
> }
>
> #######################################################################
>
> # don't exit on TERM, to test that lrmd makes sure that we do exit
> trap sigterm_handler TERM
> sigterm_handler() {
>     ocf_log info "They use TERM to bring us down. No such luck."
>     return
> }
>
> dummy_usage() {
> cat <<END
> usage: $0 {start|stop|monitor|migrate_to|migrate_from|validate-all|meta-data}
>
> Expects to have a fully populated OCF RA-compliant environment set.
> END
> }
>
> dummy_validate() {
>     return $OCF_ERR_UNIMPLEMENTED
> }
>
> dummy_monitor() {
>     # The lock file is the only state this agent keeps
>     if [ -f /var/lock/subsys/SlaveMigration ]
>     then
>         return $OCF_SUCCESS
>     fi
>     return $OCF_NOT_RUNNING
> }
>
>
> slave_start() {
>     dummy_monitor
>     if [ $? = $OCF_SUCCESS ]; then
>         return $OCF_SUCCESS
>     fi
>
>     touch /var/lock/subsys/SlaveMigration
>
>     check_for_slavehostname
>     check_for_masterip
>
>     # On the slave: switch to the restrictive pg_hba.conf and
>     # drop existing connections coming from the master
>     if [ "`hostname -s`" = "$OCF_RESKEY_slavehostname" ]
>     then
>         check_for_pg_hbas
>         ln -sf "$OCF_RESKEY_pghba_failed" "${OCF_RESKEY_pgdata}/pg_hba.conf"
>         reload_pg_conf
>         slave_kill_pg_from_master
>     fi
>
>     return $OCF_SUCCESS
> }
>
> slave_stop() {
>     dummy_monitor
>     if [ $? = $OCF_NOT_RUNNING ]; then
>         return $OCF_SUCCESS
>     fi
>
>     check_for_slavehostname
>
>     # On the slave: switch back to the normal pg_hba.conf
>     if [ "`hostname -s`" = "$OCF_RESKEY_slavehostname" ]
>     then
>         check_for_pg_hbas
>         ln -sf "$OCF_RESKEY_pghba_ok" "${OCF_RESKEY_pgdata}/pg_hba.conf"
>         reload_pg_conf
>     fi
>
>     rm -f /var/lock/subsys/SlaveMigration
>     return $OCF_SUCCESS
> }
>
> slave_kill_pg_from_master() {
>     PGPARAM="-A -t -U postgres -h localhost"
>     if [ "$OCF_RESKEY_pgport" != "" ]
>     then
>         PGPARAM="$PGPARAM -p $OCF_RESKEY_pgport"
>     fi
>     for ip in $OCF_RESKEY_masterip ; do
>         # echo -- $PGPARAM -c "\"select procpid from pg_stat_activity where client_addr='"${ip}"'\""
>         $OCF_RESKEY_psql $PGPARAM -c "select procpid from pg_stat_activity where client_addr='"${ip}"'" | \
>         while read pid ; do
>             kill -TERM $pid
>         done
>     done
> }
>
> reload_pg_conf() {
>     PGPARAM="-U postgres -h localhost"
>     if [ "$OCF_RESKEY_pgport" != "" ]
>     then
>         PGPARAM="$PGPARAM -p $OCF_RESKEY_pgport"
>     fi
>     $OCF_RESKEY_psql $PGPARAM -c "select pg_reload_conf()" 1>/dev/null 2>/dev/null
> }
>
> check_for_slavehostname() {
>     if [ "$OCF_RESKEY_slavehostname" = "" ]
>     then
>         ocf_log debug "${OCF_RESOURCE_INSTANCE} No slave hostname given"
>         exit $OCF_ERR_GENERIC
>     fi
> }
>
> check_for_masterhostname() {
>     if [ "$OCF_RESKEY_masterhostname" = "" ]
>     then
>         ocf_log debug "${OCF_RESOURCE_INSTANCE} No master hostname given"
>         exit $OCF_ERR_GENERIC
>     fi
> }
>
> check_for_masterip() {
>     if [ "$OCF_RESKEY_masterip" = "" ]
>     then
>         ocf_log debug "${OCF_RESOURCE_INSTANCE} No master IP given"
>         exit $OCF_ERR_GENERIC
>     fi
> }
>
> check_for_pg_hbas() {
>     if [ "$OCF_RESKEY_pghba_ok" = "" ]
>     then
>         echo OCF_RESKEY_pghba_ok
>         ocf_log debug "${OCF_RESOURCE_INSTANCE} No normal pg_hba.conf given"
>         exit $OCF_ERR_GENERIC
>     fi
>     if [ ! -f "$OCF_RESKEY_pghba_ok" ]
>     then
>         echo -- -f OCF_RESKEY_pghba_ok
>         ocf_log debug "${OCF_RESOURCE_INSTANCE} pg_hba.conf file for normal operation does not exist"
>         exit $OCF_ERR_GENERIC
>     fi
>     if [ "$OCF_RESKEY_pghba_failed" = "" ]
>     then
>         echo OCF_RESKEY_pghba_failed
>         ocf_log debug "${OCF_RESOURCE_INSTANCE} No failed pg_hba.conf given"
>         exit $OCF_ERR_GENERIC
>     fi
>     if [ ! -f "$OCF_RESKEY_pghba_failed" ]
>     then
>         echo -- -f OCF_RESKEY_pghba_failed
>         ocf_log debug "${OCF_RESOURCE_INSTANCE} pg_hba.conf file for fenced operation does not exist"
>         exit $OCF_ERR_GENERIC
>     fi
>     if [ ! -d "$OCF_RESKEY_pgdata" ]
>     then
>         echo -- -d OCF_RESKEY_pgdata
>         ocf_log debug "${OCF_RESOURCE_INSTANCE} PGDATA directory does not exist"
>         exit $OCF_ERR_GENERIC
>     fi
> }
>
> : ${OCF_RESKEY_psql=/usr/bin/psql}
>
> case $__OCF_ACTION in
> meta-data)      meta_data
>                 exit $OCF_SUCCESS
>                 ;;
> start)          slave_start
>                 ;;
> stop)           slave_stop
>                 ;;
> monitor)        dummy_monitor
>                 ;;
> migrate_to)     ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} to ${OCF_RESKEY_CRM_meta_migrate_to}."
>                 slave_stop
>                 ;;
> migrate_from)   ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} from ${OCF_RESKEY_CRM_meta_migrated_from}."
>                 slave_start
>                 ;;
> reload)         ocf_log err "Reloading..."
>                 slave_start
>                 ;;
> validate-all)   dummy_validate;;
> usage|help)     dummy_usage
>                 exit $OCF_SUCCESS
>                 ;;
> *)              dummy_usage
>                 exit $OCF_ERR_UNIMPLEMENTED
>                 ;;
> esac
> rc=$?
> ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
> exit $rc
>
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>

--
Serge Dubrouski.
