Hello again. Why has nobody answered me? I really need your help!

-------- Forwarded message --------
From: Юлия Школьникова <shkolnikova_y...@mail.ru>
To: pacemaker@oss.clusterlabs.org
Date: Mon 19 Nov 2012 16:37:21
Subject: [Pacemaker] Problem with monitor

Hello,

I am configuring a master/slave cluster for PostgreSQL 9.1 based on corosync and pacemaker. I am following this presentation:
http://schedule2012.rmll.info/IMG/pdf/postgresql-9-0-ha.pdf

I took the resource agent for master/slave PostgreSQL (pgsql-ms) from here:
https://github.com/roidelapluie/puppet-cluster

My nodes are node1 and node2. My pacemaker configuration:

node node1
node node2
primitive DBIP ocf:heartbeat:IPaddr2 \
        params nic="eth0" ip="10.76.112.183" cidr_netmask="22" \
        op monitor interval="30s" \
        meta target-role="Started" is-managed="true"
primitive pgsql ocf:inuits:pgsql-ms \
        op monitor interval="5s" role="Master" \
        op monitor interval="10s" role="Slave"
primitive ping ocf:pacemaker:ping \
        params host_list="10.76.112.1" \
        op monitor interval="10s" timeout="10s" \
        op start interval="0" timeout="45s"
group PSQL DBIP
ms pgsql-ms pgsql \
        params pgsqlconfig="/var/lib/pgsql/9.1/data/postgresql.conf" lsb_script="/etc/init.d/postgresql-9.1" pgsqlrecovery="/var/lib/pgsql/9.1/data/recovery.conf" \
        meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="true"
clone clone-ping ping \
        meta globally-unique="false"
location connected PSQL \
        rule $id="connected-rule" -inf: not_defined pingd or pingd lte 0
colocation ip_psql inf: PSQL pgsql-ms:Master
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        default-resource-stickiness="INFINITY" \
        last-lrm-refresh="1352470332"
rsc_defaults $id="rsc_defaults-options" \
        migration-threshold="INFINITY" \
        failure-timeout="10" \
        resource-stickiness="INFINITY"

Then I tested my cluster:

1) If I switch off the master, the slave becomes the new master, as expected.
This works fine and can be repeated many times.

2) But if I stop postgresql (to simulate a postgresql failure) with the command "service postgresql-9.1 stop", the following occurs:

Given node1 is master and node2 is slave. On node1 I run "service postgresql-9.1 stop" and node2 becomes the master. Now on node2 I run "service postgresql-9.1 stop" and node1 becomes the master again. At this point the monitoring of my resource on node1 stops, and the following entry appears in the log:

node1 crmd[1362]: info: process_lrm_event: LRM operation pgsql:0_monitor_10000 (call=33, status=1, cib-update=0, confirmed=true) Cancelled

Now if I run "service postgresql-9.1 stop" on node1, pacemaker does not notice that postgresql has stopped: it neither tries to restart it nor promotes node2 to master. If I run "crm resource reprobe", the monitor action starts working again.

I cannot understand why the monitor operation stops working. Please help me.

Shkolnikova Yulia.

----------------------------------------------------------------------
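P.S. In case it helps anyone reproduce this: a minimal sketch of how the moment the recurring monitor gets cancelled could be spotted in the log. The regular expression is modelled only on the single crmd line quoted above, and the function name find_cancelled_monitors is made up for illustration; other pacemaker versions may word this message differently.

```python
import re

# Pattern modelled on the one crmd log line quoted in this message.
# Assumption: other Pacemaker versions may format the entry differently.
CANCELLED_MONITOR = re.compile(
    r"process_lrm_event: LRM operation "
    r"(?P<resource>\S+)_monitor_(?P<interval_ms>\d+) "
    r"\(call=(?P<call>\d+), status=(?P<status>\d+),.*\) Cancelled"
)


def find_cancelled_monitors(lines):
    """Return (resource, interval_ms, call) for each cancelled recurring monitor."""
    found = []
    for line in lines:
        match = CANCELLED_MONITOR.search(line)
        if match:
            found.append(
                (match["resource"], int(match["interval_ms"]), int(match["call"]))
            )
    return found


# The log entry from the message above, used as sample input.
sample = [
    "node1 crmd[1362]: info: process_lrm_event: LRM operation "
    "pgsql:0_monitor_10000 (call=33, status=1, cib-update=0, confirmed=true) Cancelled",
]
print(find_cancelled_monitors(sample))
```

An interval of 10000 ms corresponds to the op monitor interval="10s" role="Slave" line in my configuration, so the cancelled operation appears to be the Slave-role monitor of pgsql:0.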
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org