Hello,

I am new to Pacemaker and struggling with the somewhat limited documentation. I looked through the archives and didn't find anything that matched my problem.

I have a brand new Pacemaker setup running on CentOS 5.5. I am using the config below to start up MySQL, which is also a brand new build. Right now the cluster is only running on one node while I try to isolate this problem. This is a brand new CIB file as well.

The cluster starts up, but then every 30 seconds or so I see it restart MySQL. If I stop Heartbeat and bring up MySQL by itself, it starts up just fine. It's driving me batty, so I thought I would post it here and see if someone is able to help.

What I see in syslog from Heartbeat is:

Jul 22 15:34:09 sipl-mysql-109 lrmd: [11182]: info: rsc:d_mysql:69: start
Jul 22 15:34:11 sipl-mysql-109 lrmd: [11182]: info: RA output: (ip_db:start:stderr) ARPING 10.200.131.9 from 10.200.131.9 eth0 Sent 5 probes (5 broadcast(s)) Received 0 response(s)
Jul 22 15:34:13 sipl-mysql-109 mysql[14915]: [15086]: INFO: MySQL started
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: process_lrm_event: LRM operation d_mysql_start_0 (call=69, rc=0, cib-update=105, confirmed=true) ok
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: match_graph_event: Action d_mysql_start_0 (6) confirmed on sipl-mysql-109 (rc=0)
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: te_rsc_command: Initiating action 1: monitor d_mysql_monitor_10000 on sipl-mysql-109 (local)
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: do_lrm_rsc_op: Performing key=1:14:0:989206b7-461a-42db-a2a7-7b447bd6c5b3 op=d_mysql_monitor_10000 )
Jul 22 15:34:13 sipl-mysql-109 lrmd: [11182]: info: rsc:d_mysql:70: monitor
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: te_rsc_command: Initiating action 8: start ip_db_start_0 on sipl-mysql-109 (local)
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: do_lrm_rsc_op: Performing key=8:14:0:989206b7-461a-42db-a2a7-7b447bd6c5b3 op=ip_db_start_0 )
Jul 22 15:34:13 sipl-mysql-109 lrmd: [11182]: info: rsc:ip_db:71: start
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: process_lrm_event: LRM operation d_mysql_monitor_10000 (call=70, rc=7, cib-update=106, confirmed=false) not running
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: WARN: status_from_rc: Action 1 (d_mysql_monitor_10000) on sipl-mysql-109 failed (target: 0 vs. rc: 7): Error
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: WARN: update_failcount: Updating failcount for d_mysql on sipl-mysql-109 after failed monitor.
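
In case it helps anyone skimming a longer log of this kind, here is a minimal Python sketch that pulls out the LRM operations that returned a non-zero rc. It assumes the crmd "process_lrm_event" line format seen in the log above; the function name failed_operations and the sample list are made up for illustration, and treating every non-zero rc as a failure is a simplification (a probe, for example, can legitimately expect rc=7).

```python
import re

# Pattern for crmd "process_lrm_event" lines, based only on the log format
# shown above. Other Pacemaker versions may format these lines differently.
LRM_EVENT = re.compile(
    r"LRM operation (?P<op>\S+) \(call=(?P<call>\d+), rc=(?P<rc>\d+),"
)


def failed_operations(lines):
    """Return (operation, rc) pairs for LRM operations with a non-zero rc.

    Simplifying assumption: any non-zero rc is reported, even though some
    operations (such as probes) can expect a non-zero result.
    """
    failures = []
    for line in lines:
        match = LRM_EVENT.search(line)
        if match and int(match.group("rc")) != 0:
            failures.append((match.group("op"), int(match.group("rc"))))
    return failures


# Two lines copied from the log above, used as sample input.
sample = [
    "Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: process_lrm_event: "
    "LRM operation d_mysql_start_0 (call=69, rc=0, cib-update=105, "
    "confirmed=true) ok",
    "Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: process_lrm_event: "
    "LRM operation d_mysql_monitor_10000 (call=70, rc=7, cib-update=106, "
    "confirmed=false) not running",
]

print(failed_operations(sample))
```

On the two sample lines this reports only d_mysql_monitor_10000 with rc=7, which matches the "not running" result that the status_from_rc warning complains about.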

The output of crm configure show is:

primitive d_mysql ocf:heartbeat:mysql \
        op start interval="0" timeout="120" \
        op stop interval="0" timeout="120" \
        op monitor interval="10" timeout="30" depth="0" param binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" datadir="/var/lib/mysql" user="mysql" pid="/var/run/mysqld/mysql.pid" socket="/var/lib/mysql/mysql.sock"
primitive ip_db ocf:heartbeat:IPaddr2 \
        params ip="10.200.131.9" cidr_netmask="32" \
        op monitor interval="30s" nic="eth0"
group sv_db d_mysql ip_db
property $id="cib-bootstrap-options" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        start-failure-is-fatal="false" \
        expected-quorum-votes="2" \
        dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
        cluster-infrastructure="Heartbeat"
rsc_defaults $id="rsc_defaults-options" \
        migration-threshold="20" \
        failure-timeout="20"

My versions are as follows:

[r...@sipl-mysql-109 rc0.d]# rpm -qa | egrep "coro|pacemaker|heart"
corosynclib-1.2.5-1.3.el5
corosync-1.2.5-1.3.el5
corosync-1.2.5-1.3.el5
heartbeat-3.0.3-2.3.el5
pacemaker-1.0.9.1-1.11.el5
pacemaker-1.0.9.1-1.11.el5
corosynclib-1.2.5-1.3.el5
heartbeat-libs-3.0.3-2.3.el5
heartbeat-3.0.3-2.3.el5
pacemaker-libs-1.0.9.1-1.11.el5
heartbeat-libs-3.0.3-2.3.el5
pacemaker-libs-1.0.9.1-1.11.el5

rpm -qa | grep resource
resource-agents-1.0.3-2.6.el5

[r...@sipl-mysql-109 rc0.d]# cat /etc/redhat-release
CentOS release 5.5 (Final)

[r...@sipl-mysql-109 rc0.d]# uname -r
2.6.18-194.8.1.el5

[r...@sipl-mysql-109 rc0.d]# mysql -V
mysql Ver 14.14 Distrib 5.1.48, for unknown-linux-gnu (x86_64) using readline 5.1

My ha.cf looks like:

autojoin none
mcast eth0 227.0.0.10 694 1 0
warntime 5
deadtime 15
initdead 60
keepalive 5
auto_failback off
node sipl-mysql-109
node sipl-mysql-209
crm on

MySQL shows the following in its error log:

100722 15:33:57 [Note] Plugin 'FEDERATED' is disabled.
100722 15:33:57 InnoDB: Started; log sequence number 0 44233
100722 15:33:57 [Note] Event Scheduler: Loaded 0 events
100722 15:33:57 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.48-community-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)
100722 15:34:01 [Note] /usr/sbin/mysqld: Normal shutdown
100722 15:34:01 [Note] Event Scheduler: Purging the queue. 0 events
100722 15:34:01 InnoDB: Starting shutdown...
100722 15:34:02 InnoDB: Shutdown completed; log sequence number 0 44233
100722 15:34:02 [Note] /usr/sbin/mysqld: Shutdown complete
100722 15:34:02 mysqld_safe mysqld from pid file /var/run/mysql/mysqld.pid ended
100722 15:34:03 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
100722 15:34:03 [Warning] '--skip-locking' is deprecated and will be removed in a future release. Please use '--skip-external-locking' instead.
100722 15:34:03 [Note] Plugin 'FEDERATED' is disabled.
100722 15:34:03 InnoDB: Started; log sequence number 0 44233
100722 15:34:03 [Note] Event Scheduler: Loaded 0 events
100722 15:34:03 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.48-community-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)

Any help would be greatly appreciated. Thanks in advance.

F.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker