On Wed, Jun 23, 2010 at 5:19 PM, Koch, Sebastian <sebastian.k...@netzwerk.de> wrote:
> Hi,
>
> I have a 2-node cluster up and running, and right now I am trying to
> configure a Nagios3 resource. I have already fixed the nagios init
> script, as it didn't pass the LSB compatibility checks described here:
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html
>
> I just needed to make sure the pid file gets removed when the stop function is
> called. After this small change the script passed all the LSB checks. Below is
> the error message:
>
> r...@pilot01-node2:/var/run/nagios3# crm_verify -LV
> crm_verify[7094]: 2010/06/23_16:37:27 ERROR: unpack_rsc_op: Hard error - res_Nagios_monitor_0 failed with rc=6: Preventing res_Nagios from re-starting anywhere in the cluster

Looks like it's still failing the fifth LSB check from the above URL:
"Did the command print result: 3"

> crm_verify[7094]: 2010/06/23_16:37:27 WARN: native_color: Resource res_Nagios cannot run anywhere
> Warnings found during check: config may not be valid
>
> I tried to find out what an init script must provide to be usable in
> Pacemaker, but I only found the LSB compatibility hints on the
> Pacemaker website. I think I configured the primitive wrong, or maybe the
> init script is still wrong? Even if I configure it with an op monitor action
> it fails. And even a crm resource cleanup res_Nagios doesn't help me
> start the resource.
>
> I can run Nagios manually on the active node. I linked all shared
> directories to my cluster storage device like this:
>
> r...@pilot01-node2:/etc# ll /var/lib/nagios3* /etc/nagios*
> lrwxrwxrwx 1 root root 25 23. Jun 13:54 /etc/nagios3 -> /mnt/cluster/etc/nagios3/
> lrwxrwxrwx 1 root root 29 23. Jun 14:04 /var/lib/nagios3 -> /mnt/cluster/var/lib/nagios3/
>
> /etc/nagios3_bak:
> insgesamt 88K
> drwxr-xr-x 4 root root 146 23. Jun 13:54 .
> drwxr-xr-x 75 root root 4,0K 23. Jun 17:08 ..
> -rw-r--r-- 1 root root 1,9K 30. Jun 2009 apache2.conf
> -rw-r--r-- 1 root root 11K 23. Jun 13:49 cgi.cfg
> -rw-r--r-- 1 root root 2,4K 2. Jul 2009 commands.cfg
> drwxr-xr-x 2 root root 4,0K 7. Jun 19:16 conf.d
> -rw-r--r-- 1 root root 20 23. Jun 13:49 htpasswd.users
> -rw-r--r-- 1 root root 42K 2. Jul 2009 nagios.cfg
> -rw-r----- 1 root nagios 1,3K 30. Jun 2009 resource.cfg
> drwxr-xr-x 2 root root 4,0K 7. Jun 19:16 stylesheets
>
> /etc/nagios-plugins:
> insgesamt 12K
> drwxr-xr-x 3 root root 19 7. Jun 19:16 .
> drwxr-xr-x 75 root root 4,0K 23. Jun 17:08 ..
> drwxr-xr-x 2 root root 4,0K 7. Jun 19:16 config
>
> /var/lib/nagios3_bak:
> insgesamt 20K
> drwxr-x--- 4 nagios nagios 47 23. Jun 14:02 .
> drwxr-xr-x 33 root root 4,0K 23. Jun 14:04 ..
> -rw------- 1 nagios www-data 14K 23. Jun 14:02 retention.dat
> drwx------ 2 nagios www-data 6 2. Jul 2009 rw
> drwxr-x--- 3 nagios nagios 25 7. Jun 19:16 spool
>
> Here is my config.
>
> ########################
> ### 3. Cluster State ###
> ########################
>
> ============
> Last updated: Wed Jun 23 17:16:33 2010
> Stack: openais
> Current DC: pilot01-node2 - partition with quorum
> Version: 1.0.8-2c98138c2f070fcb6ddeab1084154cffbf44ba75
> 2 Nodes configured, 2 expected votes
> 4 Resources configured.
> ============
>
> Node pilot01-node1: standby
> Online: [ pilot01-node2 ]
>
> Full list of resources:
>
> Resource Group: grp_MySQL
>     res_Filesystem (ocf::heartbeat:Filesystem): Started pilot01-node2
>     res_ClusterIP (ocf::heartbeat:IPaddr2): Started pilot01-node2
>     res_MySQL (lsb:mysql): Started pilot01-node2
>     res_Apache (lsb:apache2): Started pilot01-node2
>     res_ClusterMonitor (ocf::pacemaker:ClusterMon): Started pilot01-node2
>     res_Nagios (lsb:nagios3): Stopped
> Master/Slave Set: ms_drbd_mysql0
>     Masters: [ pilot01-node2 ]
>     Stopped: [ drbd_pilot0:0 ]
> Clone Set: cl-pinggw
>     Started: [ pilot01-node2 ]
>     Stopped: [ pinggw:0 ]
> Monitor-Cluster (ocf::pacemaker:ClusterMon): Started pilot01-node1 (unmanaged) FAILED
>
> Failed actions:
>     Monitor-Cluster_stop_0 (node=pilot01-node1, call=34, rc=1, status=complete): unknown error
>     res_Nagios_monitor_0 (node=pilot01-node1, call=84, rc=6, status=complete): not configured
>
> #########################
> ### 4. Cluster Config ###
> #########################
>
> node pilot01-node1 \
>     attributes standby="on"
> node pilot01-node2 \
>     attributes standby="off"
> primitive Monitor-Cluster ocf:pacemaker:ClusterMon \
>     params htmlfile="/mnt/cluster/var/www/cluster-monitor.html" \
>     params pidfile="/var/run/rlb-cluster-monitor.pid" \
>     op start interval="0" timeout="90s" \
>     op stop interval="0" timeout="100s"
> primitive drbd_pilot0 ocf:linbit:drbd \
>     params drbd_resource="pilot0" \
>     operations $id="drbd_pilot0-operations" \
>     op monitor interval="15s"
> primitive pinggw ocf:pacemaker:pingd \
>     params host_list="10.1.1.162" multiplier="200" \
>     op monitor interval="10s"
> primitive res_Apache lsb:apache2 \
>     operations $id="res_Apache-operations" \
>     op monitor interval="15s" timeout="20s" start-delay="15s"
> primitive res_ClusterIP ocf:heartbeat:IPaddr2 \
>     params iflabel="ClusterIP" ip="10.1.1.12" nic="eth0" cidr_netmask="24" \
>     operations $id="res_ClusterIP_1-operations" \
>     op monitor start-delay="0" interval="10s"
> primitive res_ClusterMonitor ocf:pacemaker:ClusterMon \
>     params htmlfile="/mnt/cluster/var/www/cluster-monitor.html" \
>     params pidfile="/var/run/rlb-cluster-monitor.pid" \
>     op start interval="0" timeout="90s" \
>     op stop interval="0" timeout="100s" \
>     meta target-role="Started"
> primitive res_Filesystem ocf:heartbeat:Filesystem \
>     params fstype="xfs" directory="/mnt/cluster" device="/dev/drbd0" options="noatime,nodiratime,barrier=0"
> primitive res_MySQL lsb:mysql
> primitive res_Nagios lsb:nagios3 \
>     operations $id="res_Nagios-operations" \
>     op monitor interval="15s" timeout="20s" \
>     meta target-role="Started"
> group grp_MySQL res_Filesystem res_ClusterIP res_MySQL res_Apache res_ClusterMonitor res_Nagios
> ms ms_drbd_mysql0 drbd_pilot0 \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> clone cl-pinggw pinggw \
>     meta globally-unique="false"
> location drbd-fence-by-handler-ms_drbd_mysql0 ms_drbd_mysql0 \
>     rule $id="drbd-fence-by-handler-rule-ms_drbd_mysql0" $role="Master" -inf: #uname ne pilot01-node2
> location grp_MySQL-with-pinggw grp_MySQL \
>     rule $id="grp_MySQL-with-pinggw-rule-1" -inf: not_defined pingd or pingd lte 0
> colocation col_drbd_on_mysql inf: grp_MySQL ms_drbd_mysql0:Master
> order mysql_after_drbd inf: ms_drbd_mysql0:promote grp_MySQL:start
> property $id="cib-bootstrap-options" \
>     expected-quorum-votes="2" \
>     stonith-enabled="false" \
>     no-quorum-policy="ignore" \
>     dc-version="1.0.8-2c98138c2f070fcb6ddeab1084154cffbf44ba75" \
>     cluster-infrastructure="openais" \
>     last-lrm-refresh="1277306106" \
>     symmetric-cluster="true" \
>     migration-threshold="1" \
>     default-action-timeout="240s"
>
> Thanks for your help in advance.
> Sebastian
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
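
To make the "result: 3" check concrete, here is a minimal sketch of the exit codes LSB expects from the status action: 0 when the service is running, 1 when it is dead but its pid file still exists, and 3 when it is not running. The helper name lsb_status and the temporary pid file are hypothetical and only for illustration; this is not the actual nagios3 init script, which may determine its state differently.

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from the nagios3 init script.
# Maps the state of a pid file to the LSB exit codes for "status":
#   0 = running, 1 = dead but pid file exists, 3 = not running.
lsb_status() {
    pidfile="$1"
    if [ ! -f "$pidfile" ]; then
        return 3                      # no pid file: service is stopped
    fi
    pid=$(cat "$pidfile" 2>/dev/null)
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return 0                      # process named in the pid file is alive
    fi
    return 1                          # stale pid file left behind
}

# Demo against a temporary pid file instead of a real service.
tmp=$(mktemp)
echo $$ > "$tmp"                      # this shell's own pid counts as "running"
rc=0; lsb_status "$tmp" || rc=$?
echo "running: result: $rc"
rm -f "$tmp"                          # what "stop" should do with the pid file
rc=0; lsb_status "$tmp" || rc=$?
echo "stopped: result: $rc"
```

On the real node the equivalent check would be to stop the service and then run the status action followed by `echo "result: $?"`, assuming the init script lives at /etc/init.d/nagios3 (inferred from lsb:nagios3 in the config, not confirmed in this thread). Anything other than 3 at that point would explain why the probe fails.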