Hi All,

First of all, thanks for the brilliant documentation at clusterlabs and the alteeve.ca tutorials! They helped me out a lot.

I am relatively new to pacemaker but come from a Solaris background with cluster experience, and I am now trying to get on board with pacemaker.

I have set up a 2-node cluster with a shared LUN using pacemaker, cman, dlm, clvmd and gfs2. I have configured 2 stonith devices, each able to fence either node.

The issue I have is that when I test an unclean shutdown of the second node, pacemaker goes ahead and fences the second node, but clvmd then goes into a failed state on node 1, and node 1 then fences itself (shuts down).

I suspect it has something to do with my setting on-fail=fence for the dlm/clvmd resources (RAs). DLM appears to be fine, but clvmd is the one that goes into a failed state. I suspect I have a timeout issue here but, being new to pacemaker, I cannot see where. I am hoping a fresh pair of eyes can spot where I am going wrong.
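
For illustration only, this is roughly where on-fail and the operation timeout live in the CIB for a resource like clvmd. It is a sketch, not a paste of my actual configuration: the ids, the ocf:heartbeat:clvm agent and the interval and timeout values are all assumptions.

```xml
<!-- Sketch only: ids, agent and time values are assumptions, not my real CIB -->
<primitive id="clvmd" class="ocf" provider="heartbeat" type="clvm">
  <operations>
    <!-- on-fail="fence": if this monitor fails, the node running it is fenced -->
    <!-- timeout: how long the operation may run before it is treated as failed -->
    <op id="clvmd-monitor-30s" name="monitor" interval="30s" timeout="90s" on-fail="fence"/>
  </operations>
</primitive>
```

If a monitor like this times out on node 1 while the second node is being fenced, on-fail=fence would explain node 1 being fenced as well, which is why I suspect the timeouts.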

I am running CentOS 6.5 in VMware, using the fence_vmware_soap stonith agent. Pacemaker is at version 1.1.10-14, and CMAN is at version 3.0.12.1-59.

I used the following tutorial to assist me in setting up dlm/clvmd/gfs2 on CentOS 6.5 (in case it helps with the debugging):

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/7-Beta/html/Global_File_System_2/ch-clustsetup-GFS2.html

Any assistance, tips, tricks, comments or criticisms are all welcome.

I have attached my cluster.conf in case it is required; some node name obfuscation has been done. If you need any additional info, please don't hesitate to ask.

Thanks

<cluster config_version="12" name="sftp-cluster">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="test01" nodeid="1">
      <altname name="test01-alt"/>
      <fence>
        <method name="pcmk-redirect">
          <device delay="15" name="pcmk" port="test01"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="test02" nodeid="2">
      <altname name="test02-alt"/>
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="test02"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman keyfile="/etc/corosync/authkey" port="5405" transport="udpu"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
  <totem rrp_mode="active"/>
</cluster>
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org