Hello all.

 

I'm implementing a 2-node cluster using Corosync/Pacemaker/DRBD/OCFS2
to provide a dual-primary shared filesystem.

 

I've followed the instructions on the DRBD applications site and it
works really well.

 

However, if I 'pull the plug' on a node, the other node continues to
run the clones, but the filesystem is locked and inaccessible (the
monitor op succeeds for the Filesystem resource, but times out for the
o2cb resource).

 

If I do a clean reboot of one node, there are no problems and I can
continue to access the OCFS2 FS.

 

After I pull the plug:

 

Online: [ test-odp-02 ]
OFFLINE: [ test-odp-01 ]

Resource Group: Load-Balancing
     Virtual-IP-ODP     (ocf::heartbeat:IPaddr2):       Started test-odp-02
     Virtual-IP-ODPWS   (ocf::heartbeat:IPaddr2):       Started test-odp-02
     ldirectord (ocf::heartbeat:ldirectord):    Started test-odp-02
Master/Slave Set: ms_drbd_ocfs2 [p_drbd_ocfs2]
     Masters: [ test-odp-02 ]
     Stopped: [ p_drbd_ocfs2:1 ]
Clone Set: cl-odp [odp]
     Started: [ test-odp-02 ]
     Stopped: [ odp:1 ]
Clone Set: cl-odpws [odpws]
     Started: [ test-odp-02 ]
     Stopped: [ odpws:1 ]
Clone Set: cl_fs_ocfs2 [p_fs_ocfs2]
     Started: [ test-odp-02 ]
     Stopped: [ p_fs_ocfs2:1 ]
Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
     Started: [ test-odp-02 ]
     Stopped: [ g_ocfs2mgmt:1 ]

Failed actions:
    p_o2cb:0_monitor_10000 (node=test-odp-02, call=19, rc=-2, status=Timed Out): unknown exec error

 

 

test-odp-02:~ # mount
/dev/drbd0 on /opt/odp type ocfs2 (rw,_netdev,noatime,cluster_stack=pcmk)

test-odp-02:~ # ls /opt/odp
...just hangs forever...

 

If I then power test-odp-01 back on, everything fails back fine and the
ls command suddenly completes.
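
For anyone trying to reproduce this without losing a terminal to the hung ls, here is a rough sketch of how the check could be scripted. It is only an illustration: the function name probe_dir, the temporary status file and the 10 second default are my own placeholders, not part of the cluster setup. The listing runs in the background and is never waited on, so a listing that is stuck in the kernel should not block the caller as well.

```shell
#!/bin/sh
# probe_dir DIR [SECONDS]
# Hypothetical helper: reports whether "ls DIR" returns within SECONDS.
# Returns 0 if the listing finished (whatever its exit status), 1 if not.
probe_dir() {
    dir=$1
    limit=${2:-10}
    status_file=$(mktemp) || return 2

    # Run ls in a detached subshell. The status file only gets content
    # once ls has actually returned; all other output is discarded.
    ( ls "$dir"; echo "$?" >"$status_file" ) >/dev/null 2>&1 &

    waited=0
    while :; do
        if [ -s "$status_file" ]; then
            rc=$(cat "$status_file")
            rm -f "$status_file"
            echo "ls $dir finished with status $rc"
            return 0
        fi
        if [ "$waited" -ge "$limit" ]; then
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # The background ls is left alone here; it may still complete later
    # (and write to the status file) once the filesystem unblocks.
    echo "ls $dir still blocked after ${limit}s"
    return 1
}
```

If the behaviour is as described above, calling probe_dir /opt/odp 10 on test-odp-02 should report the listing as blocked while test-odp-01 is powered off, and as finished once it is back.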

 

It seems to me that OCFS2 is trying to talk to the node that has
disappeared and doesn't time out. Does anyone have any ideas? (attached
CRM and DRBD configs)

 

Many thanks.

 

Darren Mansell



 

Attachment: drbd.conf
Description: drbd.conf

node test-odp-01
node test-odp-02 \
        attributes standby="off"
primitive Virtual-IP-ODP ocf:heartbeat:IPaddr2 \
        params lvs_support="true" ip="2.21.15.100" cidr_netmask="8" broadcast="2.255.255.255" \
        op monitor interval="1m" timeout="10s" \
        meta migration-threshold="10" failure-timeout="600"
primitive Virtual-IP-ODPWS ocf:heartbeat:IPaddr2 \
        params lvs_support="true" ip="2.21.15.103" cidr_netmask="8" broadcast="2.255.255.255" \
        op monitor interval="1m" timeout="10s" \
        meta migration-threshold="10" failure-timeout="600"
primitive ldirectord ocf:heartbeat:ldirectord \
        params configfile="/etc/ha.d/ldirectord.cf" \
        op monitor interval="2m" timeout="20s" \
        meta migration-threshold="10" failure-timeout="600"
primitive odp lsb:odp \
        op monitor interval="10s" enabled="true" timeout="10s" \
        meta migration-threshold="10" failure-timeout="600"
primitive odpws lsb:odpws \
        op monitor interval="10s" enabled="true" timeout="10s" \
        meta migration-threshold="10" failure-timeout="600"
primitive p_controld ocf:pacemaker:controld \
        op monitor interval="10s" enabled="true" timeout="10s" \
        meta migration-threshold="10" failure-timeout="600"
primitive p_drbd_ocfs2 ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="10s" enabled="true" timeout="10s" \
        meta migration-threshold="10" failure-timeout="600"
primitive p_fs_ocfs2 ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/r0" directory="/opt/odp" fstype="ocfs2" options="rw,noatime" \
        op monitor interval="10s" enabled="true" timeout="10s" \
        meta migration-threshold="10" failure-timeout="600"
primitive p_o2cb ocf:ocfs2:o2cb \
        op monitor interval="10s" enabled="true" timeout="10s" \
        meta migration-threshold="10" failure-timeout="600"
group Load-Balancing Virtual-IP-ODP Virtual-IP-ODPWS ldirectord
group g_ocfs2mgmt p_controld p_o2cb
ms ms_drbd_ocfs2 p_drbd_ocfs2 \
        meta master-max="2" clone-max="2" notify="true"
clone cl-odp odp
clone cl-odpws odpws
clone cl_fs_ocfs2 p_fs_ocfs2 \
        meta target-role="Started"
clone cl_ocfs2mgmt g_ocfs2mgmt \
        meta interleave="true"
location Prefer-Node1 ldirectord \
        rule $id="prefer-node1-rule" 100: #uname eq test-odp-01
order o_ocfs2 inf: ms_drbd_ocfs2:promote cl_ocfs2mgmt:start cl_fs_ocfs2:start
order tomcatlast1 inf: cl_fs_ocfs2 cl-odp
order tomcatlast2 inf: cl_fs_ocfs2 cl-odpws
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-5bd2b9154d7d9f86d7f56fe0a74072a5a6590c60" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        start-failure-is-fatal="false" \
        stonith-action="reboot" \
        stonith-enabled="false" \
        last-lrm-refresh="1317207361"
