Hi list,

I'm having trouble getting OCFS2 running. If I start everything by hand, the OCFS2 volume works quite well, but the cluster integration doesn't work at all.

The Status:

============
Last updated: Tue Jun 25 17:00:49 2013
Last change: Tue Jun 25 16:58:03 2013 via crmd on test4
Stack: openais
Current DC: test4 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
16 Resources configured.
============

Node test4: standby
Online: [ test4-node1 test4-node2 ]

 Master/Slave Set: ms_drbd [drbd]
     Masters: [ test4-node1 test4-node2 ]
 Clone Set: clone_pingtest [pingtest]
     Started: [ test4-node2 test4-node1 ]
     Stopped: [ pingtest:2 ]

Failed actions:
    p_o2cb:0_monitor_0 (node=test4-node2, call=20, rc=5, status=complete): not installed
    p_o2cb:1_monitor_0 (node=test4-node1, call=20, rc=5, status=complete): not installed
    drbd:0_monitor_0 (node=test4, call=98, rc=5, status=complete): not installed
    p_controld:0_monitor_0 (node=test4, call=99, rc=5, status=complete): not installed
    p_o2cb:0_monitor_0 (node=test4, call=100, rc=5, status=complete): not installed

My Config:

node test4 \
    attributes standby="on"
node test4-node1
node test4-node2
primitive apache ocf:heartbeat:apache \
    params configfile="/etc/apache2/apache2.conf" \
    op monitor interval="10" timeout="15" \
    meta target-role="Started"
primitive drbd ocf:linbit:drbd \
    params drbd_resource="drbd0"
primitive fs_drbd ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/var/www" fstype="ocfs2"
primitive p_controld ocf:pacemaker:controld
primitive p_o2cb ocf:pacemaker:o2cb
primitive pingtest ocf:pacemaker:ping \
    params multiplier="1000" host_list="10.0.0.1" \
    op monitor interval="5s"
primitive sip ocf:heartbeat:IPaddr2 \
    params ip="10.0.0.18" nic="eth0" \
    op monitor interval="10" timeout="20" \
    meta target-role="Started"
group g_ocfs2mgmt p_controld p_o2cb
group grp_all sip apache
ms ms_drbd drbd \
    meta master-max="2" clone-max="2"
clone cl_fs_ocfs2 fs_drbd \
    meta target-role="Started"
clone cl_ocfs2mgmt g_ocfs2mgmt \
    meta interleave="true"
clone clone_pingtest pingtest
location loc_all_on_best_ping grp_all \
    rule $id="loc_all_on_best_ping-rule" -inf: not_defined pingd or pingd lt 1000
colocation c_ocfs2 inf: cl_fs_ocfs2 cl_ocfs2mgmt ms_drbd:Master
colocation coloc_all_on_drbd inf: grp_all ms_drbd:Master
order order_all_after_drbd inf: ms_drbd:promote cl_ocfs2mgmt:start cl_fs_ocfs2:start grp_all:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="3" \
    stonith-enabled="false" \
    default-resource-stickiness="100" \
    maintenance-mode="false" \
    last-lrm-refresh="1372172283"

test4 is a quorum node. My system is Debian Wheezy. I installed the following packages: dlm-pcmk, ocfs2-tools, ocfs2-tools-pacemaker, openais

My drbd.conf:

### Global settings ###
global {
    # Participate in the usage statistics at usage.drbd.org?
    usage-count no;
}

### Options inherited by all resources ###
common {
    syncer {
        rate 33M;
    }
}

### Resource-specific options
resource drbd0 {
    # Protocol version
    protocol C;

    startup {
        # Timeout (in seconds) for establishing the connection at startup
        wfc-timeout 60;
        # Timeout (in seconds) for establishing the connection at startup
        # after data inconsistency was previously detected
        # ("degraded mode")
        degr-wfc-timeout 120;
        become-primary-on both;
    }

    disk {
        # Action on I/O errors: detach the drive
        on-io-error pass_on;
        fencing resource-only;
    }

    net {
        ### Various network options that are normally not needed; ###
        ### the HA link should generally be as fast as possible... ###
        # timeout 60;
        # connect-int 10;
        # ping-int 10;
        # max-buffers 2048;
        # max-epoch-size 2048;
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }

    syncer {
        # Speed of the HA link
        rate 33M;
    }

    on test4-node1 {
        ### Options for the master server ###
        # Name of the block device provided
        device /dev/drbd0;
        # Drive underlying the DRBD device
        disk /dev/xvda3;
        # Address and port used for synchronization
        address 10.0.2.18:7788;
        # Location of the metadata, here on the drive itself
        meta-disk internal;
    }

    on test4-node2 {
        ## Options for the slave server
        # Name of the block device provided
        device /dev/drbd0;
        # Drive underlying the DRBD device
        disk /dev/xvda3;
        # Address and port used for synchronization
        address 10.0.3.18:7788;
        # Location of the metadata, here on the drive itself
        meta-disk internal;
    }
}

My cluster.conf (I added this later to be able to run tunefs.ocfs2 --update-cluster-stack):

node:
    ip_port = 7777
    ip_address = 10.0.2.18
    number = 0
    name = test4-node1
    cluster = ocfs2

node:
    ip_port = 7777
    ip_address = 10.0.3.18
    number = 1
    name = test4-node2
    cluster = ocfs2

cluster:
    node_count = 2
    name = ocfs2

When I run crm resource cleanup cl_ocfs2mgmt, the following output is generated in the syslog:

Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_ais_dispatch: Update relayed from test4-node1
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_ais_dispatch: Update relayed from test4-node1
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_ais_dispatch: Update relayed from test4-node1
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_ais_dispatch: Update relayed from test4-node1
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_ais_dispatch: Update relayed from test4-node1
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_ais_dispatch: Update relayed from test4-node1
Jun 25 17:06:51 test4 cib: [28585]: info: apply_xml_diff: Digest mis-match: expected
495da536e77edc25bb5cc043ff9ec9b9, calculated 012c9d5cd9c0939c4d52a3ff9efcbbd9
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_process_diff: Diff 0.66.3 -> 0.66.4 not applied to 0.66.3: Failed application of an update diff
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.3 -> 0.66.4 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.3 -> 0.66.4 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.4 -> 0.66.5 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.5 -> 0.66.6 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.6 -> 0.66.7 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: info: cib_process_diff: Diff 0.66.6 -> 0.66.7 not applied to 0.66.3: current "num_updates" is less than required
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.7 -> 0.66.8 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.7 -> 0.66.8 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.8 -> 0.66.9 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.9 -> 0.66.10 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.10 -> 0.66.11 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: info: cib_process_diff: Diff 0.66.11 -> 0.66.12 not applied to 0.66.3: current "num_updates" is less than required
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.12 -> 0.66.13 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.13 -> 0.66.14 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.14 -> 0.66.15 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.15 -> 0.66.16 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.16 -> 0.66.17 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: info: cib_process_diff: Diff 0.66.17 -> 0.66.18 not applied to 0.66.3: current "num_updates" is less than required
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.18 -> 0.66.19 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.19 -> 0.66.20 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.20 -> 0.66.21 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.21 -> 0.66.22 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.22 -> 0.66.23 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: info: cib_process_diff: Diff 0.66.23 -> 0.66.24 not applied to 0.66.3: current "num_updates" is less than required
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.24 -> 0.66.25 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.25 -> 0.66.26 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.26 -> 0.66.27 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.27 -> 0.66.28 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.28 -> 0.66.29 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: info: cib_replace_notify: Replaced: -1.-1.-1 -> 0.66.29 from test4
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: pingd (1000)
Jun 25 17:06:51 test4 cib: [28585]: info: apply_xml_diff: Digest mis-match: expected d9449ea751be0805ea05b579edec0c5f, calculated 6a1ac691bc7bd96d64e6aae08e4abcc1
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-drbd:0 (1372172208)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_process_diff: Diff 0.66.29 -> 0.66.30 not applied to 0.66.29: Failed application of an update diff
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 crmd: [28590]: info: delete_resource: Removing resource p_controld:0 for 20836_crm_resource (internal) on test4-node1
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:0 (10000)
Jun 25 17:06:51 test4 crmd: [28590]: info: notify_deleted: Notifying 20836_crm_resource on test4-node1 that p_controld:0 was deleted
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:1 (10000)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.29 -> 0.66.30 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.30 -> 0.66.31 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.31 -> 0.66.32 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.32 -> 0.66.33 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.33 -> 0.66.34 (sync in progress)
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 cib: [28585]: info: cib_replace_notify: Replaced: -1.-1.-1 -> 0.66.34 from test4
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: pingd (1000)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-drbd:0 (1372172208)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:0 (10000)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:1 (10000)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Jun 25 17:06:51 test4 cib: [28585]: info: apply_xml_diff: Digest mis-match: expected 533575a1c8419df099d85489fad38c97, calculated 8484583447f5f2ffcda69692a601872a
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_process_diff: Diff 0.66.45 -> 0.66.46 not applied to 0.66.45: Failed application of an update diff
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 crmd: [28590]: info: delete_resource: Removing resource p_o2cb:0 for 20836_crm_resource (internal) on test4-node1
Jun 25 17:06:51 test4 crmd: [28590]: info: notify_deleted: Notifying 20836_crm_resource on test4-node1 that p_o2cb:0 was deleted
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 crmd: [28590]: notice: do_lrm_invoke: Not creating resource for a delete event: (null)
Jun 25 17:06:51 test4 crmd: [28590]: info: notify_deleted: Notifying 20836_crm_resource on test4-node1 that p_controld:1 was deleted
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.45 -> 0.66.46 (sync in progress)
Jun 25 17:06:51 test4 crmd: [28590]: notice: do_lrm_invoke: Not creating resource for a delete event: (null)
Jun 25 17:06:51 test4 crmd: [28590]: info: notify_deleted: Notifying 20836_crm_resource on test4-node1 that p_o2cb:1 was deleted
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 crmd: [28590]: notice: do_lrm_invoke: Not creating resource for a delete event: (null)
Jun 25 17:06:51 test4 crmd: [28590]: info: notify_deleted: Notifying 20836_crm_resource on test4-node1 that p_controld:2 was deleted
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.46 -> 0.66.47 (sync in progress)
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 crmd: [28590]: notice: do_lrm_invoke: Not creating resource for a delete event: (null)
Jun 25 17:06:51 test4 crmd: [28590]: info: notify_deleted: Notifying 20836_crm_resource on test4-node1 that p_o2cb:2 was deleted
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.47 -> 0.66.48 (sync in progress)
Jun 25 17:06:51 test4 crmd: [28590]: WARN: decode_transition_key: Bad UUID (crm-resource-20836) in sscanf result (3) for 0:0:crm-resource-20836
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.48 -> 0.66.49 (sync in progress)
Jun 25 17:06:51 test4 crmd: [28590]: info: ais_dispatch_message: Membership 24: quorum retained
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.49 -> 0.66.50 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: info: cib_process_diff: Diff 0.66.50 -> 0.66.51 not applied to 0.66.45: current "num_updates" is less than required
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.51 -> 0.66.52 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.52 -> 0.66.53 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.53 -> 0.66.54 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.54 -> 0.66.55 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.55 -> 0.66.56 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: info: cib_process_diff: Diff 0.66.56 -> 0.66.57 not applied to 0.66.45: current "num_updates" is less than required
Jun 25 17:06:51 test4 cib: [28585]: info: cib_server_process_diff: Requesting re-sync from peer
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.57 -> 0.66.58 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: notice: cib_server_process_diff: Not applying diff 0.66.58 -> 0.66.59 (sync in progress)
Jun 25 17:06:51 test4 cib: [28585]: info: cib_replace_notify: Replaced: -1.-1.-1 -> 0.66.59 from test4
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: pingd (1000)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-drbd:0 (1372172208)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:0 (10000)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:1 (10000)
Jun 25 17:06:51 test4 attrd: [28588]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Jun 25 17:06:51 test4 lrmd: [28587]: info: rsc:p_controld:0 probe[22] (pid 466)
Jun 25 17:06:51 test4 lrmd: [28587]: info: rsc:p_o2cb:0 probe[23] (pid 467)
Jun 25 17:06:52 test4 lrmd: [28587]: info: RA output: (p_controld:0:probe:stderr) dlm_controld.pcmk: no process found
Jun 25 17:06:52 test4 lrmd: [28587]: info: operation monitor[22] on p_controld:0 for client 28590: pid 466 exited with return code 7
Jun 25 17:06:52 test4 crmd: [28590]: info: process_lrm_event: LRM operation p_controld:0_monitor_0 (call=22, rc=7, cib-update=152, confirmed=true) not running
Jun 25 17:06:52 test4 o2cb[467]: ERROR: Wrong stack o2cb
Jun 25 17:06:52 test4 lrmd: [28587]: info: operation monitor[23] on p_o2cb:0 for client 28590: pid 467 exited with return code 5
Jun 25 17:06:52 test4 crmd: [28590]: info: process_lrm_event: LRM operation p_o2cb:0_monitor_0 (call=23, rc=5, cib-update=153, confirmed=true) not installed

Any ideas?

Best regards
Denis Witt

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org