Hi,

On Sep 12, 2012, at 6:28 PM, Lars Marowsky-Bree wrote:

> On 2012-09-12T18:01:25, Waldemar Brodkorb <m...@waldemar-brodkorb.de> wrote:
>
>> Is there no way to handle a power outage of xen01 (virtual box poweroff
>> button), when stonith is disabled?
>> Actually xvm-01 resource can not be started on xen02, because /cluster is
>> not accessible on xen02.
>> (ls -la /cluster is hanging endlessly, it works when I power on xen01 again)
>
> You can use a manual fencing ACK.
What does this mean?

In the meantime I found the -f 0 option for dlm_controld.pcmk. After activating this option in the OCF script "controld" and restarting both nodes, I can finally recover from a power outage of one node. No OCFS2 hangs anymore.

I can now put the node that runs the virtual machine resource into standby, and the virtual machine is automatically started on the other node. Bringing it back online works, too. When powering the machine off, failover also works. But when the dead machine comes back, I get the following error when trying to mount the cluster filesystem:

root@xen01:~# mount /dev/drbd/by-res/cluster-ocfs /cluster
mount.ocfs2: Transport endpoint is not connected while mounting /dev/drbd0 on
/cluster. Check 'dmesg' for more information on this error.
root@xen01:~# dmesg
[ 394.187654] dlm: no local IP address has been set
[ 394.188460] dlm: cannot start dlm lowcomms -107
[ 394.191062] (mount.ocfs2,4647,0):ocfs2_dlm_init:3001 ERROR: status = -107
[ 394.194428] (mount.ocfs2,4647,0):ocfs2_mount_volume:1879 ERROR: status = -107
[ 394.201157] ocfs2: Unmounting device (147,0) on (node 0)
[ 394.201167] (mount.ocfs2,4647,0):ocfs2_fill_super:1234 ERROR: status = -107
root@xen01:~# /etc/init.d/corosync stop
Stopping corosync daemon: corosync.
root@xen01:~# /etc/init.d/corosync start
Starting corosync daemon: corosync.
root@xen01:~# mount |grep cluster
root@xen01:~#
root@xen01:~# crm resource list|grep Mount
 Clone Set: Cluster-FS-Mount-Clone [Cluster-FS-Mount]
root@xen01:~# crm resource cleanup Cluster-FS-Mount-Clone
Cleaning up Cluster-FS-Mount:0 on xen01
Cleaning up Cluster-FS-Mount:0 on xen02
Cleaning up Cluster-FS-Mount:1 on xen01
Cleaning up Cluster-FS-Mount:1 on xen02
Waiting for 5 replies from the CRMd..... OK
root@xen01:~# mount |grep cluster
/dev/drbd0 on /cluster type ocfs2 (rw,relat

Strange, isn't it? But maybe you are right: playing around with OCFS2 without fencing is not worth the pain.

BTW: crm_gui is running fine on Mac OS X (hackishly compiled, but working).

best regards
 Waldemar

> But I'd not even bother with OCFS2 (or GFS2, for that matter) if you
> don't have fencing. It's not worth the pain.
>
> You could use SBD, but since you're running OCFS2 on top of DRBD, you
> can't. For your lab setup though you could use a third VM with an iSCSI
> target as the storage back-end.
>
> Regards,
> Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
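P.S.: The -f 0 flag can probably also be kept in the cluster configuration instead of patching the controld OCF script, by passing it through the resource agent's parameters. A minimal, untested sketch, assuming the ocf:pacemaker:controld agent in this installation exposes the args and daemon parameters (the resource names p-dlm and cl-dlm are made up here):

    # define the DLM control daemon resource and pass the extra flag
    crm configure primitive p-dlm ocf:pacemaker:controld \
        params daemon="dlm_controld.pcmk" args="-f 0" \
        op monitor interval="60s" timeout="60s"
    # clone it so it runs on every node
    crm configure clone cl-dlm p-dlm meta interleave="true"

Keeping the option in the CIB rather than in the script would have the advantage that it survives package updates of the resource agent.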