Hi,
On Sep 12, 2012, at 6:28 PM, Lars Marowsky-Bree wrote:

> On 2012-09-12T18:01:25, Waldemar Brodkorb <m...@waldemar-brodkorb.de> wrote:
> 
>> Is there no way to handle a power outage of xen01 (VirtualBox power-off 
>> button) when stonith is disabled?
>> Currently the xvm-01 resource cannot be started on xen02, because /cluster is 
>> not accessible on xen02.
>> (ls -la /cluster hangs endlessly; it works again once I power xen01 back on)
> 
> You can use a manual fencing ACK.

What does this mean, exactly?
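Do you mean telling the cluster by hand that the node is really down? Just 
guessing from the stonith_admin man page (I have not verified the exact 
syntax), something like:

  stonith_admin --confirm xen01    # acknowledge that xen01 is safely down

?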

In the meantime I found the -f 0 option for dlm_controld.pcmk. After 
activating this option in the OCF resource agent "controld" and restarting 
both nodes, I can finally recover from a power outage of one node. No more 
OCFS2 hangs.
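For the record, instead of patching the agent script the same thing can 
probably be done via the agent's "args" parameter; roughly like this (resource 
and clone names are just examples, not verified on my setup):

  crm configure primitive p-dlm ocf:pacemaker:controld \
          params daemon="dlm_controld.pcmk" args="-f 0" \
          op monitor interval="60s" timeout="60s"
  crm configure clone cl-dlm p-dlm meta interleave="true"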

I can now set the node that runs the virtual machine resource to standby, and 
the virtual machine is automatically started on the other node. Bringing the 
node back online works, too.
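(The standby test is just the usual:

  crm node standby xen01    # VM migrates to xen02
  crm node online xen01     # bring the node back

with xen01 being whichever node currently runs the VM.)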

Failover also works when powering the machine off. But when the dead machine 
comes back, I get the following error message when trying to mount the cluster 
filesystem:
root@xen01:~# mount /dev/drbd/by-res/cluster-ocfs /cluster
mount.ocfs2: Transport endpoint is not connected while mounting /dev/drbd0 on 
/cluster. Check 'dmesg' for more information on this error.
root@xen01:~# dmesg
[  394.187654] dlm: no local IP address has been set
[  394.188460] dlm: cannot start dlm lowcomms -107
[  394.191062] (mount.ocfs2,4647,0):ocfs2_dlm_init:3001 ERROR: status = -107
[  394.194428] (mount.ocfs2,4647,0):ocfs2_mount_volume:1879 ERROR: status = -107
[  394.201157] ocfs2: Unmounting device (147,0) on (node 0)
[  394.201167] (mount.ocfs2,4647,0):ocfs2_fill_super:1234 ERROR: status = -107
root@xen01:~# /etc/init.d/corosync stop
Stopping corosync daemon: corosync.
root@xen01:~# /etc/init.d/corosync start
Starting corosync daemon: corosync.
root@xen01:~# mount |grep cluster
root@xen01:~# 
root@xen01:~# crm resource list|grep Mount
 Clone Set: Cluster-FS-Mount-Clone [Cluster-FS-Mount]
root@xen01:~# crm resource cleanup Cluster-FS-Mount-Clone
Cleaning up Cluster-FS-Mount:0 on xen01
Cleaning up Cluster-FS-Mount:0 on xen02
Cleaning up Cluster-FS-Mount:1 on xen01
Cleaning up Cluster-FS-Mount:1 on xen02
Waiting for 5 replies from the CRMd..... OK
root@xen01:~# mount |grep cluster
/dev/drbd0 on /cluster type ocfs2 (rw,relat

Strange, isn't it? But maybe you are right: playing around with OCFS2 without 
fencing is not worth the pain.
 
BTW: crm_gui runs fine on Mac OS X (hackishly compiled, but working).

best regards
 Waldemar

> But I'd not even bother with OCFS2 (or GFS2, for that matter) if you
> don't have fencing. It's not worth the pain.
> 
> You could use SBD, but since you're running OCFS2 on top of DRBD, you
> can't. For your lab setup though you could use a third VM with an iSCSI
> target as the storage back-end.
> 
> 
> Regards,
>    Lars
> 
> -- 
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, 
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> 
> 
> 


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
