Re: [Pacemaker] Cluster Volume Group is stuck

2011-05-12 Thread Karl Rößmann
Thank you, that was the solution. Our stonith-timeout is now 160s. Our SBD timeouts are still:

Timeout (watchdog) : 60
Timeout (msgwait) : 120

Yes, they are long, to avoid any problems with the multipath driver. We found similar recommended values in the latest SuSE SLES HA Guide.

Karl

On 2011-
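The fix in this message comes down to the relationship between three timeouts. A minimal sketch of that sanity check follows, assuming two commonly cited rules of thumb that are not stated in this thread: msgwait should be at least twice the watchdog timeout, and stonith-timeout should be at least msgwait plus 20%. The function name `check_sbd_timeouts` is hypothetical.

```python
def check_sbd_timeouts(watchdog_s, msgwait_s, stonith_timeout_s):
    """Return a list of warnings; an empty list means both rules of thumb hold.

    Assumed rules (not taken from this thread):
      - msgwait >= 2 * watchdog
      - stonith-timeout >= 1.2 * msgwait
    """
    warnings = []
    if msgwait_s < 2 * watchdog_s:
        warnings.append(
            f"msgwait ({msgwait_s}s) is less than twice watchdog ({watchdog_s}s)"
        )
    # Integer form of stonith_timeout < 1.2 * msgwait, to avoid float rounding.
    if 5 * stonith_timeout_s < 6 * msgwait_s:
        warnings.append(
            f"stonith-timeout ({stonith_timeout_s}s) is less than "
            f"msgwait ({msgwait_s}s) plus 20%"
        )
    return warnings


# Values reported in this thread: watchdog 60, msgwait 120.
print(check_sbd_timeouts(60, 120, 160))  # after the fix (stonith-timeout 160s)
print(check_sbd_timeouts(60, 120, 60))   # before the fix (stonith-timeout 60s)
```

With the original stonith-timeout of 60s, the cluster would give up on the fencing operation before SBD's msgwait of 120s had elapsed, which is consistent with the stuck volume group described in the thread.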

Re: [Pacemaker] Cluster Volume Group is stuck

2011-05-12 Thread Lars Marowsky-Bree
On 2011-05-12T15:16:52, Karl Rößmann wrote:

> This is an Update to my last Mail:
>
> SBD is running on one Node normally:

I didn't mean to ask about the external/sbd fencing agent, but about the system daemon "sbd", as configured via /etc/sysconfig/sbd and started (automatically) via /etc/init.d/o

Re: [Pacemaker] Cluster Volume Group is stuck

2011-05-12 Thread Karl Rößmann
This is an update to my last mail:

SBD is running on one node normally:

Online: [ multix246 multix244 multix245 ]
Clone Set: dlm_clone [dlm]
    Started: [ multix244 multix245 multix246 ]
Clone Set: clvm_clone [clvm]
    Started: [ multix244 multix245 multix246 ]
Clone Set: vgsmet_clone [v

Re: [Pacemaker] Cluster Volume Group is stuck

2011-05-12 Thread Karl Rößmann
This time I switched off multix246. After some time I have:

sbd -d /dev/disk/by-id/scsi-3600a0b8000420d5a1cf14dc3a9a2-part1 list
0 multix244 clear
1 multix245 clear
2 multix246 reset multix245

In our configuration SBD is running on one node only, we
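To make the `sbd ... list` output above easier to read, here is a minimal sketch that parses it and reports slots with a pending message. It assumes whitespace-separated columns in the order slot number, node name, message, and (optionally) the node that wrote the message. The function name `parse_sbd_list` is hypothetical.

```python
def parse_sbd_list(output):
    """Parse `sbd -d <device> list` output into a list of dicts.

    Assumed column layout: slot, node, message, optional sender.
    Lines that do not start with a slot number are skipped.
    """
    slots = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        slots.append({
            "slot": int(fields[0]),
            "node": fields[1],
            "message": fields[2],
            "sender": fields[3] if len(fields) > 3 else None,
        })
    return slots


# Output as quoted in this message.
sample = """0 multix244 clear
1 multix245 clear
2 multix246 reset multix245
"""

pending = [s for s in parse_sbd_list(sample) if s["message"] != "clear"]
for s in pending:
    print(f"slot {s['slot']}: {s['node']} has '{s['message']}' from {s['sender']}")
```

Under that reading, the slot for multix246 still holds a reset request written by multix245, while the other two slots are clear.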

Re: [Pacemaker] Cluster Volume Group is stuck

2011-05-12 Thread Lars Marowsky-Bree
On 2011-05-12T09:51:21, Karl Rößmann wrote:

> Hi David,
>
> startup-fencing is true
> stonith is enabled
> stonith-timeout is 60s
> stonith-action is reboot
>
> We have a Fibre Channel SAN with multipath driver as common device
> for the Volume Groups.
>
> I have SBD Stonith
> -

Re: [Pacemaker] Cluster Volume Group is stuck

2011-05-12 Thread Karl Rößmann
Hi David,

startup-fencing is true
stonith is enabled
stonith-timeout is 60s
stonith-action is reboot

We have a Fibre Channel SAN with multipath driver as common device for the Volume Groups.

I have SBD Stonith
---
This is the SBD Setting:
--
multix244:~ # s

Re: [Pacemaker] Cluster Volume Group is stuck

2011-05-11 Thread David Coulson
On 5/11/11 8:07 AM, Karl Rößmann wrote:

We have a three-node cluster with a Cluster Volume Group vgsmet. After powering off one node, the Volume Group is stuck. One of the ERROR messages is:

May 11 10:50:32 multix244 crmd: [8086]: ERROR: process_lrm_event: LRM operation vgsmet:0_monitor_600