Re: [Pacemaker] Release candidate: 1.1.10-rc3

2013-06-04 Thread Andrew Beekhof
On 23/05/2013, at 12:33 PM, Andrew Beekhof wrote: > Please keep the bug reports coming in. There is a good chance that > this will be the final release candidate and 1.1.10 will be tagged on > May 30th. I am delaying rc4 until we can get definitive closure on the crmd memory leak(s). Valgrin

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Denis Witt
On 05.06.2013, at 04:04, Andrew Beekhof wrote: But no resources are started, so I suspect there really is quorum. >>> >>> Can you send me the output of cibadmin -Ql please? >>> Perhaps those two resources are blocked for other reasons. > > It looks like you may have hit a bug in an older

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 12:15 PM, Denis Witt wrote: > > On 05.06.2013, at 04:04, Andrew Beekhof wrote: > > But no resources are started, so I suspect there really is quorum. Can you send me the output of cibadmin -Ql please? Perhaps those two resources are blocked for other rea

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 11:55 AM, Denis Witt wrote: > > On 05.06.2013, at 03:34, Andrew Beekhof wrote: > >>> But no resources are started, so I suspect there really is quorum. >> >> Can you send me the output of cibadmin -Ql please? >> Perhaps those two resources are blocked for other reasons. I

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Denis Witt
On 05.06.2013, at 03:34, Andrew Beekhof wrote: >> But no resources are started, so I suspect there really is quorum. > > Can you send me the output of cibadmin -Ql please? > Perhaps those two resources are blocked for other reasons. Hi Andrew, here we go:

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 10:43 AM, Denis Witt wrote: > > On 05.06.2013, at 02:15, Andrew Beekhof wrote: > >>> Jun 5 01:11:06 test4 pengine: [18625]: WARN: cluster_status: We do not >>> have quorum - fencing and resource management disabled >>> Jun 5 01:11:06 test4 pengine: [18625]: notice: LogAc

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Denis Witt
On 05.06.2013, at 02:15, Andrew Beekhof wrote: >> Jun 5 01:11:06 test4 pengine: [18625]: WARN: cluster_status: We do not have >> quorum - fencing and resource management disabled >> Jun 5 01:11:06 test4 pengine: [18625]: notice: LogActions: Start >> pingtest:0#011(test4 - blocked) >> Jun

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 9:22 AM, Denis Witt wrote: > > On 05.06.2013, at 00:52, Andrew Beekhof wrote: > >>> been restored the resources aren't restarted. Running crm_resource -P >>> brings everything up, but of course it would be nice if this happens >>> automatically. Is there any way to achieve th

Re: [Pacemaker] [Problem] The state of a node cut with the node that rebooted by a cluster is not recognized.

2013-06-04 Thread renayama19661014
Hi Andrew, > Yep, sounds like a problem. > I'll follow up on bugzilla All right! Many Thanks! Hideo Yamauchi. --- On Tue, 2013/6/4, Andrew Beekhof wrote: > > On 04/06/2013, at 3:00 PM, renayama19661...@ybb.ne.jp wrote: > > > > > It is right movement that recognize other nodes in a UNCLEAN

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Denis Witt
On 05.06.2013, at 00:52, Andrew Beekhof wrote: >> been restored the resources aren't restarted. Running crm_resource -P >> brings everything up, but of course it would be nice if this happens >> automatically. Is there any way to achieve this? > > It should happen automatically. > Logs? Hi Andre

Re: [Pacemaker] DRBD into standalone mode when failover

2013-06-04 Thread Andrew Beekhof
On 04/06/2013, at 11:35 PM, Weihua JIANG wrote: > Hi all, > > I want a typical active/passive mode HA solution. > > My Pacemaker configuration is as below: > 3 Nodes: > node Lezbxh0jl > node Ljn74rici > node L472nxxdy (standby) > The 3rd node L472nxxdy is only used for quorum election. So, I forc

Re: [Pacemaker] Recovery after lost quorum

2013-06-04 Thread Andrew Beekhof
On 05/06/2013, at 2:13 AM, Denis Witt wrote: > Hi List, > > I have a cluster with two nodes running services. To make the cluster > more reliable, I added a third node with no services (I didn't start > pacemaker there, only corosync). I can't use STONITH in my setup so I > chose no-quorum-pol

Re: [Pacemaker] owership of created symlink

2013-06-04 Thread Lars Ellenberg
On Tue, Jun 04, 2013 at 07:15:11PM +0200, andreas graeper wrote: > hi, > i tried, before starting dovecot+exim+fetchmail > > to create a symlink > /var/mail -> /mnt/mirror/var/mail > with ra ocf:heartbeat:symlink > > i changed target : > chmod 0775 > chown root.mail > > but i need write perm

[Pacemaker] owership of created symlink

2013-06-04 Thread andreas graeper
hi, i tried, before starting dovecot+exim+fetchmail to create a symlink /var/mail -> /mnt/mirror/var/mail with ra ocf:heartbeat:symlink i changed target : chmod 0775 chown root.mail but i need write permission to /var/mail cause exim wants to create a lock file i tried to manually chown -h

[Pacemaker] Recovery after lost quorum

2013-06-04 Thread Denis Witt
Hi List, I have a cluster with two nodes running services. To make the cluster more reliable, I added a third node with no services (I didn't start pacemaker there, only corosync). I can't use STONITH in my setup so I chose no-quorum-policy=stop to avoid data corruption on my DRBD resources. The s

Re: [Pacemaker] Shutdown of pacemaker service takes 20 minutes

2013-06-04 Thread Johan Huysmans
Hi, Adding a timeout for the stop() operation worked like a charm. Thanks for the input! gr. Johan On 30-05-13 14:20, Florian Crouzat wrote: On 30/05/2013 13:57, Johan Huysmans wrote: When my resource has received the stop command, it will stop, but this takes some time. When the status

Re: [Pacemaker] Troube mounting filesystem (DRBD)

2013-06-04 Thread emmanuel segura
Hello Denis, I'm glad you solved it. 2013/6/4 Denis Witt > On Tue, 4 Jun 2013 15:38:57 +0200 > Denis Witt wrote: > > > I'm trying to set up an Apache/DRBD cluster, but the Filesystem isn't > > mounted. crm status always tells me "not installed" as status for the > > filesystem primitive. Mounting th

Re: [Pacemaker] Troube mounting filesystem (DRBD)

2013-06-04 Thread Denis Witt
On Tue, 4 Jun 2013 15:38:57 +0200 Denis Witt wrote: > I'm trying to set up an Apache/DRBD cluster, but the Filesystem isn't > mounted. crm status always tells me "not installed" as status for the > filesystem primitive. Mounting the filesystem by hand works fine. Hi List, I got it fixed. fuser wa

Re: [Pacemaker] Troube mounting filesystem (DRBD)

2013-06-04 Thread Denis Witt
On Tue, 4 Jun 2013 15:48:28 +0200 emmanuel segura wrote: > Did you try to mount the filesystem manually, without the cluster? Hi Emmanuel, yes, I did, works fine. Best regards Denis Witt ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org htt

Re: [Pacemaker] lsb resource manager

2013-06-04 Thread Florian Crouzat
On 04/06/2013 11:55, andreas graeper wrote: but pacemaker should realize that lsb:xxx did not start ?! what is to do ? maybe the init scripts return is not correct ?! Check your init script for LSB compliance: http://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html/Pacemaker_Explained

Re: [Pacemaker] more newbie questions

2013-06-04 Thread Florian Crouzat
On 04/06/2013 06:44, Alex Samad - Yieldbroker wrote: but I am looking for info on the op start op monitor op stop where can I find that. Googling shows these in examples but doesn't explain them. Operations are not tied to a resource agent but are generic: start, stop, monitor, and eventual

Re: [Pacemaker] Troube mounting filesystem (DRBD)

2013-06-04 Thread emmanuel segura
Hello Denis, Did you try to mount the filesystem manually, without the cluster? Thanks 2013/6/4 Denis Witt > Hi List, > > I'm trying to set up an Apache/DRBD cluster, but the Filesystem isn't > mounted. crm status always tells me "not installed" as status for the > filesystem primitive. Mountin

[Pacemaker] Troube mounting filesystem (DRBD)

2013-06-04 Thread Denis Witt
Hi List, I'm trying to set up an Apache/DRBD cluster, but the Filesystem isn't mounted. crm status always tells me "not installed" as status for the filesystem primitive. Mounting the filesystem by hand works fine. Here is my config: root@test3:~# crm configure show node test3 node test4 primitive

[Pacemaker] DRBD into standalone mode when failover

2013-06-04 Thread Weihua JIANG
Hi all, I want a typical active/passive mode HA solution. My Pacemaker configuration is as below: 3 Nodes: node Lezbxh0jl node Ljn74rici node L472nxxdy (standby) The 3rd node L472nxxdy is only used for quorum election. So, I forced it to enter standby mode to avoid resources being migrated to it. The reso

Re: [Pacemaker] different os on nodes

2013-06-04 Thread Michael Schwartzkopff
On Tuesday, 4 June 2013, 12:00:25, andreas graeper wrote: > hi, > target-system are identical machines with centos, but i am playing on > debian-lubuntu to learn/test. > debian: nfs-common + nfs-kernel-server > ubuntu: nfs > > is there a way to handle such differences, or is it simply a bad ide

Re: [Pacemaker] Pacemaker still may include memory leaks

2013-06-04 Thread Yuichi SEINO
2013/6/4 Andrew Beekhof : > > On 03/06/2013, at 8:55 PM, Yuichi SEINO wrote: > >> Hi, >> >> I ran the test after we updated pacemaker. >> >> I tested the same way as in the previous test. However, I think that a >> memory leak may still be occurring. >> >> I attached the result (smaps and crm_mon and e

[Pacemaker] different os on nodes

2013-06-04 Thread andreas graeper
hi, target-system are identical machines with centos, but i am playing on debian-lubuntu to learn/test. debian: nfs-common + nfs-kernel-server ubuntu: nfs is there a way to handle such differences, or is it simply a bad idea to connect different systems? thanks in advance andreas

[Pacemaker] lsb resource manager

2013-06-04 Thread andreas graeper
hi, i am on debian7. nfs-kernel-server is not started because there are no exports. pacemaker does not realize this and reports nfs-common and nfs-kernel-server as started, but the ocf:heartbeat:exportfs monitor is in the fault list. i changed the nfs-kernel-server init script: if there is /etc/exports.d/force then

Re: [Pacemaker] [Problem] The state of a node cut with the node that rebooted by a cluster is not recognized.

2013-06-04 Thread Andrew Beekhof
On 04/06/2013, at 3:00 PM, renayama19661...@ybb.ne.jp wrote: > > It is right movement that recognize other nodes in a UNCLEAN state in the > node that rebooted, but seems to recognize it by mistake. > > It is like the problem of Pacemaker somehow or other. > * There seems to be the problem wit