[Pacemaker] umount filesystem the node reboot

2011-01-19 Thread jiaju liu
I use Lustre in my HA cluster; sometimes when I umount an OST, the node reboots. Jan 20 09:23:10 oss1 kernel: Lustre: server umount testTwo-OST0003 complete Jan 20 10:28:56 oss1 kernel: Lustre: server umount testTwo-OST complete Jan 20 10:28:58 oss1 kernel: exit dynlocks cache Jan 20 10:29:11 oss

[Pacemaker] Xen on two node DRBD cluster with Pacemaker

2011-01-19 Thread Bart Coninckx
Hi all, could somebody point me to what is considered a sound way to offer Xen guests on a two node DRBD cluster in combination with Pacemaker? I prefer block devices over images for the DomU's. I understand that for live migration DRBD 8.3 is needed, but I'm not sure as to what kind of resourc

Re: [Pacemaker] Problem with rrp_mode

2011-01-19 Thread Mike Caldwell
> > On Wed, Jan 19, 2011 at 3:43 PM, Michael Schwartzkopff > wrote: > > Hi, > > > > I have two network cards and configured corosync-1.2.7 with > > rrp_mode: active > > > > Anybody being successful at all using rrp_mode with corosync? > > > > Greetings. > > > > -- > > Dr. Michael Schwartzkopff > >

Re: [Pacemaker] Best way to split a resource across two locations, four nodes

2011-01-19 Thread Andy Smith
Hi Andrew, On Wed, Jan 19, 2011 at 10:12:30AM +0100, Andrew Beekhof wrote: > On Tue, Jan 18, 2011 at 7:15 AM, Andy Smith wrote: > > Should an entire suite go dark (e.g. power fail of whole room), > > How do you know "they" had a power failure and "we" didn't lose our > own connectivity? > > As

Re: [Pacemaker] Problem with rrp_mode

2011-01-19 Thread Dan Frincu
Hi, Andrew Beekhof wrote: I don't think rrp is well tested by upstream. You might want to ask on the corosync ML to be sure. On Wed, Jan 19, 2011 at 3:43 PM, Michael Schwartzkopff wrote: Hi, I have two network cards and configured corosync-1.2.7 with rrp_mode: active at first corosync-cfg

Re: [Pacemaker] Problem with rrp_mode

2011-01-19 Thread Andrew Beekhof
I don't think rrp is well tested by upstream. You might want to ask on the corosync ML to be sure. On Wed, Jan 19, 2011 at 3:43 PM, Michael Schwartzkopff wrote: > Hi, > > I have two network cards and configured corosync-1.2.7 with > rrp_mode: active > > at first corosync-cfg -s tells me > Printing

[Pacemaker] Problem with rrp_mode

2011-01-19 Thread Michael Schwartzkopff
Hi, I have two network cards and configured corosync-1.2.7 with rrp_mode: active at first corosync-cfg -s tells me Printing ring status. Local node ID 1210452490 RING ID 0 id = 10.10.38.72 status = ring 0 active with no faults RING ID 1 id = 10.10.40.115

[Pacemaker] MySQL configuration/error Pacemaker

2011-01-19 Thread Shurbann Martes
Hi Pacemaker gurus, I have installed Pacemaker successfully and I have set up a cluster with the following configuration: node tsparxdb01 node tsparxdb02 primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip="192.168.1.254" cidr_netmask="32" \ op monitor interval="30s" \ meta target-role="Started"

Re: [Pacemaker] Master Slave resource

2011-01-19 Thread Andrew Beekhof
Actually this is the correct patch. Sorry for the delay. diff -r 3a1cab4892a4 pengine/master.c --- a/pengine/master.c Fri Jan 14 11:23:56 2011 +0100 +++ b/pengine/master.c Wed Jan 19 13:12:50 2011 +0100 @@ -299,10 +299,12 @@ static void master_promotion_order(resou * master instance sho

Re: [Pacemaker] Slowing down resource migration

2011-01-19 Thread Robert van Leeuwen
> > Is there any way to configure Pacemaker to migrate a resource at a rate of 1 > > resource every 2 minutes e.g.? > > No. But you could modify the resource agent and insert a 2 minute > sleep in the start action. Dejan, Thanks for the suggestion, I was also thinking about this; however, it woul

Re: [Pacemaker] Master Slave resource

2011-01-19 Thread Andrew Beekhof
On Wed, Jan 19, 2011 at 9:47 AM, Andrew Beekhof wrote: > On Sat, Dec 25, 2010 at 12:44 AM, ruslan usifov > wrote: >> >> >> 2010/12/20 Andrew Beekhof >>> >>> Actually the libxml2 guy and I figured out that problem just now. >>> I can't find any memory corruption, but on the plus side, it does see

Re: [Pacemaker] Slowing down resource migration

2011-01-19 Thread Dejan Muhamedagic
Hi, On Wed, Jan 19, 2011 at 11:22:23AM +0100, Robert van Leeuwen wrote: > Hi, > > Is it possible to slow down migration of resources? > > In this case a large amount of resources (vservers) are managed by Pacemaker. > The resource is started within a second, however a resource will create a >

Re: [Pacemaker] start resource timeout

2011-01-19 Thread Dejan Muhamedagic
On Wed, Jan 19, 2011 at 02:38:51PM +0800, jiaju liu wrote: > I use the Lustre filesystem in a cluster. By default, the start, stop, and monitor > operations in a Filesystem resource > time out after 20 sec. Since some mounts in Lustre require up to 5 minutes or > more, the default timeouts for these

[Pacemaker] Slowing down resource migration

2011-01-19 Thread Robert van Leeuwen
Hi, Is it possible to slow down migration of resources? In this case a large amount of resources (vservers) are managed by Pacemaker. The resource is started within a second, however a resource will create a tremendous load on the backend storage. Is there any way to configure Pacemaker to m

Re: [Pacemaker] Bug in ptest -s for promotion scores?

2011-01-19 Thread Andrew Beekhof
Might be too old, it was quite recent On Wed, Jan 19, 2011 at 10:37 AM, Michael Schwartzkopff wrote: > On Wednesday 19 January 2011 09:52:46 Andrew Beekhof wrote: >> On Wed, Jan 19, 2011 at 9:37 AM, Michael Schwartzkopff >> >> wrote: >> > Hi, >> > >> > I have pacemaker-1.0.10 on opensuse11.3 fro

Re: [Pacemaker] Bug in ptest -s for promotion scores?

2011-01-19 Thread Michael Schwartzkopff
On Wednesday 19 January 2011 09:52:46 Andrew Beekhof wrote: > On Wed, Jan 19, 2011 at 9:37 AM, Michael Schwartzkopff > > wrote: > > Hi, > > > > I have pacemaker-1.0.10 on opensuse11.3 from the clusterlabs repo > > installed. > > > > When I have a ms DRBD resource with a filesystem depending on

Re: [Pacemaker] Colocation of multistate resource and group

2011-01-19 Thread Evgeniy Ivanov
On Wed, Jan 19, 2011 at 10:20 AM, Florian Haas wrote: > On 01/18/2011 10:15 PM, Evgeniy Ivanov wrote: Is it expected (PM 1.0.3)? What's correct way to achieve this? >>> >>> Consider upgrading. >> >> Yeah, I will. But for now I need to make it work on 1.0.3. > > IIRC there were a _bunch_ of is

Re: [Pacemaker] Master/Slave DRBD switch caused some problems

2011-01-19 Thread Lars Ellenberg
On Tue, Dec 21, 2010 at 06:35:04PM +0100, Marc Wilmots wrote: > Hi, > > I have two nodes rspa and rspa2 (both Centos 5.3 32bits) with the following > packages: > > drbd83-8.3.8-1.el5.centos Not sure what exactly that is, but if it is equivalent to "8.3.8", not "8.3.8.1", as tagged in git, then t

Re: [Pacemaker] Best way to split a resource across two locations, four nodes

2011-01-19 Thread Andrew Beekhof
On Tue, Jan 18, 2011 at 7:15 AM, Andy Smith wrote: > Hi, > > After having only used heartbeat a few years ago I'm now starting to > look at the newer version, with pacemaker. I'm running the 3.0.3-2 > Debian packages on either lenny-backports or squeeze. > > This is all new to me so I'd be gratefu

Re: [Pacemaker] dampen / pingd - how to be sure pingd will be updated

2011-01-19 Thread Andrew Beekhof
On Tue, Jan 18, 2011 at 1:46 AM, Thomas Guthmann wrote: > Hey, > >>> This may be dumb or obvious but it took me a long time to understand why my >>> pingd with >>> dampen never got updated! So I think that this may be useful for everybody >>> to share. >>> In short: >>> >>>   " You MUST define a

Re: [Pacemaker] Bug in ptest -s for promotion scores?

2011-01-19 Thread Andrew Beekhof
On Wed, Jan 19, 2011 at 9:37 AM, Michael Schwartzkopff wrote: > Hi, > > I have pacemaker-1.0.10 on opensuse11.3 from the clusterlabs repo installed. > > When I have a ms DRBD resource with a filesystem depending on the master state > of the DRBD and look at the scores I get: > > # ptest -sL > (...

Re: [Pacemaker] Master Slave resource

2011-01-19 Thread Andrew Beekhof
On Sat, Dec 25, 2010 at 12:44 AM, ruslan usifov wrote: > > > 2010/12/20 Andrew Beekhof >> >> Actually the libxml2 guy and I figured out that problem just now. >> I can't find any memory corruption, but on the plus side, it does seem >> that 1.1 is unaffected - perhaps you could try that until I ma

Re: [Pacemaker] symmetric (anti-)collocation

2011-01-19 Thread Andrew Beekhof
On Tue, Dec 21, 2010 at 11:08 AM, Zrin Žiborski wrote: > Hi there, > > I've experimented a bit with pacemaker and as far as I can tell (without > looking > into the source code enough to distinguish a feature from a potential > problem), > the effect of >     colocation X-Y : X Y > is (sometimes s

[Pacemaker] Bug in ptest -s for promotion scores?

2011-01-19 Thread Michael Schwartzkopff
Hi, I have pacemaker-1.0.10 on opensuse11.3 from the clusterlabs repo installed. When I have a ms DRBD resource with a filesystem depending on the master state of the DRBD and look at the scores I get: # ptest -sL (...) resDRBD:0 promotion score on hakurs27: 10020 resDRBD:1 promotion score on h

Re: [Pacemaker] Master/Slave DRBD switch caused some problems

2011-01-19 Thread Andrew Beekhof
On Tue, Dec 21, 2010 at 6:35 PM, Marc Wilmots wrote: > Hi, > > I have two nodes rspa and rspa2 (both Centos 5.3 32bits) with the following > packages: > > drbd83-8.3.8-1.el5.centos > heartbeat-3.0.3-2.3.el5 > pacemaker-1.0.10-1.4.el5 > > rspa is stopped, and rspa2 has all the resources (IP, FileSy

Re: [Pacemaker] It affects it that the update of the attribute by attrd is late, and a resource starts with a standby node.

2011-01-19 Thread Andrew Beekhof
Catching up on old email... I see you've filed a bug for this one, I'll follow up there. On Thu, Dec 2, 2010 at 2:16 AM, wrote: > Hi Andrew, > >> > Step1) 192.168.40.3 addresses invalidate the understanding of ping. >> >> Not sure I understand this, can you rephrase? > > Sorry > > For pingd,

Re: [Pacemaker] [Linux-HA] Unordered groups (was Re: Is 'resource_set' still experimental?)

2011-01-19 Thread Andrew Beekhof
On Tue, Jan 18, 2011 at 1:42 PM, Florian Haas wrote: > On 01/18/2011 11:49 AM, RaSca wrote: >> As discussed yesterday on IRC with Andrew, there is no way of creating a >> group with independent resources. >> I was hoping that setting the options you mentioned can do the trick, >> but I've just tes

Re: [Pacemaker] pacemaker-1.0.10 - failure of resource causes migration of other resources

2011-01-19 Thread Andrew Beekhof
On Wed, Dec 29, 2010 at 11:15 PM, Nikola Ciprich wrote: > Hello, > while doing some tests of 1.0.10, I've noticed strange behaviour. It does sound strange. Could you file a bug for this please (so it doesn't get lost)? > I have lots of mutually unrelated resources (virtual machines) + some other