Re: [Pacemaker] Timeout after nodejoin

2010-09-22 Thread Steven Dake
On 09/22/2010 05:43 AM, Dan Frincu wrote: Hi all, I have the following packages: # rpm -qa | grep -i "(openais|cluster|heartbeat|pacemaker|resource)" openais-0.80.5-15.2 cluster-glue-1.0-12.2 pacemaker-1.0.5-4.2 cluster-glue-libs-1.0-12.2 resource-agents-1.0-31.5 pacemaker-libs-1.0.5-4.2 pacema

Re: [Pacemaker] Connection to our AIS plugin (9) failed: Library error

2010-09-22 Thread Steven Dake
On 09/22/2010 04:02 AM, Szymon Hersztek wrote: Message written on 2010-09-22 at 10:26 by Andrew Beekhof: 2010/9/21 Szymon Hersztek : Message written on 2010-09-21 at 09:08 by Andrew Beekhof: 2010/9/21 Szymon Hersztek : Message written on 2010-09-2

Re: [Pacemaker] Help to configure a cluster with pacemaker

2010-09-22 Thread Josué Alcalde González
Ok. Now I understand how this works. The lsb script did not work correctly (because the conf files were on the drbd disk and it is only mounted on one server). Thank you very much. It is working now. On Tue, 21-09-2010 at 08:55 +0200, Andrew Beekhof wrote: > 2010/9/2 Josué Alcalde González

Re: [Pacemaker] Add a resource with commandline to an existing?group?

2010-09-22 Thread Andrew Beekhof
On Wed, Sep 22, 2010 at 4:34 PM, Dejan Muhamedagic wrote: > Hi, > > On Tue, Aug 31, 2010 at 12:13:15PM +, Rainer wrote: >> Dan Frincu writes: >> >> > >> > You can update the config by typing: crm configure >> > This puts you in the crm shell configure mode. Then you type in edit, >> > that op

Re: [Pacemaker] migration-threshold and failure-timeout

2010-09-22 Thread Andrew Beekhof
On Wed, Sep 22, 2010 at 2:12 PM, Michael Smith wrote: > On Wed, 22 Sep 2010, Andrew Beekhof wrote: > >> On Tue, Sep 21, 2010 at 3:28 PM, Vadym Chepkov wrote: >> > On Tue, Sep 21, 2010 at 9:14 AM, Dan Frincu wrote: > >> >> However I don't know of any automatic method to clear the failcount. >> >

Re: [Pacemaker] Add a resource with commandline to an existing?group?

2010-09-22 Thread Dejan Muhamedagic
Hi, On Tue, Aug 31, 2010 at 12:13:15PM +, Rainer wrote: > Dan Frincu writes: > > > > > You can update the config by typing: crm configure > > This puts you in the crm shell configure mode. Then you type in edit, > > that opens a vi session with the config, you edit the group entry by > >

Re: [Pacemaker] Timeout after nodejoin

2010-09-22 Thread Dejan Muhamedagic
Hi, On Wed, Sep 22, 2010 at 04:48:42PM +0300, Dan Frincu wrote: > Hi, > > Raoul Bhatia [IPAX] wrote: > >hi, > > > >On 09/22/2010 02:43 PM, Dan Frincu wrote: > >>When I start openais, I get nodejoin immediately, as seen in the logs > >>below. However, it takes some time before the nodes are visibl

Re: [Pacemaker] correct permissions for /var/lib/pengine

2010-09-22 Thread Dejan Muhamedagic
Hi, On Wed, Sep 22, 2010 at 12:44:45PM +0200, Raoul Bhatia [IPAX] wrote: > in my recent hb_report, i find: > > WARN: problem with permissions/ownership at wc01: > > wrong permissions or ownership for /var/lib/pengine: > > drwxr-xr-x 2 hacluster haclient 5038080 Jul 23 08:58 /var/lib/pengine > > WA

Re: [Pacemaker] Timeout after nodejoin

2010-09-22 Thread Dan Frincu
Hi, Raoul Bhatia [IPAX] wrote: hi, On 09/22/2010 02:43 PM, Dan Frincu wrote: When I start openais, I get nodejoin immediately, as seen in the logs below. However, it takes some time before the nodes are visible in crm_mon output. Any idea how to minimize this delay? Sep 22 15:27:24 bench1

Re: [Pacemaker] Timeout after nodejoin

2010-09-22 Thread Raoul Bhatia [IPAX]
hi, On 09/22/2010 02:43 PM, Dan Frincu wrote: > When I start openais, I get nodejoin immediately, as seen in the logs > below. However, it takes some time before the nodes are visible in > crm_mon output. Any idea how to minimize this delay? > > Sep 22 15:27:24 bench1 openais[12935]: [crm ] info

[Pacemaker] Re: Problems Installing Pacemaker and Heartbeat

2010-09-22 Thread Chen Stormstout
Raoul, Thank you for your attention. Here is the information you asked for, after removing madkiss. aptitude -t lenny-backports install heartbeat pacemaker still fails with the same error message. apt-cache policy cluster-glue pacemaker heartbeat cluster-glue: Installed: (none) Candidate: 1.

Re: [Pacemaker] Problems Installing Pacemaker and Heartbeat

2010-09-22 Thread Raoul Bhatia [IPAX]
hi, On 09/22/2010 02:52 PM, Chen Stormstout wrote: > Hi, > > Thanks for your post Raoul, but reading this tutorial is what I did in the first > place. ok. > # For the cluster > deb http://backports.debian.org/debian-backports lenny-backports main contrib > non-free > deb http://www.backports.org/de

Re: [Pacemaker] Problems Installing Pacemaker and Heartbeat

2010-09-22 Thread Chen Stormstout
Hi, Thanks for your post Raoul, but reading this tutorial is what I did in the first place. This is my entire sources.list: #deb cdrom:[Debian GNU/Linux 5.0.5 _Lenny_ - Official i386 DVD Binary-1 20100626-17:50]/ lenny contrib main deb cdrom:[Debian GNU/Linux 5.0.5 _Lenny_ - Official i386 DVD Binary

[Pacemaker] Problems Installing Pacemaker and Heartbeat

2010-09-22 Thread Chen Stormstout
Hi, Thanks for your post Raoul, but reading this tutorial is what I did in the first place. This is my entire sources.list: #deb cdrom:[Debian GNU/Linux 5.0.5 _Lenny_ - Official i386 DVD Binary-1 20100626-17:50]/ lenny contrib main deb cdrom:[Debian GNU/Linux 5.0.5 _Lenny_ - Official i386 DVD Binary

[Pacemaker] Timeout after nodejoin

2010-09-22 Thread Dan Frincu
Hi all, I have the following packages: # rpm -qa | grep -i "(openais|cluster|heartbeat|pacemaker|resource)" openais-0.80.5-15.2 cluster-glue-1.0-12.2 pacemaker-1.0.5-4.2 cluster-glue-libs-1.0-12.2 resource-agents-1.0-31.5 pacemaker-libs-1.0.5-4.2 pacemaker-mgmt-1.99.2-7.2 libopenais2-0.80.5-15.2

Re: [Pacemaker] migration-threshold and failure-timeout

2010-09-22 Thread Michael Smith
On Wed, 22 Sep 2010, Andrew Beekhof wrote: > On Tue, Sep 21, 2010 at 3:28 PM, Vadym Chepkov wrote: > > On Tue, Sep 21, 2010 at 9:14 AM, Dan Frincu wrote: > >> However I don't know of any automatic method to clear the failcount. > > in pacemaker 1.0 nothing will clean failcount automatically, th

Re: [Pacemaker] Problems Installing Pacemaker and Heartbeat

2010-09-22 Thread Raoul Bhatia [IPAX]
hi, please refer to http://www.clusterlabs.org/wiki/Debian_Lenny_HowTo cheers, raoul -- DI (FH) Raoul Bhatia M.Sc. email. r.bha...@ipax.at Technischer Leiter IPAX - Aloy Bhatia Hava OG web.

[Pacemaker] Problems Installing Pacemaker and Heartbeat

2010-09-22 Thread Chen Stormstout
Hi, I'm facing problems to install pacemaker and heartbeat on debian lenny: What I did: - Downloaded the least debian image for i386 (Kernel 2.6.26-2-686) - after install configure sources.list: deb http://backports.debian.org/debian-backports lenny-backports main contrib non-free - run apt-

Re: [Pacemaker] Connection to our AIS plugin (9) failed: Library error

2010-09-22 Thread Szymon Hersztek
Message written on 2010-09-22 at 10:26 by Andrew Beekhof: 2010/9/21 Szymon Hersztek : Message written on 2010-09-21 at 09:08 by Andrew Beekhof: 2010/9/21 Szymon Hersztek : Message written on 2010-09-21 at 08:34 by Andrew Beekhof:

[Pacemaker] correct permissions for /var/lib/pengine

2010-09-22 Thread Raoul Bhatia [IPAX]
in my recent hb_report, i find: > WARN: problem with permissions/ownership at wc01: > wrong permissions or ownership for /var/lib/pengine: > drwxr-xr-x 2 hacluster haclient 5038080 Jul 23 08:58 /var/lib/pengine > WARN: problem with permissions/ownership at wc02: > wrong permissions or ownership for

Re: [Pacemaker] chkconfig values in MCP init script (again)

2010-09-22 Thread Vladislav Bogdanov
22.09.2010 11:17, Andrew Beekhof wrote: > On Tue, Sep 21, 2010 at 2:24 PM, Vladislav Bogdanov > wrote: >> Hi Andrew, hi all. >> >> I decided to return to this issue again because of issues with >> libvirt/KVM virtual domains controlled by pacemaker. >> >> libvirt package on Fedora 13 has two init

[Pacemaker] Release Matrix

2010-09-22 Thread Raoul Bhatia [IPAX]
hi, regarding the Release Matrix [1] and the ABI-change in cluster-glue/ clplumbing [2], i wonder if pacemaker 1.0.9.1 really works with glue 1.0.3? cheers, raoul [1] http://www.clusterlabs.org/wiki/ReleaseMatrix [2] http://www.gossamer-threads.com/lists/linuxha/pacemaker/65443 -- _

Re: [Pacemaker] About behavior in "Action Lost".

2010-09-22 Thread renayama19661014
Hi Andrew, Thank you for your comment. > A long time ago in a galaxy far away, some messaging layers used to > lose quite a few actions, including stops. > About the same time, we decided that fencing because a stop action was > lost wasn't a good idea. > > The rationale was that if the operation eve

Re: [Pacemaker] About behavior in "Action Lost".

2010-09-22 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 8:59 AM, wrote: > Hi, > > Node was in state that the load was very high, and we confirmed monitor > movement of Pacemaker. > Action Lost occurred in stop movement after the error of the monitor occurred. > > Sep  8 20:02:22 cgl54 crmd: [3507]: ERROR: print_elem: Aborting

Re: [Pacemaker] re: Pacemaker Digest, Vol 34, Issue 50

2010-09-22 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 10:33 AM, jiaju liu wrote: > > > --- *On Tue, 21 Sep 2010, pacemaker-requ...@oss.clusterlabs.org < > pacemaker-requ...@oss.clusterlabs.org>* wrote: > > > Message: 5 > Date: Tue, 21 Sep 2010 09:15:16 +0200 > From: Andrew Beekhof > http://cn.mc157.mail.yahoo.com/mc/compose?to=and...@b

Re: [Pacemaker] error: ocf:heartbeat:IPv6addr: could not parse meta-data

2010-09-22 Thread Angelo Höngens
On 22-9-2010 9:58, Andrew Beekhof wrote: >> Can you please help me in my quest to the desired end result? (which is >> the knowledge to build an ipv6-enabled version of the resource-agents so >> I can install it on my nodes, and I can rebuild it after each version >> upgrade of the source package).

Re: [Pacemaker] node pending problem

2010-09-22 Thread Andrew Beekhof
Looks like most of the pacemaker processes got stuck before exec(). What does "ps axf" show and what versions of corosync and pacemaker are involved here? On Tue, Sep 21, 2010 at 9:01 AM, jiaju liu wrote: > > > --- *On Tue, 21 Sep 2010, jiaju liu * wrote: > > > From: jiaju liu > Subject: re:re:node pending probl

Re: [Pacemaker] Connection to our AIS plugin (9) failed: Library error

2010-09-22 Thread Andrew Beekhof
2010/9/21 Szymon Hersztek : > > Message written on 2010-09-21 at 09:08 by Andrew Beekhof: > >> 2010/9/21 Szymon Hersztek : >>> >>> Message written on 2010-09-21 at 08:34 by Andrew >>> Beekhof: >>> On Mon, Sep 20, 2010 at 3:34 PM, Szymon Hersztek wrote: >

Re: [Pacemaker] chkconfig values in MCP init script (again)

2010-09-22 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 2:24 PM, Vladislav Bogdanov wrote: > Hi Andrew, hi all. > > I decided to return to this issue again because of issues with > libvirt/KVM virtual domains controlled by pacemaker. > > libvirt package on Fedora 13 has two init scripts: libvirtd and > libvirt-guests. > They hav

Re: [Pacemaker] error: ocf:heartbeat:IPv6addr: could not parse meta-data

2010-09-22 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 4:10 PM, Angelo Höngens wrote: > On 25-8-2010 8:36, Andrew Beekhof wrote: >>> I guess whoever packaged the rpm's can answer why the file is missing, >>> but was that someone from the clusterlabs team or someone from the >>> linux-ha team? :) >> >> Basically because I left o

Re: [Pacemaker] monitor operation cancel question

2010-09-22 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 8:58 PM, Phil Armstrong wrote: > Hi, > > This is my first post to this list so if I'm doing this wrong, please be > patient. I am using pacemaker-1.1.2-0.2.1 on sles11sp1. Thanks in advance > for any help anyone can give me. Well, fixing this is a good start: Sep 21 10:

Re: [Pacemaker] migration-threshold and failure-timeout

2010-09-22 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 3:28 PM, Vadym Chepkov wrote: > On Tue, Sep 21, 2010 at 9:14 AM, Dan Frincu wrote: >> Hi, >> >> This => >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-failure-migration.html >> explains it pretty well. Notice the INFINITY score and what se

Re: [Pacemaker] help with configuration for Xen domU on two DRBD devices

2010-09-22 Thread Jai
To answer my own email, just in case it helps someone else: after a bit more research and trying different things, it appears that my issue was caused by a resource failcount. Once I manually cleared all resource failcounts, it started to work properly.
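For anyone hitting the same symptom, here is a minimal sketch of clearing a failcount by hand with the crm shell. The resource name `xen_domU` and node name `node1` are placeholders, not taken from the post above, and the helper is hypothetical: it only prints the command so it can be reviewed before being run on a cluster node.

```shell
# Hypothetical helper: print (not run) the crm shell command that deletes
# the failcount of one resource on one node.
# Arguments: $1 = resource id, $2 = node name (both placeholders here).
failcount_delete_cmd() {
    printf 'crm resource failcount %s delete %s\n' "$1" "$2"
}

# Review the printed command, then run it on a cluster node if it looks right.
failcount_delete_cmd xen_domU node1
```

Printing rather than executing keeps the sketch safe to try on a machine that is not part of a cluster; repeat it per resource and node as needed.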