On Sun, Jan 30, 2011 at 11:33 AM, paul harford wrote:
Hi guys,
Can anyone help? I have ipfail configured in my ha.cf, and when I check the
ha-log I can see that the ping IP address is gone, but the resources do not
fail over.
I have also tried pingd, but I don't think I had configured it properly
(which is why I went to ipfail).
Does anyone have a working pingd configuration, if you want to share it?
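For what it's worth, later messages in this archive recommend ocf:pacemaker:ping over both ipfail and the old pingd daemon. A minimal sketch of such a setup in crm-shell syntax; the resource names, gateway address, and parameter values here are placeholders, not taken from this thread:

```shell
# Illustrative only: every name and the gateway IP are placeholders.
crm configure primitive p_ping ocf:pacemaker:ping \
    params host_list="192.168.1.254" multiplier="1000" dampen="5s" \
    op monitor interval="30s" timeout="60s"
# Clone it so every node maintains its own connectivity score
crm configure clone cl_ping p_ping
```

Each node then publishes a score (attribute "pingd" by default, reachable hosts times multiplier) that location constraints can test.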
> > monitor operation timeout=60s
> >
> > BTW, someone should really implement the fping based ping RA ...
>
> Thank you for volunteering :-)
:-P
From: Bernd Schubert, Fri, 3 Sep 2010 12:12:58 +0200
On Tue, Jan 11, 2011 at 2:45 PM, Lars Ellenberg
wrote:
> On Tue, Jan 11, 2011 at 11:24:35AM +0100, patrik.rappo...@knapp.com wrote:
>> we already made changes to the interval and timeout (<op
>> id="pingd-op-monitor-30s" interval="30s" name="monitor" timeout="10s"/>).
>>
>> how big should dampen be set?
To: pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] pingd process dies for no reason
we already made changes to the interval and timeout (<op
id="pingd-op-monitor-30s" interval="30s" name="monitor" timeout="10s"/>).
how big should dampen be set?
Please correct me if I am wrong, as I calculate it as follows:
assuming the last check was OK and the failure takes place in the next
second, then there would be 29s until the next check starts, and
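The arithmetic being sketched here can be checked with a quick shell calculation. The 30s interval and 10s timeout come from the quoted op definition; the 5s dampen is purely an assumed example value, not from the thread:

```shell
# Worst case: the target dies just after a successful monitor, so we wait
# out almost the whole interval, the next check then runs into its timeout,
# and the resulting attribute change is held back for dampen.
interval=30   # monitor interval from the quoted op definition
timeout=10    # monitor timeout from the quoted op definition
dampen=5      # assumed dampen value, for illustration only
echo "worst case: $((interval + timeout + dampen))s"
```

So with these numbers the cluster may take on the order of 45 seconds to react, which is why dampen should stay small relative to the monitor interval.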
On Friday 07 January 2011 14:56:03 patrik.rappo...@knapp.com wrote:
Greetings,
we have a problem: the ping daemon dies for no reason, and we can't
find out why this happens.
We use the following versions on SLES 11.1:
libpacemaker3-1.1.2-0.6.1
pacemaker-mgmt-2.0.0-0.3.10
pacemaker-mgmt-client-2.0.0-0.3.10
drbd-pacemaker-8.3.8.1-0.2.9
libpacemaker-devel-1.1.2-0.6.
Hi,
I just want to let you know that the problem is solved. Thanks to "crm_mon
-f". ;)
The error was the usage of the pingd RA. This RA worked very unreliably and
is marked as deprecated
(http://comments.gmane.org/gmane.linux.highavailability.user/32290).
So after changing the RA to ocf:pacemaker:ping, the problem went away.
Hi Mike,
thank you for the advice.
As far as I know, resource stickiness just means that a resource should
remain on the node it is running on.
The failover and failback actions are performed correctly with the location
constraints that bind the resources to a specific node wh
Simon,
I'm new to this, so if this doesn't help, don't despair - the more
experienced members will be along shortly. :-)
Could it be that you need "stickiness"?
I think that's the term for the concept you are describing.
Also, if that's a two-node cluster, have you defined the cluster property
no-quorum-policy?
With the help of "crm_mon -f" I found out that just one node has the right
pingd score.
Migration summary:
* Node node1: pingd=3
* Node node2: pingd=0
When I initiate a failover by setting node1 to standby, the pingd score on
node2 becomes 3.
A ping that is run manually reaches the destination
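For anyone reproducing this, the per-node attribute can also be inspected directly. These commands are a sketch; option spellings differ slightly between Pacemaker versions:

```shell
# One-shot status including the migration summary with pingd scores
crm_mon -1 -f
# Query one node's connectivity attribute (attribute name defaults to
# "pingd"); on 1.0-era tools the node option may be -U rather than -N
crm_attribute -t status -N node2 -n pingd -G
```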
Hi,
I'm trying to set up location constraints for my cluster, but I can't get
them to work the way I want.
The constraints should implement the following behaviour:
- Normal operation
msDRBD0 and resIP0 start on node1, msDRBD1 and resIP1 start on node2
- Loss of network connection
When the
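A hedged sketch of constraints implementing the described policy in crm syntax; the constraint ids, scores, and the default "pingd" attribute name are all assumptions, not from the post:

```shell
# inside "crm configure"; ids and scores are illustrative
# Normal operation: prefer node1 for resIP0 and node2 for resIP1
location ip0-prefers-node1 resIP0 rule 100: #uname eq node1
location ip1-prefers-node2 resIP1 rule 100: #uname eq node2
# Loss of network connection: ban any node whose connectivity attribute
# is undefined or zero (assumes a ping/pingd clone publishing "pingd")
location ip0-needs-net resIP0 rule -inf: not_defined pingd or pingd lte 0
location ip1-needs-net resIP1 rule -inf: not_defined pingd or pingd lte 0
```

With only finite preference scores, a resource can still move away on connectivity loss; the -inf rules force it off a disconnected node.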
On Fri, Sep 17, 2010 at 2:38 AM, jiaju liu wrote:
> Clone Set: pingd_data_net
> Started: [ oss3 oss2 oss1 ]
>
> I use the command :
>
> crm_resource -g host_list -r pingd_data_net
> to check the param host_list
>
what does the resource definition look like?
>
> the result
Clone Set: pingd_data_net
Started: [ oss3 oss2 oss1 ]
I use the command :
crm_resource -g host_list -r pingd_data_net
to check the param host_list
the result is:
pingd_data_net is active on more than one node, returning the default value
for host_list
On Fri, Sep 03, 2010 at 12:12:58PM +0200, Bernd Schubert wrote:
> On Friday, September 03, 2010, Lars Ellenberg wrote:
> > > > how about an fping RA ?
> > > > active=$(fping -a -i 5 -t 250 -B1 -r1 $host_list 2>/dev/null | wc -l)
> > > >
> > > > terminates in about 3 seconds for a hostlist of 100 (
On Fri, Sep 3, 2010 at 9:38 AM, Lars Ellenberg
wrote:
> On Thu, Sep 02, 2010 at 09:33:59PM +0200, Andrew Beekhof wrote:
>> On Thu, Sep 2, 2010 at 4:05 PM, Lars Ellenberg
>> wrote:
>> > On Thu, Sep 02, 2010 at 11:00:12AM +0200, Bernd Schubert wrote:
>> >> On Thursday, September 02, 2010, Andrew Be
On Fri, Sep 03, 2010 at 12:12:58PM +0200, Bernd Schubert wrote:
> > > >> PS: (*) As you insist ;) on quorum with n/2 + 1 nodes, we use ping as
> > > >> replacement. We simply cannot fulfill n/2 + 1, as controller failure
> > > >> takes down 50% of the systems (virtual machines) and the systems
> >
On Friday, September 03, 2010, Lars Ellenberg wrote:
> > > how about an fping RA ?
> > > active=$(fping -a -i 5 -t 250 -B1 -r1 $host_list 2>/dev/null | wc -l)
> > >
> > > terminates in about 3 seconds for a hostlist of 100 (on the LAN, 29 of
> > > which are alive).
> >
> > Happy to add if someone
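The quoted one-liner could be wrapped into a minimal monitor-style script along these lines. This is a sketch only: the host list, attribute name, and dampen value are placeholders, and the attrd_updater option for the value has varied between versions (-v in older tools, -U in newer ones):

```shell
#!/bin/sh
# Count reachable hosts with fping and publish the count as a node
# attribute. Everything below is illustrative.
host_list="10.0.0.1 10.0.0.2 10.0.0.3"   # placeholder host list
# fping flags from the quoted message: -a print alive targets, -i 5 ms
# between pings, -t 250 ms per-target timeout, -B1 no backoff, -r1 one retry
active=$(fping -a -i 5 -t 250 -B1 -r1 $host_list 2>/dev/null | wc -l)
# Publish the count; -U is the value option in newer attrd_updater
attrd_updater -n pingd -U "$active" -d 5s
```

The appeal of fping here is that it pings the whole list concurrently, so the runtime stays bounded regardless of how many targets are down.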
On Thu, Sep 02, 2010 at 09:33:59PM +0200, Andrew Beekhof wrote:
> On Thu, Sep 2, 2010 at 4:05 PM, Lars Ellenberg
> wrote:
> > On Thu, Sep 02, 2010 at 11:00:12AM +0200, Bernd Schubert wrote:
> >> On Thursday, September 02, 2010, Andrew Beekhof wrote:
> >> > On Wed, Sep 1, 2010 at 11:59 AM, Bernd Sc
On Thu, Sep 2, 2010 at 4:05 PM, Lars Ellenberg
wrote:
> On Thu, Sep 02, 2010 at 11:00:12AM +0200, Bernd Schubert wrote:
>> On Thursday, September 02, 2010, Andrew Beekhof wrote:
>> > On Wed, Sep 1, 2010 at 11:59 AM, Bernd Schubert
>> > > My proposal is to rip out all network code out of pingd and
On Thursday, September 02, 2010, Lars Ellenberg wrote:
> On Thu, Sep 02, 2010 at 11:00:12AM +0200, Bernd Schubert wrote:
> > On Thursday, September 02, 2010, Andrew Beekhof wrote:
> > > On Wed, Sep 1, 2010 at 11:59 AM, Bernd Schubert
> > >
> > > > My proposal is to rip out all network code out of
On Thu, Sep 02, 2010 at 11:00:12AM +0200, Bernd Schubert wrote:
> On Thursday, September 02, 2010, Andrew Beekhof wrote:
> > On Wed, Sep 1, 2010 at 11:59 AM, Bernd Schubert
> > > My proposal is to rip out all network code out of pingd and to add
> > > slightly modified files from 'iputils'.
> >
>
On Thursday, September 02, 2010, Andrew Beekhof wrote:
> On Wed, Sep 1, 2010 at 11:59 AM, Bernd Schubert
> > My proposal is to rip out all network code out of pingd and to add
> > slightly modified files from 'iputils'.
>
> Close, but that's not portable.
> Instead use ocf:pacemaker:ping which goes
On Wed, Sep 1, 2010 at 11:59 AM, Bernd Schubert wrote:
Andrew,
I think pingd is rather broken:
strace -f /usr/lib64/heartbeat/pingd -a pingdnet2 -d 5s -i 1 -n 2 -h 10.0.1.16
(which is in fact localhost)
In an endless loop:
sendmsg(4, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("10.0.1.16")}, msg_iov(1)=[{"\10\0\324\
On Tue, 2010-06-08 at 19:08 +0200, Dejan Muhamedagic wrote:
> Not sure, but I think that the default for the attribute name is
> "pingd". Try changing L3_ping to pingd in the constraints.
Dejan,
thanks a lot for pointing out the error, I really appreciate it.
I've changed the attribute name to 'pingd'
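The pitfall Dejan points out generalizes: the attribute name tested in the constraint must match the ping resource's "name" parameter (which defaults to "pingd"). A sketch, with hypothetical resource and constraint ids:

```shell
# inside "crm configure": with params name="L3_ping" every rule must test
# L3_ping; without the name param, the attribute defaults to "pingd"
primitive p_ping ocf:pacemaker:ping \
    params name="L3_ping" host_list="10.0.0.1" \
    op monitor interval="15s"
clone cl_ping p_ping
location need-net g_services rule -inf: not_defined L3_ping or L3_ping lte 0
```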
Hi,
On Tue, Jun 08, 2010 at 06:43:11PM +0200, Dalibor Dukic wrote:
> On Sat, 2010-06-05 at 15:36 +0200, Dalibor Dukic wrote:
> > I have problem with ping RA not correctly updating CIB with appropriate
> > attributes when doing fresh start. So afterwards IPaddr2 resources wont
> > start.
>
> Have
On Sat, 2010-06-05 at 15:36 +0200, Dalibor Dukic wrote:
> I have problem with ping RA not correctly updating CIB with appropriate
> attributes when doing fresh start. So afterwards IPaddr2 resources wont
> start.
Has anyone had a chance to take a peek at this?
My setup consists of two nodes doi
Hi,
I have a problem with the ping RA not correctly updating the CIB with the
appropriate attributes when doing a fresh start, so afterwards the IPaddr2
resources won't start.
I'm running Ubuntu 10.04 with pacemaker 1.0.8+hg15494-2ubuntu2 .
I have seen people on the list having the same problems. I've corrected the
integer-based
I have the same problem as Quentin Smith (sticky pingd value="0").
[12:49:44 ha1] ~> rpm -qa | egrep -i "pacemaker|corosync|heartbeat|resource"
heartbeat-libs-3.0.2-2.el5
heartbeat-3.0.2-2.el5
pacemaker-libs-1.0.8-1.el5
resource-agents-1.0.1-1.el5
corosync-1.2.0-1.el5
pacemaker-1.0.8-1.el5
coros
On Sat, Mar 13, 2010 at 1:26 AM, Quentin Smith wrote:
> I don't know a lot about hg, but doesn't the "r15404" in the version of
> pacemaker that I'm running now mean that I already have this bugfix
> (r15295)?
Yes.
I'd suggest you use the ping RA instead of pingd.
The ping RA uses your system's
I don't know a lot about hg, but doesn't the "r15404" in the version of
pacemaker that I'm running now mean that I already have this bugfix
(r15295)?
--Quentin
On Fri, 12 Mar 2010, hj lee wrote:
Hi,
This seems to be the same problem I reported a while ago. It was fixed in
http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/214f0fc258f2.
Thanks
On Fri, Mar 12, 2010 at 2:36 PM, Quentin Smith wrote:
> Hi-
>
> I just took the latest updates to pacemaker and heartbeat from
> http://people.debian.
Hi-
I just took the latest updates to pacemaker and heartbeat from
http://people.debian.org/~madkiss/ha. In particular, I upgraded
heartbeat 1:3.0.2-1~bpo50+1 to 1:3.0.2+hg12547-2~bpo50+1
pacemaker 1.0.7+hg20100203-1~bpo50+1 to 1.0.7+hg20100303r15404-3~bpo50+1
cluster-agents 1:1.0.2-1~bpo50+1
On 06/04/2009 09:33 AM, Andrew Beekhof wrote:
>> Do we even still have "configured ping nodes" in the original, ha.cf sense?
>
> For openais based clusters, no.
> For heartbeat based ones, yes.
I see.
>>
>>
>> The name of the attributes to set. This is the name to be used in the
>> constraint
On Thu, Jun 4, 2009 at 8:47 AM, Florian Haas wrote:
> Andrew, Dejan, Dominik,
>
> I am by no means a pingd expert, but the current incarnation in
> stable-1.0 seems to have some outdated and misleading comments and meta
> data. Examples:
>
>
>
> The list of ping nodes to count. Defaults to all
Andrew, Dejan, Dominik,
I am by no means a pingd expert, but the current incarnation in
stable-1.0 seems to have some outdated and misleading comments and meta
data. Examples:
The list of ping nodes to count. Defaults to all configured ping nodes.
Rarely needs to be specified.
Host list
D
On Tue, May 26, 2009 at 3:22 PM, Eliot Gable wrote:
> I am using 1.0.3, but the failure-timeout thing does not seem to work for
> pingd.
>
You'll have to show us the rest of your configuration
From: Andrew Beekhof [mailto:...@beekhof.net]
Sent: Monday, May 25, 2009 11:49 AM
To: pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] PingD Failure-Timeout
On Thu, May 21, 2009 at 10:20 PM, Eliot Gable wrote:
> Is there a way to time-out the failure of PingD?
Yes, but you need version >= 1.0.0
I assume you're not running it as a clone right?
From: Eliot Gable [mailto:ega...@broadvox.net]
Sent: Thursday, May 21, 2009 4:20 PM
To: pacemaker@oss.clusterlabs.org
Subject: [Pacemaker] PingD Failure-Timeout
Is there a way to time-out the failure of PingD?
In my configuration, I cannot run PingD all the time on every node. Only one
node (the master) has public Internet access. I use PingD to cause the master
to fail over to one of the slaves. When a slave becomes master, it then gains
public Internet
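Since Andrew's answer is that failure expiry exists from version 1.0.0 on, here is a sketch of how it is usually expressed; the resource id and the 60s value are illustrative, not from this thread:

```shell
# inside "crm configure": let pingd monitor failures expire automatically
primitive p_ping ocf:pacemaker:ping \
    params host_list="10.0.0.1" \
    op monitor interval="30s" \
    meta failure-timeout="60s"
# Expired failures are pruned on the next cluster recheck; until then a
# manual cleanup also works:
# crm resource cleanup p_ping
```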
Excellent news.
The slowdown was probably related to the memory leak I fixed for 1.0.3
Let me know if you have any further problems
On Tue, May 12, 2009 at 7:42 PM, Stelio Plautz wrote:
On Tue, 12 May 2009 11:19:46 +0200, Andrew Beekhof wrote:
Hi Andrew,
I've upgraded both nodes to 1.0.3 and it looks ok now.
Thanks stelio
On Mon, May 11, 2009 at 12:01 PM, Stelio Plautz wrote:
Hi all,
I've set up a two-node cluster on Debian etch amd64, with pacemaker 1.0.2-1
and heartbeat 2.99.1-1 from the SUSE repository.
Everything works fine, but pingd's CPU usage increases slowly. I have two
pingd processes running, and both use about 100% CPU after 3 weeks.
14091 root 16 0 1299M 1263M