Re: [Pacemaker] /.crm_help_index file (in system root aka /)

2010-07-13 Thread Maros Timko
> Date: Tue, 13 Jul 2010 13:42:00 +0200 > From: Lars Ellenberg > To: pacemaker@oss.clusterlabs.org > Subject: Re: [Pacemaker] /.crm_help_index file (in system root aka /) > Message-ID: <20100713114200.gc4...@barkeeper1-xen.linbit> > Content-Type: text/plain; charset=iso-8859-1 > > On Tue, Jul 13,

Re: [Pacemaker] RFC: stonith-enabled="error-recovery"

2010-06-25 Thread Maros Timko
> Date: Thu, 24 Jun 2010 17:46:39 +0200 > From: Lars Marowsky-Bree > To: pacemaker@oss.clusterlabs.org > Subject: [Pacemaker] RFC: stonith-enabled="error-recovery" > Message-ID: <20100624154639.gf5...@suse.de> > Content-Type: text/plain; charset=iso-8859-1 > > Hi, > > this is about a new setting f

Re: [Pacemaker] Pacemaker Digest, Vol 31, Issue 56

2010-06-18 Thread Maros Timko
> Despite your clocks being a bit out, "dampen" looks to be doing what > it's supposed to... OK, vsp7 is more than 1 second ahead, still not that bad [r...@vsp7 ~]# date +%H:%M:%S.%N; ssh 135.64.30.29 date +%H:%M:%S.%N 10:07:15.093943000 10:07:13.924642000 > > Jun 17 15:13:27 vsp7 attrd_updater: [3

Re: [Pacemaker] crm node delete

2010-06-15 Thread Maros Timko
> On Fri, Jun 11, 2010 at 03:45:19PM +0100, Maros Timko wrote: >> Hi all, >> >> using heartbeat stack. I have a system with one node offline: >> >> Last updated: Fri Jun 11 13:52:40 2010 >> Stack: Heartbeat >> Current DC: vsp7.exam

Re: [Pacemaker] How to really deal with gateway restarts?

2010-06-15 Thread Maros Timko
I thought the "dampen" attribute could help with some of the options, but actually it does not. >>> >>> It should do. Hard to say without any logs from the two machines. >> >> Unfort. I don't have log files here, can provide you if that would help. >> Are you sure dampen should help here?

Re: [Pacemaker] How to really deal with gateway restarts?

2010-06-14 Thread Maros Timko
On Thu, Jun 10, 2010 at 9:22 PM, Maros Timko wrote: >> Hi all, >> >> I know it was requested here a number of times, but with no real >> conclusive answer. All of the requests were update Pacemaker and use >> ping RA. >> >> Setup: >> - simple symmetric 2 no

[Pacemaker] crm node delete

2010-06-11 Thread Maros Timko
Hi all, using heartbeat stack. I have a system with one node offline: Last updated: Fri Jun 11 13:52:40 2010 Stack: Heartbeat Current DC: vsp7.example.com (ba6d6332-71dd-465b-a030-227bcd31a25f) - partition with quorum Version: 1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782 2 Nod

[Pacemaker] How to really deal with gateway restarts?

2010-06-10 Thread Maros Timko
Hi all, I know it was requested here a number of times, but with no real conclusive answer. All of the requests were update Pacemaker and use ping RA. Setup: - simple symmetric 2 node DRBD-Xen cluster - both nodes connected to the same network and gateway - cloned ping RA to monitor gateway and u

Re: [Pacemaker] master/slave or unique clones

2010-05-28 Thread Maros Timko
> So, application acts as "master" if it was able to bind to the pre-configured > IP and as a "node" if it wasn't. If it's a master it listens on an additional > port and receives updates from nodes. Each application pulls video feed out > of attached video cameras and stores them on the local d

[Pacemaker] showscores.sh script (patch?)

2010-03-25 Thread Maros Timko
Hi all, mainly Dominik Klein - thanks for the showscores.sh script. But I am curious why the script keeps only numeric failcount values, meaning INFINITY values are just ignored: 1. Simulating a failure Failed actions: dom0-fs-Dom0_start_0 (node=vsp11.example.com, call=15, rc=5, status=c

Re: [Pacemaker] DRBD Recovery Policies

2010-03-11 Thread Maros Timko
> What would happen in this scenario? Would the RA defer the promote until > the sync is completed? Would the inability to promote cause the failback > to not happen and a resource cleanup is required once the sync has > completed? > > > > I guess this is really down to how advanced the Linbit DRBD

[Pacemaker] ping RA stop patch

2010-02-04 Thread Maros Timko
Hi all, I think pacemaker's ping RA should be modified in the following way (patch attached):  * it does not make a lot of sense to call ping/update in stop method just before the attribute is deleted * timeouts changed accordingly: - stop method should be faster now, decreased to 20s - m

Re: [Pacemaker] Bug in "ping" pacemaker resource agent

2010-01-28 Thread Maros Timko
Sorry, last-minute changes introduced a small issue into the patch in my previous email. Please find the updated patch. Tino 2010/1/28 Maros Timko > Unfortunately, this is not the only bug of this RA. > 1. There is absolute mess regarding the status file - some part of >

Re: [Pacemaker] Bug in "ping" pacemaker resource agent

2010-01-28 Thread Maros Timko
Hi all, this one worked for me. I also added a debug parameter, defaulting to false, to prevent logging on every attr update. Tino 2010/1/28 Maros Timko : > Unfortunately, this is not the only bug of this RA. > 1. There is absolute mess regarding the status file - some part of >

Re: [Pacemaker] Bug in "ping" pacemaker resource agent

2010-01-28 Thread Maros Timko
Unfortunately, this is not the only bug in this RA. 1. There is an absolute mess regarding the status file: some parts of the code use OCF_RESKEY_pidfile, others use $OCF_RESKEY_state. This means that the status handling does not work and you will end up with the following failure: Jan 28 16:45:08 vsp7 lrmd: [1

Re: [Pacemaker] crm: timeout for start warning (possible bug?)

2010-01-27 Thread Maros Timko
>> seems like it is a bit more complicated than I initially thought. Now I >> tried to set the timeout longer so that there will not be any warning. >> However, the second group is not created either: >> # crm >> crm(live)# configure >> crm(live)configure# group udom udom-drbd-udom0 udom-drbd-udom1

Re: [Pacemaker] crm: timeout for start warning (possible bug?)

2010-01-26 Thread Maros Timko
_attrs[attname].value TypeError: unsubscriptable object # crm configure group udom udom-drbd-udom0 udom-drbd-udom1 udom-drbd-udom2 udom-delay udom-delay udom-vm meta is-managed=false # echo $? 0 I will create a new bug report. Tino 2010/1/26 Maros Timko : >> > >> > I want to c

Re: [Pacemaker] crm: timeout for start warning (possible bug?)

2010-01-26 Thread Maros Timko
> > > > I want to create a new group from primitives that already exist: > > crm(live)configure# group udom udom-drbd-udom0 udom-drbd-udom1 > > udom-drbd-udom2 udom-delay udom-delay udom-vm meta is-managed=false > > crm(live)configure# commit > > WARNING: dom0-fs-Domain00: timeout 20s for start is

[Pacemaker] crm: timeout for start warning (possible bug?)

2010-01-26 Thread Maros Timko
Hi all, I think the start timeout check has indirectly introduced a bug. I already have a group with a Filesystem primitive in it: group dom0 dom0-drbd-Domain00 dom0-fs-Domain00 dom0-vsppreconsole \ meta is-managed="true" I want to create a new group from primitives that already exist:

[Pacemaker] Pacemaker+DRBD deployment scenario

2010-01-21 Thread Maros Timko
Hi all, I am using this list only as I know the LINBIT guys are active here too :o) The recent EPEL version of Pacemaker now also includes a couple of DRBD rpms, but not the DRBD kernel module itself. I am just curious how I should install Pacemaker+DRBD. - Once I install DRBD I will get all the utils, RA

[Pacemaker] STONITH request failed

2009-11-12 Thread Maros Timko
Hi all, has anybody experienced the following? We were simulating the failure of a resource stop with STONITH enabled. The node that failed to stop the resource requested to STONITH itself. However, the request was not processed by the peer node. It did not log anything to the log file at that time. Inte