Re: [Pacemaker] stonith

2015-04-19 Thread Andreas Kurz
On 2015-04-17 12:36, Thomas Manninger wrote: > Hi list, > > i have a pacemaker/corosync2 setup with 4 nodes, stonith configured over > ipmi interface. > > My problem is, that sometimes, a wrong node is stonithed. > As example: > I have 4 servers: node1, node2, node3, node4 > > I start a hardw

Re: [Pacemaker] Problem with function ocf_local_nodename

2013-12-12 Thread Andreas Kurz
On 2013-12-12 17:53, Michael Böhm wrote: > Hi @all, > > i am new to pacemaker and currently setting up a test-environment for > future production-use. Unfortunately i ran into a problem with using the > mysql resource agent and i'm hoping someone here can enlighten me. > > Server-Distro is Debian

Re: [Pacemaker] restart resources in clone mode on a single node

2013-12-12 Thread Andreas Kurz
On 2013-12-12 18:06, ESWAR RAO wrote: > Hi All, > > Can someone please help me in restarting all resources of clone on > single node. > > On 3 node setup with HB+pacemaker. > I have configured all 3 resources in clone mode with max as 2 to start > only on node1 and node2. > ++

Re: [Pacemaker] Colocation constraint to External Managed Resource

2013-10-10 Thread Andreas Kurz
On 2013-10-10 18:20, Robert H. wrote: > Hello, > > On 10.10.2013 16:18, Andreas Kurz wrote: > >> You configured a monitor operation for this unmanaged resource? > > Yes, and some parts work as expected, however some behaviour is strange. >

Re: [Pacemaker] Colocation constraint to External Managed Resource

2013-10-10 Thread Andreas Kurz
On 2013-10-09 18:33, Robert H. wrote: > Hello list, > > I have a question regarding colocation. > > I have an externally managed resource (not part of pacemaker, but running > on the pacemaker nodes as multi master application) - in this case > XtraDB Cluster. I also want to keep this resource man

Re: [Pacemaker] IPaddr2 between eth1 e bond1

2013-10-06 Thread Andreas Kurz
On 2013-10-05 02:04, Charles Mean wrote: > Hello guys, > > I have a cluster with 2 nginx sharing one VIP: > > primitive VIP_AD_SRV ocf:heartbeat:IPaddr2 params ip="X.Y.Z.W" > cidr_netmask="30" nic="eth1" op monitor interval="1s" > > > The problem is that I have replaced one of those two

Re: [Pacemaker] Corosync won't recover when a node fails

2013-10-04 Thread Andreas Kurz
m-debug-origin="do_update_resource" > crm_feature_set="3.0.6" > transition-key="10:14:7:1b4a3ae4-b013-45d1-a865-9b3b3deecf5f" > transition-magic="0:7;10:14:7:1b4a3ae4-b013-45d1-a865-9b3b3deecf5f" > call-id="4" rc-code="7" op-status="0"

Re: [Pacemaker] Corosync won't recover when a node fails

2013-10-03 Thread Andreas Kurz
On 2013-10-03 22:12, David Parker wrote: > Thanks, Andrew. The goal was to use either Pacemaker and Corosync 1.x > from the Debian packages, or use both compiled from source. So, with > the compiled version, I was hoping to avoid CMAN. However, it seems the > packaged version of Pacemaker doesn'

Re: [Pacemaker] Error when managing network with ping/pingd.

2013-09-18 Thread Andreas Kurz
w > > Best regards. > > Francis > On 09/18/2013 03:54 PM, Andreas Kurz wrote: >> On 2013-09-18 15:44, Francis SOUYRI wrote: >>> Hello Andreas, >>> >>> I do not see what is wrong in my config. >> >> You have no "monitor" operation

Re: [Pacemaker] Error when managing network with ping/pingd.

2013-09-18 Thread Andreas Kurz
. > > Francis > > On 09/18/2013 04:26 PM, Francis SOUYRI wrote: >> Hello, >> >> I take an example from Internet without monitor... Do you have a >> suggestion ? >> >> Best regards. >> >> Francis >> On 09/18/2013 03:54 PM, An

Re: [Pacemaker] Error when managing network with ping/pingd.

2013-09-18 Thread Andreas Kurz
"/> > > type="IPaddr2"> > > value="192.168.1.249"/> > name="cidr_netmask" value="24"/> > > > name="monitor" timeout="5s"/> > > >

Re: [Pacemaker] Howto test/simulate the reaction of the cluster to node up and down

2013-09-18 Thread Andreas Kurz
On 2013-09-18 15:08, Andreas Mock wrote: > Hi all, > > really nobody here with deeper experience of crm_simulate? > Or with a hint for good documentation? What Pacemaker version are you using? I did a quick test here on older 1.1.6 and 1.1.7 clusters and they show a nice output on "crm_simulate -

Re: [Pacemaker] Error when managing network with ping/pingd.

2013-09-18 Thread Andreas Kurz
100 noeud1.apec.fr 1000 > > Best regards. > > Francis > > On 09/17/2013 10:21 AM, Andreas Kurz wrote: >> On 2013-09-17 09:45, Francis SOUYRI wrote: >>> Hello, >>> >>> Some help about my problem ? >>> >>> I have

Re: [Pacemaker] Error when managing network with ping/pingd.

2013-09-17 Thread Andreas Kurz
On 2013-09-17 09:45, Francis SOUYRI wrote: > Hello, > > Some help about my problem ? > > I have a corosync/pacemaker with 2 nodes and 2 nets by nodes, > 192.168.1.0/24 for cluster access, 10.1.1.0/24 for drbd in bond, both > used by corosync. > I try to used ocf:pacemaker:ping to monitor the 192.

Re: [Pacemaker] mysql ocf resource agent - resource stays unmanaged if binary unavailable

2013-05-17 Thread Andreas Kurz
On 2013-05-17 00:24, Vladimir wrote: > Hi, > > our pacemaker setup provides mysql resource using ocf resource agent. > Today I tested with my colleagues forcing mysql resource to fail. I > don't understand the following behaviour. When I remove the mysqld_safe > binary (which path is specified in

Re: [Pacemaker] Stonith: How to avoid deathmatch cluster partitioning

2013-05-17 Thread Andreas Kurz
On 2013-05-16 11:01, Lars Marowsky-Bree wrote: > On 2013-05-15T22:55:43, Andreas Kurz wrote: > >> start-delay is an option of the monitor operation ... in fact means >> "don't trust that start was successful, wait for the initial monitor >> some more time"

Re: [Pacemaker] Stonith: How to avoid deathmatch cluster partitioning

2013-05-17 Thread Andreas Kurz
On 2013-05-16 11:31, Klaus Darilion wrote: > Hi Andreas! > > On 15.05.2013 22:55, Andreas Kurz wrote: >> On 2013-05-15 15:34, Klaus Darilion wrote: >>> On 15.05.2013 14:51, Digimer wrote: >>>> On 05/15/2013 08:37 AM, Klaus Darilion wrote: >>>

Re: [Pacemaker] pacemaker colocation after one node is down

2013-05-16 Thread Andreas Kurz
On 2013-05-16 13:42, Wolfgang Routschka wrote: > Hi Andreas, > > thank you for your answer. > > solutions is one colocation with -score ah, yes only _one_ of them with a non-negative value is needed. Scores of all constraints are added up. Regards, Andreas > > colocation cl_g_ip-address_n
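
The remark that the scores of all constraints are added up can be illustrated with a small sketch. It assumes the score arithmetic described in Pacemaker Explained (INFINITY is 1000000, anything plus -INFINITY is -INFINITY, and -INFINITY wins over +INFINITY). The function names `add_scores` and `total_score` are hypothetical helpers for illustration, not part of any Pacemaker tool.

```python
# Sketch of Pacemaker-style score addition.
# Assumptions: INFINITY is 1000000 and -INFINITY beats +INFINITY,
# as described in Pacemaker Explained. Helper names are hypothetical.
INFINITY = 1_000_000


def add_scores(a: int, b: int) -> int:
    """Add two scores, saturating at +/-INFINITY."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY  # a mandatory "never" always wins
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Clamp ordinary sums so they never exceed the infinities.
    return max(-INFINITY, min(INFINITY, a + b))


def total_score(scores) -> int:
    """Sum the scores of every constraint that applies to a placement."""
    total = 0
    for score in scores:
        total = add_scores(total, score)
    return total


if __name__ == "__main__":
    # Two hypothetical constraints on the same pair of resources.
    print(total_score([100, -50]))
    print(total_score([INFINITY, -INFINITY]))
```

Under these assumptions, one finite negative score and one finite positive score partly cancel, while a single -INFINITY overrides everything else that applies to the same placement.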

Re: [Pacemaker] Loss of ocf:pacemaker:ping target forces resources to restart?

2013-05-15 Thread Andreas Kurz
On 2013-05-15 20:44, Andrew Widdersheim wrote: > Sorry to bring up old issues but I am having the exact same problem as the > original poster. A simultaneous disconnect on my two node cluster causes the > resources to start to transition to the other node but mid flight the > transition is abort

Re: [Pacemaker] pacemaker colocation after one node is down

2013-05-15 Thread Andreas Kurz
On 2013-05-15 21:30, Wolfgang Routschka wrote: > Hi everybody, > > one question today about colocation rule on a 2-node cluster on > scientific linux 6.4 and pacemaker/cman. > > 2-Node Cluster > > first node haproxy load balancer proxy service - second node with > postfix service. > > coloc

Re: [Pacemaker] Stonith: How to avoid deathmatch cluster partitioning

2013-05-15 Thread Andreas Kurz
On 2013-05-15 15:34, Klaus Darilion wrote: > On 15.05.2013 14:51, Digimer wrote: >> On 05/15/2013 08:37 AM, Klaus Darilion wrote: >>> primitive st-pace1 stonith:external/xen0 \ >>> params hostlist="pace1" dom0="xentest1" \ >>> op start start-delay="15s" interval="0" >> >> Try; >>

Re: [Pacemaker] help with DRBD/pacemaker

2013-05-13 Thread Andreas Kurz
On 2013-05-13 11:54, Michael Schwartzkopff wrote: > Hi, > > > > I experienced a strange phenomena when setting up DRBD 8.4.2 with > pacemaker 1.1.9. > > > > I can set up a dual primary DRBD manually without any problems. Now shut > down one node, i.e. I demote the DRBD and the shut is "down

Re: [Pacemaker] [Problem][crmsh]The designation of the 'ordered' attribute becomes the error.

2013-04-01 Thread Andreas Kurz
Hi Dejan, On 2013-03-06 11:59, Dejan Muhamedagic wrote: > Hi Hideo-san, > > On Wed, Mar 06, 2013 at 10:37:44AM +0900, renayama19661...@ybb.ne.jp wrote: >> Hi Dejan, >> Hi Andrew, >> >> As for the crm shell, the check of the meta attribute was revised with the >> next patch. >> >> * http://hg.sa

Re: [Pacemaker] pacemaker node stuck offline

2013-03-25 Thread Andreas Kurz
On 2013-03-22 03:39, pacema...@feystorm.net wrote: > > On 03/21/2013 11:15 AM, Andreas Kurz wrote: >> On 2013-03-21 14:31, Patrick Hemmer wrote: >>> I've got a 2-node cluster where it seems last night one of the nodes >>> went offline, and I can't see any r

Re: [Pacemaker] issues when installing on pxe booted environment

2013-03-25 Thread Andreas Kurz
On 2013-03-22 19:31, John White wrote: > Hello Folks, > We're trying to get a corosync/pacemaker instance going on a 4 node > cluster that boots via pxe. There have been a number of state/file system > issues, but those appear to be *mostly* taken care of thus far. We're > running into a

Re: [Pacemaker] Resource is Too Active (on both nodes)

2013-03-25 Thread Andreas Kurz
On 2013-03-22 21:35, Mohica Jasha wrote: > Hey, > > I have two cluster nodes. > > I have a service process which is prone to crash and takes a very long > time to start. > Since the service process takes a long time to start I have the service > process running on both nodes, but only the active

Re: [Pacemaker] OCF Resource agent promote question

2013-03-25 Thread Andreas Kurz
Hi Steve, On 2013-03-25 18:44, Steven Bambling wrote: > All, > > I'm trying to work on a OCF resource agent that uses postgresql > streaming replication. I'm running into a few issues that I hope might > be answered or at least some pointers given to steer me in the right > direction. Why are y

Re: [Pacemaker] pacemaker node stuck offline

2013-03-21 Thread Andreas Kurz
On 2013-03-21 14:31, Patrick Hemmer wrote: > I've got a 2-node cluster where it seems last night one of the nodes > went offline, and I can't see any reason why. > > Attached are the logs from the 2 nodes (the relevant timeframe seems to > be 2013-03-21 between 06:05 and 06:10). > This is on ubunt

Re: [Pacemaker] ping resource polling skew

2013-03-20 Thread Andreas Kurz
On 2013-03-20 04:11, Quentin Smith wrote: > On Wed, 20 Mar 2013, Andreas Kurz wrote: > >> On 2013-03-19 17:02, Quentin Smith wrote: >>> Hi- >>> >>> I have my cluster configured to use a cloned ping resource, such that I >>> can write a constraint t

Re: [Pacemaker] ping resource polling skew

2013-03-19 Thread Andreas Kurz
On 2013-03-19 17:02, Quentin Smith wrote: > Hi- > > I have my cluster configured to use a cloned ping resource, such that I > can write a constraint that I prefer resources to run on a node that has > network connectivity. That works fine if a machine loses its network > connection (the ping attri

Re: [Pacemaker] Trouble with DRBD mount

2013-02-28 Thread Andreas Kurz
On 2013-02-28 13:19, senrab...@aol.com wrote: > Hi All: > > We are stuck trying to get pacemaker to work with DRBD, and having tried > various alternatives can't get our "drbd1" to mount and get some errors. > > NOTE: we are trying to get pacemaker to work with an existing Encrypted > RAID1 LVM

Re: [Pacemaker] crm in RHEL 6.4 ... where are you?

2013-02-22 Thread Andreas Kurz
On 2013-02-21 23:16, Bob Haxo wrote: > Greetings, > > Anyone know where "crm" is in RHEL 6.4, or in the most recent set of > RHEL 6.3 updates? crm is not included in the latest pacemaker-cli > package: pacemaker-cli-1.1.8-7.el6.x86_64.rpm It is here: http://download.opensuse.org/repositories/ne

Re: [Pacemaker] Monitor process, migrate only ip resources

2013-02-19 Thread Andreas Kurz
On 2013-02-19 13:54, Grant Bagdasarian wrote: > Hello, > > > > I wish to monitor a certain running process and migrate floating IP > addresses when this process stops running. > > > > My current configuration is as following: > > crm(live)configure# show > > node $id="8fe81814-6e85-454f-b

Re: [Pacemaker] return properties and rsc_defaults back to default values

2013-02-14 Thread Andreas Kurz
Hi Brian, On 2013-02-14 16:48, Brian J. Murrell wrote: > Is there a way to return an individual property (or all properties) > and/or a rsc_default (or all) back to default values, using crm, or > otherwise? You mean beside deleting it? Cheers, Andreas -- Need help with Pacemaker? http://www.h

Re: [Pacemaker] stonith on node-add

2013-01-30 Thread Andreas Kurz
On 2013-01-30 20:51, Matthew O'Connor wrote: > Hi! I must be doing something stupidly wrong... every time I add a new > node to my live cluster, the first thing the cluster decides to do is > STONITH the node, and despite any precautions I take (other than > flat-out disabling STONITH during the

Re: [Pacemaker] Best way to recover from failed STONITH?

2012-12-21 Thread Andreas Kurz
On 12/21/2012 07:47 PM, Andrew Martin wrote: > Andreas, > > Thanks for the help. Please see my replies inline below. > > - Original Message - >> From: "Andreas Kurz" >> To: pacemaker@oss.clusterlabs.org >> Sent: Friday, December 21, 2012 10:11

Re: [Pacemaker] Best way to recover from failed STONITH?

2012-12-21 Thread Andreas Kurz
On 12/21/2012 04:18 PM, Andrew Martin wrote: > Hello, > > Yesterday a power failure took out one of the nodes and its STONITH device > (they share an upstream power source) in a 3-node active/passive cluster > (Corosync 2.1.0, Pacemaker 1.1.8). After logging into the cluster, I saw that > the S

Re: [Pacemaker] reloading crm changes

2012-12-17 Thread Andreas Kurz
ITO, WFO Juneau > NOAA, National Weather Service > > > > > On Mon, Dec 17, 2012 at 11:22 PM, Andreas Kurz <mailto:andr...@hastexo.com>> wrote: > > On 12/17/2012 11:29 PM, Paul Shannon - NOAA Federal wrote: > > I'm just getting our clust

Re: [Pacemaker] reloading crm changes

2012-12-17 Thread Andreas Kurz
On 12/17/2012 11:29 PM, Paul Shannon - NOAA Federal wrote: > I'm just getting our cluster set up and seem to be missing something > about changes made using the crm program. I added some resources and > groups using crm => configure => edit. After saving and committing my > changes I can see the n

Re: [Pacemaker] Constraint on location of resource start

2012-11-09 Thread Andreas Kurz
On 11/09/2012 08:49 AM, Cserbák Márton wrote: > Hi, > > I have successfully set up a DRBD+Pacemaker+Xen cluster on two Debian > servers. Unfortunately, I am facing the same issue as the one > described in https://bugzilla.redhat.com/show_bug.cgi?id=694492, > namely, that the CPU features differ on

Re: [Pacemaker] MySQL and PostgreSQL on same node with DRBD and floating IPs: suggestions wanted

2012-10-27 Thread Andreas Kurz
On 10/26/2012 02:00 PM, Denny Schierz wrote: > hi, > > I'm playing with Pacemaker from Debian squeeze-backports to get failover > running, for PostgreSQL and MySQL(mariaDB) on the same node with two DRBD > resources and two floating VIPs. It seems to be working, the failover works, > but I want

Re: [Pacemaker] Behavior of Corosync+Pacemaker with DRBD primary power loss

2012-10-25 Thread Andreas Kurz
On 10/24/2012 04:03 PM, Andrew Martin wrote: > Hi Andreas, > > - Original Message - >> From: "Andreas Kurz" >> To: pacemaker@oss.clusterlabs.org >> Sent: Wednesday, October 24, 2012 4:13:03 AM >> Subject: Re: [Pacemaker] Behavior of Cor

Re: [Pacemaker] setup of a cluster - principal questions

2012-10-24 Thread Andreas Kurz
On 10/24/2012 11:47 AM, Lentes, Bernd wrote: > Hi, > > i'd like to establish a HA Cluster with two nodes. I will use SLES 11 SP2 + > HAE. > I have a shared storage, it's a FC SAN. My services will run in vm's, one vm > for one service. The vm's will run using KVM. > First i thought to install th

Re: [Pacemaker] Behavior of Corosync+Pacemaker with DRBD primary power loss

2012-10-24 Thread Andreas Kurz
On 10/23/2012 05:04 PM, Andrew Martin wrote: > Hello, > > Under the Clusters from Scratch documentation, allow-two-primaries is > set in the DRBD configuration for an active/passive cluster: > http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html-single/Clusters_from_Scratch/index.html#_wr

Re: [Pacemaker] "Simple" LVM/drbd backed Primary/Secondary NFS cluster doesn't always failover cleanly

2012-10-20 Thread Andreas Kurz
On 10/18/2012 08:02 PM, Justin Pasher wrote: > I have a pretty basic setup by most people's standards, but there must > be something that is not quite right about it. Sometimes when I force a > resource failover from one server to the other, the clients with the NFS > mounts don't cleanly migrate t

Re: [Pacemaker] resource doesn't migrate after failcount is reached

2012-10-20 Thread Andreas Kurz
On 10/20/2012 12:53 PM, emmanuel segura wrote: > Hello List > > I have a stand alone resource and one group, i would like that when the > stand alone resource reaches the failcount, the group doesn't migrate > and the stand alone stays on the node where the group is situated Then don't set a migr

Re: [Pacemaker] high cib load on config change

2012-10-09 Thread Andreas Kurz
On 10/09/2012 01:42 PM, James Harper wrote: > As per previous post, I'm seeing very high cib load whenever I make a > configuration change, enough load that things timeout seemingly instantly. I > thought this was happening well before the configured timeout but now I'm not > so sure, maybe the

Re: [Pacemaker] Resource agent IPaddr2 failed to start

2012-10-09 Thread Andreas Kurz
On 10/09/2012 11:17 AM, Soni Maula Harriz wrote: > > > On Tue, Oct 9, 2012 at 4:01 PM, Andreas Kurz <mailto:andr...@hastexo.com>> wrote: > > On 10/09/2012 10:39 AM, Soni Maula Harriz wrote: > > Dear all, > > > > I'm a newbie in

Re: [Pacemaker] Resource agent IPaddr2 failed to start

2012-10-09 Thread Andreas Kurz
On 10/09/2012 10:39 AM, Soni Maula Harriz wrote: > Dear all, > > I'm a newbie in clustering. I have been following the 'Cluster from > scratch' tutorial. > I use Centos 6.3 and install pacemaker and corosync from : yum install > pacemaker corosync > > This is the version i got > Pacemaker 1.1.7-6

Re: [Pacemaker] Active/Active Clustering using GFS2 in Centos-6.2

2012-10-04 Thread Andreas Kurz
On 09/19/2012 10:51 AM, ecfgijn wrote: > Hi All , > > I have configured active/active using pacemaker in centos-6.2 along with > gfs2. Below are configuration. > > crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 params > ip="169.144.106.121" cidr_netmask="26" op monitor interval=30s > crm

Re: [Pacemaker] bug in monitor timeout?

2012-10-04 Thread Andreas Kurz
On 10/04/2012 12:18 PM, James Harper wrote: >> Hi, >> >> On Wed, Oct 03, 2012 at 10:07:06PM +, James Harper wrote: >>> It seems like everytime I modify a resource, things start timing out. Just >>> now I changed the location of where a ping resource could run and this >>> happened: >>> Oct 4 0

Re: [Pacemaker] recovering from split brain

2012-10-04 Thread Andreas Kurz
On 10/04/2012 12:03 AM, Jane Du (jadu) wrote: > Re-send. Will be appreciate if someone could shed some light on this. > > -Original Message- > From: Jane Du (jadu) > Sent: Friday, September 28, 2012 4:53 PM > To: The Pacemaker cluster resource manager > Subject: [Pacemaker] recovering fro

Re: [Pacemaker] apache on too many nodes?

2012-10-04 Thread Andreas Kurz
On 10/03/2012 10:47 PM, Jake Smith wrote: > > > > - Original Message - >> From: mar...@nic.fi >> To: pacemaker@oss.clusterlabs.org >> Sent: Wednesday, October 3, 2012 4:36:10 PM >> Subject: [Pacemaker] apache on too many nodes? >> >> Hello, >> >> I'm currently testing out a 2 node system

Re: [Pacemaker] staggered startup

2012-09-06 Thread Andreas Kurz
On 09/05/2012 04:08 PM, James Harper wrote: > A power failure tonight indicated that my clustered resources (xen vm's) have > a dependency requirement like "make sure at least one domain controller VM is > fully up and running before starting any other windows servers". Determining > a status of

Re: [Pacemaker] Did I miss a dependency or is this a 1.1.7 bug?

2012-08-01 Thread Andreas Kurz
On 08/01/2012 06:51 AM, mark - pacemaker list wrote: > Hello, > > I suspect I've missed a dependency somewhere in the build process, and > I'm hoping someone recognizes this as an easy fix. I've basically > followed the build guide on ClusterLabs in 'Clusters from Scratch v2', > the build from so

Re: [Pacemaker] /var/lib/pengine folder consuming space over time

2012-07-26 Thread Andreas Kurz
ndreas > > ---- > *From:* Andreas Kurz > *To:* pacemaker@oss.clusterlabs.org > *Sent:* Friday, 20 July 2012 5:57 PM > *Subject:* Re: [Pacemaker] /var/lib/pengine folder consuming space over time > > On 07/20/2012 02:18 PM, ihjaz Mohamed wrote: >> Hi A

Re: [Pacemaker] Specifying a location rule with a resource:role

2012-07-25 Thread Andreas Kurz
On 07/24/2012 09:03 PM, Jay Janssen wrote: > Hi all, > Simple crm configure syntax question (I think). I'm trying to make a > given role (a 'Master' in a master/slave set) from being location on a > given node. However, the "obvious" syntax (at least in my mind) doesn't > work: > > location av

Re: [Pacemaker] delaying stonith

2012-07-24 Thread Andreas Kurz
On 07/24/2012 11:02 AM, Frank Van Damme wrote: > 2012/7/23 Andreas Kurz : >> You are using stonith-action="poweroff" with external/ipmi? That would >> explain this 4s during which the System does a powerdown caught by >> acpid. Use stonith-action="reset" wh

Re: [Pacemaker] delaying stonith

2012-07-23 Thread Andreas Kurz
On 07/23/2012 10:24 AM, Frank Van Damme wrote: > Hello, > > I had an interesting conversation on irc about clustering and fencing, > and I was told it is possible to delay the triggering of a stonith > action by a number of seconds. I searched, but I can't really find how > to configure it. > > (

Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-23 Thread Andreas Kurz
On 07/23/2012 07:06 AM, David Barchas wrote: > Hello. > > I have been working on this for 3 days now, and must be so stressed out > that I am being blinded to what is probably an obvious cause of this. In > a word, HELP. > > I am trying specifically to utilize ocf:heartbeat:IPaddr2, but this > is

Re: [Pacemaker] /var/lib/pengine folder consuming space over time

2012-07-20 Thread Andreas Kurz
On 07/20/2012 02:18 PM, ihjaz Mohamed wrote: > Hi All, > > I see that the folder /var/lib/pengine is consuming space over time. > > Is there a configuration to limit the size of the data logged by pengine > so that once it reaches this limit the older ones get removed. the cluster properties to

Re: [Pacemaker] mysql resource agent

2012-07-19 Thread Andreas Kurz
On 07/18/2012 03:24 PM, DENNY, MICHAEL wrote: > Our current monitor action tests the availability of the mysql database. > However, the monitor fails if mysql is doing recovery processing. And the > recovery processing can take a long time. Do you know if there is a way to > programmatical

Re: [Pacemaker] Pacemaker 1.1.7 order constraint syntax

2012-07-19 Thread Andreas Kurz
On 07/19/2012 02:57 PM, Rasto Levrinc wrote: > On Thu, Jul 19, 2012 at 2:38 PM, Andreas Kurz wrote: >> On 07/19/2012 11:47 AM, Vadym Chepkov wrote: >>> Hi, >>> >>> When Pacemaker 1.1.7 was announced, a new feature was mentioned: >>> >>> Th

Re: [Pacemaker] Pacemaker 1.1.7 order constraint syntax

2012-07-19 Thread Andreas Kurz
On 07/19/2012 11:47 AM, Vadym Chepkov wrote: > Hi, > > When Pacemaker 1.1.7 was announced, a new feature was mentioned: > > The ability to specify that A starts after ( B or C or D ) > > I wasn't able to find an example how to express it crm shell in neither man > crm nor in Pacemaker Explained

Re: [Pacemaker] drbd under pacemaker - always get split brain

2012-07-11 Thread Andreas Kurz
our problem. Regards, Andreas > >>> >>> Best Regards, >>> Andreas >>> >>> -- >>> Need help with Pacemaker? >>> http://www.hastexo.com/now >>> >>>> >>>> thanks for Your time. >>>> n. >>>> >>>

Re: [Pacemaker] drbd under pacemaker - always get split brain

2012-07-11 Thread Andreas Kurz
On 07/11/2012 04:50 AM, Andrew Beekhof wrote: > On Wed, Jul 11, 2012 at 8:06 AM, Andreas Kurz wrote: >> On Tue, Jul 10, 2012 at 8:12 AM, Nikola Ciprich >> wrote: >>> Hello Andreas, >>>> Why not using the RA that comes with the resource-agent package? >>

Re: [Pacemaker] drbd under pacemaker - always get split brain

2012-07-10 Thread Andreas Kurz
- >> Need help with Pacemaker? >> http://www.hastexo.com/now >> >> > >> > thanks a lot in advance >> > >> > nik >> > >> > >> > On Sun, Jul 08, 2012 at 12:47:16AM +0200, Andreas Kurz wrote: >> >> On 07/02/2

Re: [Pacemaker] Pacemaker cannot start the failed master as a new slave?

2012-07-09 Thread Andreas Kurz
On 07/09/2012 06:11 AM, quanta wrote: > Related thread: > http://oss.clusterlabs.org/pipermail/pacemaker/2011-December/012499.html > > I'm going to setup failover for MySQL replication (1 master and 1 slave) > follow this guide: > https://github.com/jayjanssen/Percona-Pacemaker-Resource-Agents/blo

Re: [Pacemaker] drbd under pacemaker - always get split brain

2012-07-09 Thread Andreas Kurz
> > disk { > on-io-error detach; > no-disk-barrier; > no-disk-flushes; > no-md-flushes; > } > > startup { > # wfc-timeout 0; > degr-wfc-timeou

Re: [Pacemaker] drbd under pacemaker - always get split brain

2012-07-07 Thread Andreas Kurz
On 07/02/2012 11:49 PM, Nikola Ciprich wrote: > hello, > > I'm trying to solve quite mysterious problem here.. > I've got new cluster with bunch of SAS disks for testing purposes. > I've configured DRBDs (in primary/primary configuration) > > when I start drbd using drbdadm, it get's up nicely (b

Re: [Pacemaker] Confusing semantics of colocation sets

2012-07-07 Thread Andreas Kurz
On 07/02/2012 08:28 PM, Phil Frost wrote: > On 07/02/2012 12:50 PM, Dejan Muhamedagic wrote: >> What is being mangled actually? The crm shell does what is >> possible given the pacemaker RNG schema. It is unfortunate that >> the design is slightly off, but that cannot be fixed in the crm >> syntax.

Re: [Pacemaker] Pacemaker hang with hardware reset

2012-07-07 Thread Andreas Kurz
On 07/04/2012 01:20 PM, Damiano Scaramuzza wrote: > Hi Emmanuel, > yes I use drbd level fence as in linbit user guide > > disk { > fencing resource-only; > ... In a dual-primary setup, use "resource-and-stonith" > } > handlers { > fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; us

Re: [Pacemaker] 2-node cluster doesn't move resources away from a failed node

2012-07-07 Thread Andreas Kurz
On 07/05/2012 04:12 PM, David Guyot wrote: > Hello, everybody. > > As the title suggests, I'm configuring a 2-node cluster but I've got a > strange issue here : when I put a node in standby mode, using "crm node > standby", its resources are correctly moved to the second node, and stay > there eve

Re: [Pacemaker] Problem setting-up DRBD v8,4 with Pacemaker v1.1.6

2012-07-04 Thread Andreas Kurz
On 07/04/2012 09:16 PM, Irfan Ali wrote: > Hi all, > > We are trying to set-up an HA pair on RHEL 6.2 using DRBD (v > 8.4.1-2), Pacemaker (v 1.1.6-3) and Corosync (v 1.4.1-4). We could > make DRBD work independently syncing the two machines of the pair. But > our problem begins when we try to con

Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-07-04 Thread Andreas Kurz
On 07/04/2012 12:36 AM, Brian J. Murrell wrote: > On 12-07-03 06:17 PM, Andrew Beekhof wrote: >> >> Even adding passive nodes multiplies the number of probe operations >> that need to be performed and loaded into the cib. > > So it seems. I just would have not thought they be such a load since >

Re: [Pacemaker] Centos 6.2 corosync errors after reboot prevent joining

2012-07-03 Thread Andreas Kurz
On 07/02/2012 06:47 PM, Martin de Koning wrote: > Hi all, > > Reasonably new to pacemaker and having some issues with corosync loading > the pacemaker plugin after a reboot of the node. It looks like similar > issues have been posted before but I haven't found a relevant fix. > > The Centos 6.2 n

Re: [Pacemaker] Multiple split-brain problem

2012-06-26 Thread Andreas Kurz
On 06/26/2012 03:49 PM, coma wrote: > Hello, > > I am running a 2-node cluster with corosync & drbd in active/passive > mode for MySQL high availability. > > The cluster is working fine (failover/failback & replication ok), I have no > network outage (network is monitored and I've not seen any failu

Re: [Pacemaker] "Grouping" of master/slave resources

2012-06-25 Thread Andreas Kurz
On 06/25/2012 06:14 PM, Stallmann, Andreas wrote: > Hi! > > > > We are currently switching from mysql-on-drbd with tomcat and a shared > IP to mysql-master-slave with tomcat-master-slave and shared IP and an > additional cloned service (activemq). > > > > Up until now we had quite an easy se

Re: [Pacemaker] Two slave nodes, neither will promote to Master

2012-06-25 Thread Andreas Kurz
On 06/25/2012 05:48 PM, Regendoerp, Achim wrote: > Hi, > > I'm currently looking at two VMs which are supposed to mount a drive in > a given directory, depending on who's the master. This was decided above > me, therefore no DRBD stuff (which would've made things easier), but > still using corosyn

Re: [Pacemaker] Collocating resource with a started clone instance

2012-06-22 Thread Andreas Kurz
On 06/22/2012 12:40 PM, Sergey Tachenov wrote: >>> group postgres pgdrive_fs DBIP postgresql >>> colocation postgres_on_drbd inf: postgres ms_drbd_pgdrive:Master >>> order postgres_after_drbd inf: ms_drbd_pgdrive:promote postgres:start >>> ... >>> location DBIPcheck DBIP \ >>>rule $id="DBIP

Re: [Pacemaker] Collocating resource with a started clone instance

2012-06-22 Thread Andreas Kurz
On 06/22/2012 11:58 AM, Sergey Tachenov wrote: > Hi! > > I'm trying to set up a 2-node cluster. I'm new to pacemaker, but > things are getting better and better. However, I am completely at a > loss here. > > I have a cloned tomcat resource, which runs on both nodes and doesn't > really depend on

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-22 Thread Andreas Kurz
cemaker DLM dæmon, I strongly > thinks these messages are related. Strangely, even if I have this dæmon > executable in /usr/sbin, it's not loaded by Pacemaker : > root@Vindemiatrix:/home/david# ls /usr/sbin/dlm_controld.pcmk > /usr/sbin/dlm_controld.pcmk > root@Vindemiatri

Re: [Pacemaker] resources not migrating when some are not runnable on one node, maybe because of groups or master/slave clones?

2012-06-22 Thread Andreas Kurz
On 06/21/2012 11:30 PM, David Vossel wrote: > - Original Message - >> From: "Phil Frost" >> To: pacemaker@oss.clusterlabs.org >> Sent: Tuesday, June 19, 2012 4:25:53 PM >> Subject: Re: [Pacemaker] resources not migrating when some are not runnable >> on one node, maybe because of groups o

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread Andreas Kurz
> > And, last but not least, I run Debian Squeeze 3.2.13-grsec--grs-ipv6-64. > > Thank you in advance. > > Kind regards. > > PS: if you find me a bit rude, please accept my apologies; I'm working > on it for weeks following the official DRBD guide and it

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread Andreas Kurz
gration-threshold=100 > + (4) probe: rc=5 (not installed) >p_drbd_ocfs2_pgsql:1: migration-threshold=100 > + (6) probe: rc=8 (master) >p_drbd_ocfs2_backupvi:1: migration-threshold=100 > + (7) probe: rc=8 (master) >p_drbd_ocfs2_svn:1: migration-t

Re: [Pacemaker] "ERROR: Wrong stack o2cb" when trying to start o2cb service in Pacemaker cluster

2012-06-20 Thread Andreas Kurz
On 06/20/2012 01:43 PM, David Guyot wrote: > Hello, everybody. > > I'm trying to configure Pacemaker for using DRBD + OCFS2 storage, but > I'm stuck with DRBD and controld up and o2cb doggedly displaying "not > installed" errors. To do this, I followed the DRBD guide ( > http://www.drbd.org/users-

Re: [Pacemaker] [solved] stopping resource stops others in colocation / order sets

2012-06-18 Thread Andreas Kurz
On 06/15/2012 06:19 PM, Phil Frost wrote: > On 06/15/2012 11:55 AM, David Vossel wrote: >>> If resC is stopped >>> resource stop resC >>> >>> then drbd_nfsexports is demoted, and resB and resC will stop. Why is >>> that? I'd expect that resC, being listed last in both the colocation >>> and >>

Re: [Pacemaker] resources not migrating when some are not runnable on one node, maybe because of groups or master/slave clones?

2012-06-18 Thread Andreas Kurz
On 06/18/2012 04:14 PM, Vladislav Bogdanov wrote: > 18.06.2012 16:39, Phil Frost wrote: >> I'm attempting to configure an NFS cluster, and I've observed that under >> some failure conditions, resources that depend on a failed resource >> simply stop, and no migration to another node is attempted, e

Re: [Pacemaker] pacemaker gfs

2012-06-13 Thread Andreas Kurz
On 06/13/2012 05:38 AM, hcyy wrote: > thank you for your reply! don't forget to post to the list >> Failed actions: >> >> dlm:1_monitor_0 (node=pcmk-2, call=4, rc=5, status= >> >> complete): not installed >> >> dlm:0_monitor_0 (node=pcmk-1, call=4, rc=5, status= >> >> complete): not i

Re: [Pacemaker] Pacemaker-GFS2-DLM

2012-06-12 Thread Andreas Kurz
On 06/12/2012 04:34 AM, 燕阳 蔡 wrote: > Hello, > > > > I'm trying to install gfs2:apt-get install gfs2-utils gfs-pcmk,when i > add dlm,it show: you installed the dlm-pcmk package? Regards, Andreas -- Need help with Pacemaker? http://www.hastexo.com/now > > Failed actions: > > dlm:1_mon

Re: [Pacemaker] Design question. Service running everywhere

2012-06-12 Thread Andreas Kurz
On 06/12/2012 10:17 PM, Arturo Borrero Gonzalez wrote: > Hi there! > > I've some questions for you. > > I'm deploying a new cluster with a service inside that doesn't matter. > The important thing about the service is that it runs everywhere through its > own internal functionality. > So start/stop the

Re: [Pacemaker] How to change Stack

2012-06-01 Thread Andreas Kurz
On 06/01/2012 01:05 PM, Mars gu wrote: > hi, >My cluster: > corosync-1.4.1-4.el6_2.2.x86_64 >pacemaker-1.1.6-3.el6.x86_64 use corosync-2 and pacemaker 1.1.7 Regards, Andreas -- Need help with Pacemaker? http://www.hastexo.com/now > >I want to use votequorum as a quorum provider

Re: [Pacemaker] Tomcat and "unmanaged failed"

2012-05-30 Thread Andreas Kurz
On 05/30/2012 03:41 PM, Stallmann, Andreas wrote: > Hi! > >> Yes, that is default behaviour ... Pacemaker tries to stop, that fails so it >> must assume (worst case) it is still running, now STONITH would trigger to >> make sure the node including the resource is definitely down ... without >>

Re: [Pacemaker] pacemaker with corosync on Fedora

2012-05-30 Thread Andreas Kurz
On 05/30/2012 07:07 PM, Lutz Griesbach wrote: > Hi there, > > > im trying to setup a cluster on fedora17 > > corosync-2.0.0-1.fc17.i686 > pacemaker-1.1.7-2.fc17.i686 > > as i understand pacemaker packages are built without heartbeat > > [root@lgr-fed17-1 ~]# pacemakerd --features > Pacemaker

Re: [Pacemaker] Tomcat and "unmanaged failed"

2012-05-30 Thread Andreas Kurz
Hi Andreas, On 05/29/2012 04:14 PM, Stallmann, Andreas wrote: > Hi there, > > > > we have here a corosync/pacemaker cluster running tomcat. Sometimes our > application running inside tomcat fails and tomcat dies. > > > > This – for some reason I don’t understand – leads to an “unmanaged >

Re: [Pacemaker] CentOS pacemaker heartbeat

2012-04-30 Thread Andreas Kurz
On 04/30/2012 05:37 PM, fatcha...@gmx.de wrote: > Hi, > > I´ve just installed a CentOS 6.2 and also installed via epel-repo > heartbeat-3.0.4-1.el6.x86_64 and > pacemaker-1.1.6-3.el6.x86_64. > I try to start heartbeat (crm respawn in ha.cf) and I get this error: > > crmd: [2462]: CRIT: get_clus

Re: [Pacemaker] LVM restarts after SLES upgrade

2012-04-26 Thread Andreas Kurz
On 04/25/2012 11:00 AM, Frank Meier wrote: > Am 24.04.2012 17:53, schrieb pacemaker-requ...@oss.clusterlabs.org: > >> Message: 2 >> Date: Tue, 24 Apr 2012 15:58:53 + >> From: "Daugherity, Andrew W" >> To: "" >> Subject: Re: [Pacemaker] LVM restarts after SLES upgrade >> Message-ID: <114ad516

Re: [Pacemaker] Order constrain with resouce group

2012-04-23 Thread Andreas Kurz
On 04/23/2012 01:12 PM, emmanuel segura wrote: > Hello List > > I would like to know if it's possible to make an order constraint with a > resource group Yes, you can also reference a group in a constraint. Regards, Andreas -- Need help with Pacemaker? http://www.hastexo.com/now > > For example

Re: [Pacemaker] Corosync / Pacemaker Cluster crashing

2012-04-20 Thread Andreas Kurz
On 04/20/2012 12:08 PM, Bensch, Kobus wrote: > Hi > > I have the following cluster setup: > > 2 physical Dell servers with RHEL6.2 with all the latest patches. > > Each server has 3 network connections that looks like this: > > BOND02 NIC's > > ETH4 for Corosync > ETH6 for corosync > > This i

Re: [Pacemaker] OCF Resource agent monitor activity failed due to temporary error

2012-04-19 Thread Andreas Kurz
manager do this work. There may be cases where a "degraded" resource state would be a nice feature, and this is already a topic here on the list ... from time to time. Regards, Andreas > Christian > > -Original Message- > From: Andreas Kurz [mailto:andr...@hastexo.com] > S
