Re: [Pacemaker] [Problem]colocation condition does not become effective.

2012-06-27 Thread renayama19661014
Hi Andrew, Thank you for your comment. All right. We wait for a correction. Best Regards, Hideo Yamauchi. --- On Thu, 2012/6/28, Andrew Beekhof wrote: > On Tue, Jun 12, 2012 at 3:33 PM,  wrote: > > Hi All, > ... > >  I registered these contents with Bugzilla. > >  * http://bugs.clusterlabs.org/sh

Re: [Pacemaker] [Problem and Question] About negative setting of colocation.

2012-06-27 Thread renayama19661014
Hi Andrew, All right. We wait for a correction. Best Regards, Hideo Yamauchi. --- On Thu, 2012/6/28, Andrew Beekhof wrote: > On Wed, Jun 27, 2012 at 4:27 PM,  wrote: > > Hi All, > > > > I registered these contents with Bugzilla. > >  * http://bugs.clusterlabs.org/show_bug.cgi?id=5074 > > Ex

Re: [Pacemaker] Cluster issues - all resources restarting when a node reboots

2012-06-27 Thread Andrew Beekhof
On Thu, Jun 28, 2012 at 4:03 AM, Velayutham, Prakash wrote: > Hello all, > > Below is the relevant part of my CIB. I am facing 2 issues. > > 1. Every time a node in the cluster reboots, all resources get restarted in > the entire cluster. > 2. I have a time-based rule for stickiness, but it does

Re: [Pacemaker] Multiple split-brain problem

2012-06-27 Thread Andrew Beekhof
On Wed, Jun 27, 2012 at 8:04 PM, coma wrote: > Thanks for your reply Andreas, > > My first node is a virtual machine (active node), the second (passive node) > is physical standalone server, there is no high load on any of them but the > problem seems to come from the virtual server. > I actually h

Re: [Pacemaker] [Problem and Question] About negative setting of colocation.

2012-06-27 Thread Andrew Beekhof
On Wed, Jun 27, 2012 at 4:27 PM, wrote: > Hi All, > > I registered these contents with Bugzilla. >  * http://bugs.clusterlabs.org/show_bug.cgi?id=5074 Excellent, thank you! David and I have almost finished our "break everything and put it back together" phase, we'll be diving back into bug report

[Pacemaker] FYI: Unified error codes and backwards compatibility

2012-06-27 Thread Andrew Beekhof
One of the things on my todo list for 1.2 is to unify the various sets of custom error codes in pacemaker into a single set of codes with a single error-to-text function. LSB and OCF return codes are the obvious exception here. Where possible I will be using standard system error codes from errno.

Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-06-27 Thread Andrew Beekhof
On Thu, Jun 28, 2012 at 1:29 PM, Andrew Beekhof wrote: > On Wed, Jun 27, 2012 at 11:30 PM, Brian J. Murrell > wrote: >> On 12-06-26 09:54 PM, Andrew Beekhof wrote: >>> >>> The DC, possibly you didn't have one at that moment in time. >> >> It was the DC in fact.  I restarted corosync on that node

Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-06-27 Thread Andrew Beekhof
On Wed, Jun 27, 2012 at 11:30 PM, Brian J. Murrell wrote: > On 12-06-26 09:54 PM, Andrew Beekhof wrote: >> >> The DC, possibly you didn't have one at that moment in time. > > It was the DC in fact.  I restarted corosync on that node and the > timeouts went away.  But note I "re"started, not starte

Re: [Pacemaker] [Problem]colocation condition does not become effective.

2012-06-27 Thread Andrew Beekhof
On Tue, Jun 12, 2012 at 3:33 PM, wrote: > Hi All, ... >  I registered these contents with Bugzilla. >  * http://bugs.clusterlabs.org/show_bug.cgi?id=5070 Excellent. I believe David has been looking into this already.

Re: [Pacemaker] Centos 6.2 and cman + corosync (pacemaker cluster) for GFS2

2012-06-27 Thread Andrew Beekhof
On Tue, Jun 26, 2012 at 10:24 PM, Maurits van de Lande wrote: > Hello, > > > > I have been testing a classic pacemaker+corosync cluster for virtualization > but it was lacking a cluster filesystem to host the virtual  machine config > files. (I was using the guidelines I found at Linbit to setup a

Re: [Pacemaker] Restart clone service on only one node

2012-06-27 Thread Andrew Beekhof
On Mon, Jun 25, 2012 at 9:51 PM, David Coulson wrote: > I've a couple of cloned resources which need to be restarted one at a time > as part of a batch process. > > If I do a 'crm -w resource restart cl-whatever', it restarts the whole lot > at once. I can do a 'service appname stop' on each box,

Re: [Pacemaker] A processor joined or left the membership and a new membership was formed.

2012-06-27 Thread Andrew Beekhof
On Sun, Jun 24, 2012 at 10:55 PM, Hugo Deprez wrote: > Hello, > > I guess this has already been raised but I do have the following > message on some of my clusters : > > Jun 24 12:00:02 server corosync[27089]:   [TOTEM ] A processor joined > or left the membership and a new membership was formed. >

Re: [Pacemaker] problem with resource group

2012-06-27 Thread Andrew Beekhof
On Thu, Jun 21, 2012 at 6:17 PM, Luca Meron wrote: > >> I don't mean in the cluster config, I mean actual processes. > > no, for sure not. you say 'no', but the next sentence indicates that the answer is 'yes'. > but for these resource I have problems because they're detected as too > active. v

Re: [Pacemaker] Time-based resource stickiness not working cleanly

2012-06-27 Thread Velayutham, Prakash
Hi Jake, On Jun 27, 2012, at 4:42 PM, Jake Smith wrote: > - Original Message - >> From: "Prakash Velayutham" >> To: "The Pacemaker cluster resource manager" >> Sent: Wednesday, June 27, 2012 4:06:39 PM >> Subject: Re: [Pacemaker] Time-based resource stickiness not working cleanly >> >>

Re: [Pacemaker] Time-based resource stickiness not working cleanly

2012-06-27 Thread Jake Smith
- Original Message - > From: "Prakash Velayutham" > To: "The Pacemaker cluster resource manager" > Sent: Wednesday, June 27, 2012 4:06:39 PM > Subject: Re: [Pacemaker] Time-based resource stickiness not working cleanly > > I created a simple IPaddr2 resource for this testing. > > 1. I

Re: [Pacemaker] Time-based resource stickiness not working cleanly

2012-06-27 Thread Velayutham, Prakash
I created a simple IPaddr2 resource for this testing. 1. I can confirm that this resource (with just a location preference constraint and nothing else) does not get restarted when a node reboots. So must be something with the clone/group resources. Not sure how to debug right now. 2. I have a ti

Re: [Pacemaker] Time-based resource stickiness not working cleanly

2012-06-27 Thread Velayutham, Prakash
I am all for testing, but looks like our database person wants this completed now. I will test this in our dev. environment soon. Thanks, Prakash On Jun 27, 2012, at 2:44 PM, Phil Frost wrote: > On 06/27/2012 02:33 PM, Velayutham, Prakash wrote: >> and the cluster works fine, except that when t

Re: [Pacemaker] Time-based resource stickiness not working cleanly

2012-06-27 Thread Phil Frost
On 06/27/2012 02:33 PM, Velayutham, Prakash wrote: and the cluster works fine, except that when the fenced (STONITHed) node comes back up and joins the cluster, all resources (including the one that is running in its preferred location) get restarted. This is annoying and I am trying to find

[Pacemaker] If you want High Availability on OpenStack, check out Heat! (details inside)

2012-06-27 Thread Steven Dake
As some may know, Angus and I were working previously on a project called pacemaker-cloud, with the intention of adding high availability to guests in cloud environments. We stopped developing that project in March 2012 and took our experiences to a new project called Heat. For more details of why

Re: [Pacemaker] Time-based resource stickiness not working cleanly

2012-06-27 Thread Velayutham, Prakash
Yeah, after reading that doc is why I went with the original config that I had. I am having to run 2 MySQL instances each on its own OCFS2 volume. The volumes are mounted on both the nodes, but I have location preference so 1 instance runs on each node if possible. I am able to test all scenario

Re: [Pacemaker] Time-based resource stickiness not working cleanly

2012-06-27 Thread Phil Frost
On 06/26/2012 04:33 PM, Velayutham, Prakash wrote: Any idea? Can a resource order constraint be specified depending on a primitive that is part of a clone resource? Is that even supported? Probably not. Usually you'd want to have your constraints reference the clone, not the primitive behind i

[Pacemaker] Cluster issues - all resources restarting when a node reboots

2012-06-27 Thread Velayutham, Prakash
Hello all, Below is the relevant part of my CIB. I am facing 2 issues. 1. Every time a node in the cluster reboots, all resources get restarted in the entire cluster. 2. I have a time-based rule for stickiness, but it does not work. Can anyone point out what is wrong with the configuration?

Re: [Pacemaker] MS-Resource never gets promoted

2012-06-27 Thread Phil Frost
On 06/27/2012 12:38 PM, Stallmann, Andreas wrote: I let the tomcat script write some quite elaborate debug output, which NEVER shows an attempt to promote the resource. Any ideas? Does your RA call crm_master? Otherwise you will have to include location constraints in your configuration statin

[Pacemaker] MS-Resource never gets promoted

2012-06-27 Thread Stallmann, Andreas
Hi again, I’m still working on my tomcat master/slave resource-script. Well... the script works fine, as far as ocf-tester can be believed. start / stop / status / monitor / promote and demote are implemented and seem to work, when called from ocf-tester. Now I integrated the resource in crm l

Re: [Pacemaker] Behavior of booth when the fail-over in nodes and in sites is caused at the same time

2012-06-27 Thread Jiaju Zhang
On Wed, 2012-06-27 at 22:14 +0900, Yuichi SEINO wrote: > Hi Jiaju, > > I tested the failure of a node several times with this structure. > This case is probably caused when the node having a ticket was > fail-over in two nodes. > I hope this problem can be resolved soon. OK, I'm going to look into thi

Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-06-27 Thread Brian J. Murrell
On 12-06-26 09:54 PM, Andrew Beekhof wrote: > > The DC, possibly you didn't have one at that moment in time. It was the DC in fact. I restarted corosync on that node and the timeouts went away. But note I "re"started, not started. It was running at the time, just not properly, apparently. > W

Re: [Pacemaker] Behavior of booth when the fail-over in nodes and in sites is caused at the same time

2012-06-27 Thread Yuichi SEINO
Hi Jiaju, I tested the failure of a node several times with this structure. This case is probably caused when the node having a ticket was fail-over in two nodes. I hope this problem can be resolved soon. Sincerely, Yuichi 2012/6/21 Jiaju Zhang : > On Thu, 2012-06-21 at 16:40 +0900, Yuichi SEINO wrot

Re: [Pacemaker] Multiple split-brain problem

2012-06-27 Thread coma
Thanks for your reply Andreas, My first node is a virtual machine (active node), the second (passive node) is physical standalone server, there is no high load on any of them but the problem seems to come from the virtual server. I actually have the same problem of split brain when I take or delet

Re: [Pacemaker] Different Corosync Rings for Different Nodes in Same Cluster?

2012-06-27 Thread Dan Frincu
Hi, On Tue, Jun 26, 2012 at 9:53 PM, Andrew Martin wrote: > Hello, > > I am setting up a 3 node cluster with Corosync + Pacemaker on Ubuntu 12.04 > server. Two of the nodes are "real" nodes, while the 3rd is in standby mode > as a quorum node. The two "real" nodes each have two NICs, one that is

Re: [Pacemaker] Multiple split-brain problem

2012-06-27 Thread coma
Thanks for the link emmanuel, it seems to be a solution for my problem, I will test it! 2012/6/26 emmanuel segura > Look here > http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch09s03s03.html > > :-) > > 2012/6/26 coma > >> Hello, >> >> i running on a 2 node cluste