Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-29 Thread Andrew Beekhof
On Mon, Jul 30, 2012 at 2:21 PM, Vladislav Bogdanov wrote: > 30.07.2012 02:39, Andrew Beekhof wrote: >> On Tue, Jul 24, 2012 at 2:25 PM, Vladislav Bogdanov >> wrote: >>> 24.07.2012 04:50, Andrew Beekhof wrote: On Tue, Jul 24, 2012 at 5:38 AM, David Barchas wrote: > > On Monday, July

Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-29 Thread Vladislav Bogdanov
30.07.2012 02:39, Andrew Beekhof wrote: > On Tue, Jul 24, 2012 at 2:25 PM, Vladislav Bogdanov > wrote: >> 24.07.2012 04:50, Andrew Beekhof wrote: >>> On Tue, Jul 24, 2012 at 5:38 AM, David Barchas wrote: On Monday, July 23, 2012 at 7:48 AM, David Barchas wrote: Date: Mon,

Re: [Pacemaker] [Problem] Order which combined a master with clone is invalid.

2012-07-29 Thread renayama19661014
Hi Andrew, Thank you for the comments. > > Online: [ drbd1 drbd2 ] > > > > Master/Slave Set: msDrPostgreSQLDB > > Masters: [ drbd2 ] > > Slaves: [ drbd1 ] ---> Started and Status > > Slave. > > Yep, looks like a bug. I'll follow up on the bugzilla. I talked wi

Re: [Pacemaker] can't get pacemaker started

2012-07-29 Thread Andrew Beekhof
On Fri, Jul 27, 2012 at 9:31 AM, Dave Jiang wrote: > Hi. I'm following the cluster from scratch guide to create a simple > active/passive 2 node cluster. I'm using the standard packages that come > with Fedora 17. I have corosync running and linked up. However I cannot > seem to get Pacemaker to r

Re: [Pacemaker] pacemaker & LIO

2012-07-29 Thread Andrew Beekhof
On Fri, Jul 27, 2012 at 12:17 AM, wrote: > dear members of the pacemaker mailing list, > > i am using pacemaker on Debian GNU/Linux testing (wheezy) in combination with > LIO [1] and DRBD. The setup heavily relies on [2]. > After fiddling around I am able to move the iscsi storage from one > (s

Re: [Pacemaker] node offline after fencing (pacemakerd hangs)

2012-07-29 Thread Andrew Beekhof
On Fri, Jul 20, 2012 at 1:51 AM, Raoul Bhatia [IPAX] wrote: > On 2012-07-19 16:05, Jake Smith wrote: >> Another solution is something like (will vary a little in RHEL I believe): >> >> Disable corosync autostart >> $sudo update-rc.d -f corosync disable S >> >> add 'post-up /etc/init.d/corosync sta

Re: [Pacemaker] [Problem] Order which combined a master with clone is invalid.

2012-07-29 Thread Andrew Beekhof
On Mon, Jul 23, 2012 at 9:43 AM, wrote: > Hi David, > > Thank you for comments. > >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06s03s02.html > > I confirmed it in INFINITY. > > (snip) >rsc-role="Master" score="INFINITY" with-rsc="clnPingd"/> >sy

Re: [Pacemaker] How to notify one resource about changes in another resource?

2012-07-29 Thread Andrew Beekhof
On Fri, Jul 13, 2012 at 5:09 PM, Slava Astashonok wrote: > Hello, > > I wonder is it possible to notify one resource about changes in another > resource state. Suppose, I have two different services A and B. They run on > several nodes of cluster. I'd like to inform B about changes in A state. For

Re: [Pacemaker] resources not migrating when some are not runnable on one node, maybe because of groups or master/slave clones?

2012-07-29 Thread Andrew Beekhof
On Sat, Jun 30, 2012 at 1:59 AM, Phil Frost wrote: > On 06/28/2012 01:29 PM, David Vossel wrote: >> >> I've been looking into multistate resource colocations quite a bit this >> week. I have a branch I'm working with that may improve this situation for >> you. >> >> If you are feeling brave, test

Re: [Pacemaker] Live migration question.

2012-07-29 Thread Andrew Beekhof
On Thu, Jul 12, 2012 at 2:26 PM, David Pendell wrote: > I have two cluster nodes that have a gigabit network between them for doing > live migrations of running kvm VMs. If one of the two hosts go off line, > naturally all of the guests then get restarted on the other host. But when > the offline

Re: [Pacemaker] After reboot, node does not an automatically rejoin

2012-07-29 Thread Andrew Beekhof
On Thu, Jul 19, 2012 at 7:11 PM, Tom Tux wrote: > Hi > > When I reboot one of our two-node-cluster-boxes (sles11 sp1, fully > patched, HAE installed, the node does not rejoin himself to the > cluster. I got the following error: > > corosync[5377]: [pcmk ] WARN: route_ais_message: Sending message

Re: [Pacemaker] Orphaned resources

2012-07-29 Thread Andrew Beekhof
On Thu, Jul 12, 2012 at 12:48 AM, Thilo Uttendorfer wrote: > Hi, > > on several pacemaker clusters I sometimes see ORPHANED (DRBD) resources. In > some cases they exist only for a short time and are automatically removed. > But in other cases the resource will fail. "crm status" then looks like th

Re: [Pacemaker] Question about crm_attribute log messages every 20 seconds

2012-07-29 Thread Andrew Beekhof
On Wed, Jul 4, 2012 at 4:36 PM, Maurits van de Lande wrote: >>What Pacemaker version? > > 1.1.6.3.el6 x86_64 And these messages were from the system logfile? Or somewhere else? Because my reading of the 1.1.6 source code suggests that these messages would only go to logfiles and not syslog. > >

Re: [Pacemaker] 3 node cluster - two nodes get fenced/rebooted when one dies?

2012-07-29 Thread Andrew Beekhof
On Fri, Jul 6, 2012 at 8:25 AM, Errol Neal wrote: > Hi again. I was hoping to get some insight into why two nodes get rebooted in > my cluster when I halt one of them. > > I'm running corosync 1.1.4 and pacemaker-1.1.6 on CentOS 6.2. I've put my > configuration up on pastebin if anyone would

Re: [Pacemaker] [Question]View order of the node of crm_mon

2012-07-29 Thread nozawat
Hi Andrew, I'm sorry. I intended to mention the following patches. May this phenomenon think that a different correction is applied? Regards, Tomo 2012/7/30 Andrew Beekhof : > On Mon, Jul 30, 2012 at 11:24 AM, wrote: >>

Re: [Pacemaker] [Question]View order of the node of crm_mon

2012-07-29 Thread Andrew Beekhof
On Mon, Jul 30, 2012 at 11:24 AM, wrote: > Hi Andrew, > >> Can you talk to Mori-san and make sure its on his list? > I carry out regression-test for PM-1.0 affected by of the patch. > BTW, are you good by a different method in PM-1.1 that it was revised? > I was not able to find the patch history

Re: [Pacemaker] [Question]View order of the node of crm_mon

2012-07-29 Thread nozawatm
Hi Andrew, > Can you talk to Mori-san and make sure its on his list? I carry out regression-test for PM-1.0 affected by of the patch. BTW, are you good by a different method in PM-1.1 that it was revised? I was not able to find the patch history of this email in PM-1.1. Regards, Tomo On Mon, 30

Re: [Pacemaker] corosync init script doesn't invoke cib?

2012-07-29 Thread Andrew Beekhof
On Sun, Jul 22, 2012 at 2:10 AM, quanta wrote: > Distro: CentOS > corosync-1.4.3-1 > pacemaker-1.0.12-1.el5.centos > > On one node when I start corosync, it doesn't invoke cib and attrd: I assure you it does. However if they are not running anymore there must have been a problem. Perhaps loo

Re: [Pacemaker] Could not establish cib_rw connection

2012-07-29 Thread Andrew Beekhof
On Mon, Jul 23, 2012 at 10:23 PM, Martin Unger wrote: > I would be grateful for any other ideas... > Don't make me use haresources ;-) It's hard to say based on only a few lines of logs. Perhaps if you included some from the cib (i.e. the thing that no-one can connect to) we could say more.

Re: [Pacemaker] delaying stonith

2012-07-29 Thread Andrew Beekhof
On Wed, Jul 25, 2012 at 8:39 PM, Frank Van Damme wrote: > 2012/7/24 Lars Marowsky-Bree : >>> Doesn't that delay the startup of the stonith device resource in >>> pacemaker, rather than adding a number of seconds between the decision >>> to stonith and the actual stonithing? >> >> The effect is the

Re: [Pacemaker] Pengine behavior

2012-07-29 Thread Andrew Beekhof
On Fri, Jul 20, 2012 at 6:39 PM, Виталий Давудов wrote: > Hi, David! > > Yes, you are right, I'm trying to do active call failover. I hope to achieve > 3 secs silence during the call (now it's 5 secs). If there is any kind of > directive in corosync to monitor the node more aggressively (every 1 s

Re: [Pacemaker] resume monitor operation (bug 5063 + 5072)

2012-07-29 Thread Andrew Beekhof
On Tue, Jul 24, 2012 at 8:43 PM, Thilo Uttendorfer wrote: > Hi all, > > I ran into bug 5063 and/or 5072 while setting the cluster in maintenance mode > and back ("crm configure property maintenance-mode=true" and later "false"). > > I'm running pacemaker 1.1.6 (Martin's backport for Ubuntu lucid).

Re: [Pacemaker] [Question]View order of the node of crm_mon

2012-07-29 Thread Andrew Beekhof
On Mon, Jul 23, 2012 at 11:58 AM, wrote: > Hi Andrew > > Is the patch of this email taken in? > It is now: https://github.com/beekhof/pacemaker/commit/a1dcce2 > I want you to apply a patch if there is not a problem. Ca

Re: [Pacemaker] Reduce log level of retrying messages from pingd

2012-07-29 Thread Andrew Beekhof
On Tue, Jul 24, 2012 at 2:57 PM, Junko IKEDA wrote: > Hi, > > I set up two pingd RAs on the same nodes using Pacemaker 1.0.12. > Each pingd has the other destination. > > property \ > no-quorum-policy="ignore" \ > stonith-enabled="false" \ > startup-fencing="false" \ >

Re: [Pacemaker] problem with pacemaker/corosync on CentOS 6.3

2012-07-29 Thread Andrew Beekhof
On Tue, Jul 24, 2012 at 11:13 PM, wrote: > Hi, > > here are the results of the corosync status. Can't find a problem there: > > pilotpound: > > [root@pilotpound ~]# corosync-cfgtool -s > Printing ring status. > Local node ID 425699520 > RING ID 0 > id = 192.168.95.25 > status

Re: [Pacemaker] None of the standard agents in ocf:heartbeat are working in centos 6

2012-07-29 Thread Andrew Beekhof
On Tue, Jul 24, 2012 at 2:25 PM, Vladislav Bogdanov wrote: > 24.07.2012 04:50, Andrew Beekhof wrote: >> On Tue, Jul 24, 2012 at 5:38 AM, David Barchas wrote: >>> >>> On Monday, July 23, 2012 at 7:48 AM, David Barchas wrote: >>> >>> >>> Date: Mon, 23 Jul 2012 14:15:27 +0300 >>> From: Vladislav Bog

Re: [Pacemaker] Some errors after upgrading from heartbeat 2.1 cluster

2012-07-29 Thread Andrew Beekhof
On Tue, Jul 24, 2012 at 11:59 PM, Drew Morone wrote: > I replaced "target_role" with "target-role", but I get the same error > regarding "target-role" (eg. ERROR: DRBD_data: attribute target_role does > not exist). You have it defined twice. The error is complaining about this line: just r

Re: [Pacemaker] Help with N+1 configuration

2012-07-29 Thread Andrew Beekhof
On Sat, Jul 28, 2012 at 4:00 AM, Cal Heldenbrand wrote: > Would you be able to direct me to some documentation for configuring STONITH > based on my environment? The Clusters from Scratch document talks about a > fence_ipmilan driver, which I seem to not have on my centos 6 install. I > only sh

Re: [Pacemaker] "No respawn" for cluster resources

2012-07-29 Thread Andrew Beekhof
On Sat, Jul 28, 2012 at 12:25 AM, Stallmann, Andreas wrote: > Hi there! > > > > I just asked a few hours ago: > > > > Is there a way to stop crm/pacemaker from doing that (an automatic > respawn of a resource)? > > > > Thing is: This “automatic respawn” does not happen for our mysql m/s > resou

Re: [Pacemaker] Why was SBD removed from the RHEL/CentOS 6 cluster-glue/corosync/pacemaker packages?

2012-07-29 Thread Andrew Beekhof
On Sun, Jul 29, 2012 at 8:30 AM, mark - pacemaker list wrote: > Hello list, > > If you're building cluster-glue from source, it builds sbd. However, If you > install cluster-glue, corosync, and pacemaker from official repos, there is > no sbd binary. The deb for cluster-glue in Debian is version

Re: [Pacemaker] Complicated dependences between resources and nodes

2012-07-29 Thread Phil Frost
On 07/28/2012 06:46 AM, Antonis Christofides wrote: Hi, short questions: Is it possible to dictate that resource R1 runs on a different node than resource R2? Yes. http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06s04s02.html Is it possible when moving R1 from