Re: [Pacemaker] Failover configuration question

2013-10-15 Thread Sean Lutner
On Oct 15, 2013, at 7:12 PM, Andrew Beekhof wrote: > > On 04/10/2013, at 12:51 AM, Sean Lutner wrote: > >> Hello, >> I'm hoping to get some assistance with a cluster configuration I'm currently >> working on. >> >> The cluster is built on CentOS 6.4 Amazon EC2 systems with: >> - pacema

Re: [Pacemaker] Service restoration in clone resource group

2013-10-15 Thread Sean Lutner
On Oct 15, 2013, at 6:21 PM, Andrew Beekhof wrote: > > On 10/10/2013, at 12:52 PM, Sean Lutner wrote: > >> >> On Oct 8, 2013, at 9:45 AM, Sean Lutner wrote: >> >>> >>> On Oct 8, 2013, at 9:33 AM, Lars Marowsky-Bree wrote: >>> On 2013-10-08T09:29:14, Sean Lutner wrote: >

Re: [Pacemaker] Failover configuration question

2013-10-15 Thread Andrew Beekhof
On 04/10/2013, at 12:51 AM, Sean Lutner wrote: > Hello, > I'm hoping to get some assistance with a cluster configuration I'm currently > working on. > > The cluster is built on CentOS 6.4 Amazon EC2 systems with: > - pacemaker-1.1.8-7.el6.x86_64 > - cman-3.0.12.1-49.el6_4.2.x86_64

Re: [Pacemaker] do_state_transition: Starting PEngine Recheck Timer

2013-10-15 Thread Andrew Beekhof
On 10/10/2013, at 4:12 PM, Xiaomin Zhang wrote: > Hi, Gurus: > Do you ever see the message: > do_state_transition: Starting PEngine Recheck Timer > occurring when doing fail over? > I found that while PEngine is starting the rechecker timer, the cluster seems > split brain, that all nodes are n

Re: [Pacemaker] Service restoration in clone resource group

2013-10-15 Thread Andrew Beekhof
On 10/10/2013, at 12:52 PM, Sean Lutner wrote: > > On Oct 8, 2013, at 9:45 AM, Sean Lutner wrote: > >> >> On Oct 8, 2013, at 9:33 AM, Lars Marowsky-Bree wrote: >> >>> On 2013-10-08T09:29:14, Sean Lutner wrote: >>> The clone was created using the interleave=true option, yes. You mi

Re: [Pacemaker] Bug? failed to stonith with fence_ipmilan on CentOS6.2

2013-10-15 Thread Andrew Beekhof
On 09/10/2013, at 1:53 PM, Xiaomin Zhang wrote: > I think I know why this happened after I enabled 'verbose' for fence_ipmilan. > When I first configured stonith, I set lanplus to true; however, my machine > is not an HP one, so lanplus is not supported. When I noticed this, I used 'crm > configur

Re: [Pacemaker] [pacemaker] DRBD + corosync + pacemaker + postgresql

2013-10-15 Thread emmanuel segura
Check whether your postgres is stopped, or is being started at boot time. 2013/10/15 Thomaz Luiz Santos > dear all :-D > > I remade my crm config > > node ha-master > node ha-slave > primitive drbd_postgresql ocf:linbit:drbd \ > params drbd_resource="postgresql" \ > op monitor interval="

Re: [Pacemaker] pacemaker shutdown under high load

2013-10-15 Thread Andrew Beekhof
On 09/10/2013, at 10:53 PM, Alessandro Bono wrote: > Hi > > > This weekend my pacemaker shut down on the primary node during a machine backup. > Attached is the compressed log of the primary node; the logs of the secondary node are too big, > if needed I can provide them as an external link. > Inspecting the logs I found these erro

Re: [Pacemaker] Rule constraint monitoring interval

2013-10-15 Thread Andrew Beekhof
On 10/10/2013, at 3:22 AM, Sam Gardner wrote: > As I understand it, there are two ways to monitor the status of a resource. > > 1) Use the monitor action on the resource agent script - this is equivalent > to polling the resource at every > > 2) Write a value into the cib, and create a const

Re: [Pacemaker] Missing ha_logger command

2013-10-15 Thread Andrew Beekhof
On 10/10/2013, at 3:46 AM, Dejan Muhamedagic wrote: > On Wed, Oct 09, 2013 at 06:07:40PM +0200, Patrick Lists wrote: >> On 10/09/2013 04:03 PM, Dejan Muhamedagic wrote: >> [snip] Sorry, I missed to mention the version of cluster-glue. It is installed on the nodes: clus

Re: [Pacemaker] ping monitor

2013-10-15 Thread Andrew Beekhof
On 14/10/2013, at 7:51 PM, s.oreilly wrote: > Hi, > > I am setting up a 2 node mysql cluster using pcs instead of crm for the first > time and have everything working nicely except my ping monitor. > Ping is running on both nodes but I can't figure out how to configure the > location resource t

Re: [Pacemaker] Question about the resource to fence a node

2013-10-15 Thread Andrew Beekhof
On 15/10/2013, at 8:24 PM, Kazunori INOUE wrote: > Hi, > > I'm using pacemaker-1.1 (the latest devel). > I started resource (f1 and f2) which fence vm3 on vm1. > > $ crm_mon -1 > Last updated: Tue Oct 15 15:16:37 2013 > Last change: Tue Oct 15 15:16:21 2013 via crmd on vm1 > Stack: corosync >

Re: [Pacemaker] Colocation constraint to External Managed Resource

2013-10-15 Thread Lars Ellenberg
On Tue, Oct 15, 2013 at 11:28:14PM +0200, Robert H. wrote:
> Hi,
>
> I finally got it working.
>
> I had to set cluster-recheck-interval="5m" or some other value and

Right. Failure timeout is only evaluated on the next pengine run, so if nothing else happens, it takes up to recheck-interval ...
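To make the timing point above concrete, here is a minimal Python sketch of when an expired failure would actually be noticed. It is a toy model, not Pacemaker code: it assumes the policy engine runs only on a fixed recheck tick counted from time 0 and that nothing else triggers a run in between. The function names are hypothetical and all times are in seconds.

```python
def next_recheck_at_or_after(t, recheck_interval):
    """Return the first recheck tick at or after time t (seconds).

    Assumption: rechecks fire at 0, recheck_interval, 2*recheck_interval, ...
    """
    # Integer ceiling division, avoiding floating point.
    ticks = -(-t // recheck_interval)
    return ticks * recheck_interval


def failure_cleared_at(failure_time, failure_timeout, recheck_interval):
    """Time at which an expired failure is first noticed in this toy model."""
    # The failure becomes eligible for expiry once failure-timeout has passed...
    expires_at = failure_time + failure_timeout
    # ...but it is only acted on at the next policy engine run.
    return next_recheck_at_or_after(expires_at, recheck_interval)


if __name__ == "__main__":
    five_minutes = 300
    for failed_at in (0, 60, 299):
        cleared = failure_cleared_at(failed_at, five_minutes, five_minutes)
        extra_wait = cleared - (failed_at + five_minutes)
        print(f"failed at {failed_at}s: noticed at {cleared}s "
              f"(extra wait {extra_wait}s)")
```

In this model the extra wait after the failure timeout has elapsed is always less than one recheck interval, which matches the "takes up to recheck-interval" remark.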

Re: [Pacemaker] Colocation constraint to External Managed Resource

2013-10-15 Thread Robert H.
Hi, I finally got it working. I had to set cluster-recheck-interval="5m" or some other value and had to set failure-timeout to the same value (failure-timeout="5m"). This causes a "probe" after 5 minutes and then the cluster shows the correct state and reevaluates the engine. So the very st

Re: [Pacemaker] [pacemaker] DRBD + corosync + pacemaker + postgresql

2013-10-15 Thread Thomaz Luiz Santos
dear all :-D

I remade my crm config

node ha-master
node ha-slave
primitive drbd_postgresql ocf:linbit:drbd \
    params drbd_resource="postgresql" \
    op monitor interval="30" role="Master" \
    op monitor interval="33" role="Slave"
primitive fs_postgresql ocf:heartbeat:Filesystem

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread Lars Marowsky-Bree
On 2013-10-15T17:05:39, Robert Lindgren wrote: > Worked like a charm, except: > > crm(live)configure# simulate actions nograph > > which I guess is only available in newer versions. If you have an older version, you can run "ptest" instead of "simulate". ("ptest" of course still works in ne

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread Robert Lindgren
Worked like a charm, except: crm(live)configure# simulate actions nograph which I guess is only available in newer versions. So big thanks Lars! On Tue, Oct 15, 2013 at 12:43 PM, Robert Lindgren wrote: > Excellent, thanks Lars, this looks like the proper way forward! Cheers > > > On Tue, Oct 1

Re: [Pacemaker] Monitoring on master node not running after standby is connected

2013-10-15 Thread Juraj Fabo
Juraj Fabo writes: > > Hello Andrew > > > thank you for the response. > > I've patched crmd, cleaned the cluster, done the scenario steps and created crm_report which is attached. > > After loading the cluster configuration both nodes were running the IFDS- Stateful resource monitoring prope

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread Robert Lindgren
Excellent, thanks Lars, this looks like the proper way forward! Cheers On Tue, Oct 15, 2013 at 12:38 PM, Lars Marowsky-Bree wrote: > On 2013-10-15T09:39:25, Robert Lindgren wrote: > > What I'd do is to backup, then wipe the cluster configuration > (/var/lib/pacemaker/cib/*), restart with the empty

Re: [Pacemaker] Question about the resource to fence a node

2013-10-15 Thread Lars Marowsky-Bree
On 2013-10-15T18:24:46, Kazunori INOUE wrote: > Oct 15 15:17:16 vm2 stonith-ng[9160]: warning: log_operation: f1:9273 > [ Performing: stonith -t external/libvirt -T reset vm3 ] > Oct 15 15:17:46 vm2 stonith-ng[9160]: warning: log_operation: f1:9588 > [ Performing: stonith -t external/libvirt -T

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread Lars Marowsky-Bree
On 2013-10-15T09:39:25, Robert Lindgren wrote:

What I'd do is back up, then wipe the cluster configuration (/var/lib/pacemaker/cib/*), and restart with the empty configuration (which will also help with ids that have changed etc). And then:

# crm configure
crm(live)configure# load xml replace /pat

Re: [Pacemaker] node trying to run resource even in standby mode

2013-10-15 Thread Lars Marowsky-Bree
On 2013-10-10T14:41:49, Lev Sidorenko wrote: > When I create resource like: > # pcs resource create myres lsb:myres > it is created and can see straight away in crm_mon: > - > myres (lsb:myres):Started node3 (unmanaged) FAILED > Failed actions: > myres_stop

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread emmanuel segura
Back up your current cib.xml, modify it by hand, and then try to start the cluster. 2013/10/15 Robert Lindgren > Yeah it's a nice idea, but servers are at datacenter, some hours away. > > > On Tue, Oct 15, 2013 at 10:42 AM, Florian Crouzat < > gen...@floriancrouzat.net> wrote: > >> Le 15/1

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread Robert Lindgren
Yeah, it's a nice idea, but the servers are at a datacenter some hours away. On Tue, Oct 15, 2013 at 10:42 AM, Florian Crouzat wrote: > On 15/10/2013 09:39, Robert Lindgren wrote: > > I have a cluster that is offline, and I can't start it to do edits >> (since IPs and so on will conflict with old cl

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread emmanuel segura
+1 2013/10/15 Florian Crouzat > On 15/10/2013 09:39, Robert Lindgren wrote: > > I have a cluster that is offline, and I can't start it to do edits >> (since IPs and so on will conflict with the old cluster). What is the preferred >> way of doing the edits (change IPs) so that I can start the clust

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread Florian Crouzat
On 15/10/2013 09:39, Robert Lindgren wrote: I have a cluster that is offline, and I can't start it to do edits (since IPs and so on will conflict with the old cluster). What is the preferred way of doing the edits (change IPs) so that I can start the cluster? Can't you start only one of the node un

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread Robert Lindgren
All my virtual IPs are surely in the cib.xml, trust me on that. On Tue, Oct 15, 2013 at 10:12 AM, emmanuel segura wrote: > Why modify the cib.xml? In cib.xml there is no reference to IPs; I think > you have hostnames there. I think you need to edit /etc/hosts and try to > start the cluster again >

Re: [Pacemaker] Offline Cluster edit

2013-10-15 Thread emmanuel segura
Why modify the cib.xml? In cib.xml there is no reference to IPs; I think you have hostnames there. I think you need to edit /etc/hosts and try to start the cluster again. 2013/10/15 Robert Lindgren > Hi, > > I have a cluster that is offline, and I can't start it to do edits (since > IPs and so wil

[Pacemaker] Offline Cluster edit

2013-10-15 Thread Robert Lindgren
Hi, I have a cluster that is offline, and I can't start it to do edits (since IPs and so on will conflict with the old cluster). What is the preferred way of doing the edits (change IPs) so that I can start the cluster? Will a normal vim edit on cib.xml be good enough? I read that wasn't an option in th