[Pacemaker] crm commands are blocked by starting services

2014-07-14 Thread Johan Huysmans
Hi All, I noticed that any changes made with crm are not applied to the cluster configuration while a resource is being started. I have a resource which can take some time to start (2 minutes). Whenever such a resource is being started, any change made by executing the crm command is held until th

Re: [Pacemaker] Pacemaker 1.1.12 cib testing, crm_mon doesn't work

2014-06-13 Thread Johan Huysmans
I tested with a custom-built libqb rpm (libqb-0.17.0-1.15.f8b4.dirty.el6.x86_64); however, that didn't solve the issue. So for now I will add the export of the PCMK_ipc_buffer to my bash_profile. thx Johan On 13-06-14 11:23, Johan Huysmans wrote: I exported the PCMK_ipc_buffer in my co
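The workaround mentioned in this message (exporting PCMK_ipc_buffer from a shell profile so that the CLI tools see the same value as the daemons) might be sketched as below. This is only an illustration: the helper name `ensure_export` is hypothetical, and the byte value in the usage comment is an assumed stand-in for "10M", not a value taken from the thread.

```shell
# Hypothetical helper: idempotently add "export NAME=VALUE" to a profile file.
# Arguments: $1 = variable name, $2 = value, $3 = profile file.
ensure_export() {
    name=$1
    value=$2
    file=$3
    touch "$file"
    # Drop any earlier export of the same variable (grep exits 1 when nothing
    # is left to print, hence "|| true"), then append the new line.
    grep -v "^export ${name}=" "$file" > "${file}.tmp" || true
    mv "${file}.tmp" "$file"
    printf 'export %s=%s\n' "$name" "$value" >> "$file"
}

# Usage (assumed value, in bytes):
# ensure_export PCMK_ipc_buffer 10485760 "$HOME/.bash_profile"
```

Running the helper twice leaves a single export line, so it can be repeated safely; a new login shell is needed before tools such as crm_mon pick up the value.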

Re: [Pacemaker] Pacemaker 1.1.12 cib testing, crm_mon doesn't work

2014-06-13 Thread Johan Huysmans
, Andrew Beekhof wrote: On 13 Jun 2014, at 5:35 pm, Johan Huysmans wrote: Hi, The PCMK_ipc_buffer was already set to 1000 (10M). Try setting that in your environment too (otherwise the CLI tools won't know about the new value). Or grab a newer libqb which _should_ do the right thing anyway

Re: [Pacemaker] Pacemaker 1.1.12 cib testing, crm_mon doesn't work

2014-06-13 Thread Johan Huysmans
If this file must store the complete cib, it is too small to hold ours, which could explain why our crm_mon isn't working while the write actions cause no problems. gr. Johan On 13-06-14 09:35, Johan Huysmans wrote: Hi, The PCMK_ipc_buffer was already set to 1000 (10M).

Re: [Pacemaker] Pacemaker 1.1.12 cib testing, crm_mon doesn't work

2014-06-13 Thread Johan Huysmans
is it something else? If I check the cib.xml files in /var/lib/pacemaker/cib/, all files are a bit smaller than 300K. Changing these buffers did not solve my problem of not getting results from crm_mon. Gr. Johan On 13-06-14 01:13, Andrew Beekhof wrote: On 12 Jun 2014, at 10:53 pm, Johan
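The size check described in this message (comparing the files under /var/lib/pacemaker/cib/ with the configured buffer) could be scripted roughly as follows. The function name `files_over_limit` and the limit in the usage comment are assumptions made for illustration; the sketch only looks at on-disk file sizes and says nothing about how Pacemaker sizes its IPC messages.

```shell
# Hypothetical helper: print "SIZE PATH" for every regular file in a
# directory whose size in bytes exceeds a limit.
# Arguments: $1 = directory, $2 = limit in bytes.
files_over_limit() {
    dir=$1
    limit=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Arithmetic expansion strips any padding that wc adds on some systems.
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -gt "$limit" ]; then
            printf '%s %s\n' "$size" "$f"
        fi
    done
}

# Usage (assumed limit, in bytes):
# files_over_limit /var/lib/pacemaker/cib 10485760
```

An empty result means no file in the directory is larger than the limit, which matches the observation in the message that the files stay below 300K.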

[Pacemaker] Pacemaker 1.1.12 cib testing, crm_mon doesn't work

2014-06-12 Thread Johan Huysmans
d type Restarting the pacemaker and cman service of 1 node didn't solve it. What is causing this problem and how can I resolve it ? Thx, Johan Huysmans ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/m

Re: [Pacemaker] resource with colocation rule doesn't fail

2013-08-12 Thread Johan Huysmans
/2013, at 12:28 PM, Andrew Beekhof wrote: On 02/08/2013, at 5:56 PM, Johan Huysmans wrote: Hi Andrew, Thanks for the fix. I tried it on my setup and now when a cloned resource fails the group will move to the other node as expected. However I noticed something strange. If a cloned resource is

Re: [Pacemaker] resource with colocation rule doesn't fail

2013-08-02 Thread Johan Huysmans
On 02-08-13 03:33, Andrew Beekhof wrote: On 01/08/2013, at 5:38 PM, Johan Huysmans wrote: I forgot to mention: I'm using a build from git (Version: 1.1.11-1.el6-42f2063). I used the same config on an old 1.1.10 rc (rc6 or before) and that worked; as of rc7 it didn't work anymor

Re: [Pacemaker] resource with colocation rule doesn't fail

2013-08-01 Thread Johan Huysmans
I forgot to mention: I'm using a build from git (Version: 1.1.11-1.el6-42f2063). I used the same config on an old 1.1.10 rc (rc6 or before) and that worked; as of rc7 it didn't work anymore. On 01-08-13 09:35, Johan Huysmans wrote: Hi, I have a cloned resource and a resource g

Re: [Pacemaker] 1.1.10 rc7 + final strange behaviour

2013-07-31 Thread Johan Huysmans
On 31-07-13 03:33, Andrew Beekhof wrote: On 30/07/2013, at 12:29 AM, Johan Huysmans wrote: Hi, I was testing the latest rc7 and the final version of 1.1.10. My test is the combination of cloned resources and 1 resource group containing constraints that the resource group can only run on a

Re: [Pacemaker] cib_process_diff: Failed application of an update diff

2013-07-16 Thread Johan Huysmans
Hi, Attached is some more logging of the failed diff: I hope this is sufficient information to help investigate the problem. Thx. Johan On 15-07-13 03:02, Andrew Beekhof wrote: On 11/07/2013, at 5:09 PM, Johan Huysmans wrote: Hi, Sorry about the missing info, here it is: OS: CentOS 6.4

Re: [Pacemaker] Different behaviour for cloned resource on 1 and 2 nodes

2013-07-15 Thread Johan Huysmans
Hi, I created a bug (http://bugs.clusterlabs.org/show_bug.cgi?id=5170) containing extra explanation and the crm_report. Thanks for taking the time to check this problem! gr. Johan On 15-07-13 04:11, Andrew Beekhof wrote: On 11/07/2013, at 12:21 AM, Johan Huysmans wrote: Hi All, I have

Re: [Pacemaker] cib_process_diff: Failed application of an update diff

2013-07-11 Thread Johan Huysmans
information about your stack: - OS - corosync version (or heartbeat) - pacemaker version - agent version - etc. Best regards Andreas -Original Message- From: Johan Huysmans [mailto:johan.huysm...@inuits.be] Sent: Wednesday, 10 July 2013 15:17 To: The Pacemaker cluster resource

[Pacemaker] Different behaviour for cloned resource on 1 and 2 nodes

2013-07-10 Thread Johan Huysmans
Hi All, I have a setup with a cloned resource and a resource group. I also configured some colocation and order rules in such a way that the group can only run where the cloned resource is running. On a 2 node setup I have no problems. The group moves away when the local clone resource fails. T

[Pacemaker] cib_process_diff: Failed application of an update diff

2013-07-10 Thread Johan Huysmans
: cib_process_diff: Diff 0.90.29 -> 0.90.30 from local not applied to 0.90.29: Failed application of an update diff stonith-ng[25994]: notice: update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an update diff failed (-206) thx! Johan Huysm

Re: [Pacemaker] Shutdown of pacemaker service takes 20 minutes

2013-06-04 Thread Johan Huysmans
Hi, Adding a timeout for the stop() operation worked like a charm. Thanks for the input! gr. Johan On 30-05-13 14:20, Florian Crouzat wrote: On 30/05/2013 13:57, Johan Huysmans wrote: When my resource has received the stop command, it will stop, but this takes some time. When the status

[Pacemaker] Shutdown of pacemaker service takes 20 minutes

2013-05-30 Thread Johan Huysmans
orkaround this issue? Greetings, Johan Huysmans

Re: [Pacemaker] Release candidate: 1.1.10-rc3

2013-05-23 Thread Johan Huysmans
Hi All, I've built an rpm as described below; however, I can see during the build that rc2 is used and there is no mention of rc3. It seems that there is no rc3 tag available: $ git tag -l | grep Pacemaker | sort -Vr | grep rc Pacemaker-1.1.10-rc2 Pacemaker-1.1.10-rc1 gr. Johan On 23-05-1
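The tag check quoted in this message can be wrapped in a small filter, sketched here under the assumption that tags follow the Pacemaker-X.Y.Z-rcN pattern shown above. The function name `latest_rc` is hypothetical, and `sort -V` is assumed to be available, as it is in the original command.

```shell
# Hypothetical helper: read tag names on stdin and print the newest
# release-candidate tag, or nothing if no rc tag is present.
latest_rc() {
    grep '^Pacemaker-.*-rc[0-9][0-9]*$' | sort -V | tail -n 1
}

# Usage inside a git checkout:
# git tag -l | latest_rc
```

If the printed tag is older than the announced release candidate, the checkout probably needs a `git fetch --tags`, or the tag has not been pushed yet.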

Re: [Pacemaker] failure handling on a cloned resource

2013-05-15 Thread Johan Huysmans
ed? For now I will use the head build I created. This is ok for my test setup, but I don't want to run this version in production. Greetings, Johan Huysmans On 2013-05-10 06:55, Andrew Beekhof wrote: Fixed! https://github.com/beekhof/pacemaker/commit/d87de1b On 10/05/2013, at 11:59

Re: [Pacemaker] failure handling on a cloned resource

2013-05-07 Thread Johan Huysmans
wrote: I have a much clearer idea of the problem you're seeing now, thank you. Could you attach /var/lib/pacemaker/pengine/pe-input-1.bz2 from CSE-1 ? On 03/05/2013, at 10:40 PM, Johan Huysmans wrote: Hi, Below you can see my setup and my test, this shows that my cloned resource with on

Re: [Pacemaker] failure handling on a cloned resource

2013-05-03 Thread Johan Huysmans
_3.log)
# crm resource status
 Resource Group: svc-cse
     ip_19 (ocf::heartbeat:IPaddr2): Stopped
     ip_11 (ocf::heartbeat:IPaddr2): Stopped
 Clone Set: cl_tomcat [d_tomcat]
     d_tomcat:0 (ocf::ntc:tomcat): Started (unmanaged) FAILED
     Stopped: [ d_tomcat:1 ]
As you can se

Re: [Pacemaker] failure handling on a cloned resource

2013-05-02 Thread Johan Huysmans
On 2013-05-01 05:48, Andrew Beekhof wrote: On 17/04/2013, at 9:54 PM, Johan Huysmans wrote: Hi All, I'm trying to set up a specific configuration in our cluster, however I'm struggling with my configuration. This is what I'm trying to achieve: On both nodes of the cluster a

Re: [Pacemaker] clear failcount when monitor is successful?

2013-04-24 Thread Johan Huysmans
bug and crm_report created: http://bugs.clusterlabs.org/show_bug.cgi?id=5021 gr. Johan On 24-04-13 13:40, Johan Huysmans wrote: On 24-04-13 13:24, Lars Marowsky-Bree wrote: On 2013-04-24T10:37:24, Johan Huysmans wrote: --> start situation * scope=status name=fail-count-d_tomcat valu

Re: [Pacemaker] clear failcount when monitor is successful?

2013-04-24 Thread Johan Huysmans
On 24-04-13 13:24, Lars Marowsky-Bree wrote: On 2013-04-24T10:37:24, Johan Huysmans wrote: --> start situation * scope=status name=fail-count-d_tomcat value=0 * dependent resource group running on node * crm_mon shows everything ok --> a failure occurs * scope=status name=fail

Re: [Pacemaker] clear failcount when monitor is successful?

2013-04-24 Thread Johan Huysmans
* BUT my resource is still monitored and failing! I find it disturbing that a resource with a failing monitor has a 0 failcount, shows ok in crm_mon and allows the dependent resources to run. gr. Johan On 24-04-13 08:35, Johan Huysmans wrote: I tried the failure-timeout. But I noticed

Re: [Pacemaker] clear failcount when monitor is successful?

2013-04-24 Thread Johan Huysmans
> > gr. > Johan > > On 24-04-13 07:23, Andrew Beekhof wrote: > > On 23/04/2013, at 11:24 PM, Johan Huysmans wrote: > >> Hi All, > >> > >> I have a cloned resource, running on my both nodes, my on-fail is set to > >> block. So i

Re: [Pacemaker] clear failcount when monitor is successful?

2013-04-23 Thread Johan Huysmans
ndrew Beekhof wrote: On 23/04/2013, at 11:24 PM, Johan Huysmans wrote: Hi All, I have a cloned resource, running on both my nodes; my on-fail is set to block. So if the resource fails on a node the failcount increases, but whenever the resource automatically recovers the failcount isn't reset.

[Pacemaker] clear failcount when monitor is successful?

2013-04-23 Thread Johan Huysmans
Hi All, I have a cloned resource, running on both my nodes; my on-fail is set to block. So if the resource fails on a node the failcount increases, but whenever the resource automatically recovers the failcount isn't reset. Is there a way to reset the failcount to 0, when the monitor is succe

[Pacemaker] rhel6 rpm provided by clusterlabs.org

2013-04-22 Thread Johan Huysmans
an as cluster infrastructure. Can this support be included in the rpm build for CentOS6, in such a way that it is a drop-in replacement for the pacemaker provided by CentOS6? Thx! Gr. Johan Huysmans

Re: [Pacemaker] failure handling on a cloned resource

2013-04-22 Thread Johan Huysmans
Hi All, I've created a bug for the issue I'm having: http://bugs.clusterlabs.org/show_bug.cgi?id=5154 I think this is a bug because it worked on older releases. Can someone verify whether it really is a bug, or just a configuration mistake? Thanks! Greetings, Johan Huysmans

[Pacemaker] failure handling on a cloned resource

2013-04-17 Thread Johan Huysmans
"
group svc-cse ip_1 ip_2
clone cl_tomcat d_tomcat
colocation colo_tomcat inf: svc-cse cl_tomcat
order order_tomcat inf: cl_tomcat svc-cse
property $id="cib-bootstrap-options" \
    dc-version="1.1.8-7.el6-394e906" \
    cluster-infrastructure="cman" \
    no-quoru