[Pacemaker] unexpected Error in Log files

2011-07-18 Thread rakesh
Hi, I configured a cluster consisting of four nodes and started Heartbeat/Pacemaker on all four. After some time the 4th node went down unexpectedly, and I found the following error messages while going through the log files such as ha-debug and messages.log. Can you please help me out

Re: [Pacemaker] The placement strategy of the group resource does not work well

2011-07-18 Thread Andrew Beekhof
This should now be fixed in: http://hg.clusterlabs.org/pacemaker/devel/rev/960a7e3da680 It's based on your patches but is a little more generic. On Mon, Jul 11, 2011 at 10:22 PM, Yuusuke IIDA wrote: > Hi, Yan > > After trying your correction, I had the following problems. > >  * When trouble

Re: [Pacemaker] A question and demand to a resource placement strategy function

2011-07-18 Thread Andrew Beekhof
This should also now be fixed in: http://hg.clusterlabs.org/pacemaker/devel/rev/960a7e3da680 On Tue, Jul 5, 2011 at 9:43 PM, Yuusuke IIDA wrote: > Hi, Andrew > > I know that there is the next processing in "pengine". > > # cat -n pengine/utils.c > [snip] >   322      /* now try to balance reso

Re: [Pacemaker] Cluster with DRBD : split brain

2011-07-18 Thread Andrew Beekhof
On Fri, Jul 15, 2011 at 7:58 PM, Hugo Deprez wrote: > Dear community, > > I am running on Debian Lenny, a cluster with corosync. I have : > > One DRBD partition and 4 resources : > > fs-data    (ocf::heartbeat:Filesystem): > mda-ip (ocf::heartbeat:IPaddr2): > postfix    (ocf::heartbeat:postfix

Re: [Pacemaker] user permissions to start / stop certain resource groups

2011-07-18 Thread Andrew Beekhof
On Mon, Jul 18, 2011 at 11:37 PM, Tegtmeier.Martin wrote: > Hello Andrew, > > is it possible to create user based permissions allowing certain OS users to > start / stop a resource group? I believe so. > These users should NOT be able to alter the > cluster / resource configuration. Well they d

Re: [Pacemaker] Trouble getting fence_apc working on RHEL 6.1

2011-07-18 Thread Andrew Beekhof
On Tue, Jul 19, 2011 at 10:03 AM, Digimer wrote: > On 07/18/2011 08:02 PM, Andrew Beekhof wrote: >> On Tue, Jul 19, 2011 at 9:42 AM, Digimer wrote: >>> On 07/18/2011 07:26 PM, Andrew Beekhof wrote: On Mon, Jul 18, 2011 at 7:07 AM, Uwe Grawert wrote: > Am 17.07.11 22:49, schrieb Digimer:

Re: [Pacemaker] crm_cluster_connect: Triggered fatal assert at cluster.c:65 : hb_conn != NULL

2011-07-18 Thread Andrew Beekhof
On Tue, Jul 19, 2011 at 9:58 AM, Andrew Beekhof wrote: > On Tue, Jul 19, 2011 at 1:17 AM, Nikita Michalko > wrote: >> Hi all! >> >> I have successfully configured and am running a 2-node cluster. While testing >> different scenarios I got this error. >> Situation: >> 1st node was running, the 2nd was

Re: [Pacemaker] Trouble getting fence_apc working on RHEL 6.1

2011-07-18 Thread Digimer
On 07/18/2011 08:02 PM, Andrew Beekhof wrote: > On Tue, Jul 19, 2011 at 9:42 AM, Digimer wrote: >> On 07/18/2011 07:26 PM, Andrew Beekhof wrote: >>> On Mon, Jul 18, 2011 at 7:07 AM, Uwe Grawert wrote: Am 17.07.11 22:49, schrieb Digimer: > On 07/17/2011 04:36 PM, Uwe Grawert wrote: >>

Re: [Pacemaker] Trouble getting fence_apc working on RHEL 6.1

2011-07-18 Thread Andrew Beekhof
On Tue, Jul 19, 2011 at 9:42 AM, Digimer wrote: > On 07/18/2011 07:26 PM, Andrew Beekhof wrote: >> On Mon, Jul 18, 2011 at 7:07 AM, Uwe Grawert wrote: >>> Am 17.07.11 22:49, schrieb Digimer: On 07/17/2011 04:36 PM, Uwe Grawert wrote: > Am 17.07.11 20:32, schrieb Digimer: >> I've been

Re: [Pacemaker] crm_cluster_connect: Triggered fatal assert at cluster.c:65 : hb_conn != NULL

2011-07-18 Thread Andrew Beekhof
On Tue, Jul 19, 2011 at 1:17 AM, Nikita Michalko wrote: > Hi all! > > I have successfully configured and am running a 2-node cluster. While testing > different scenarios I got this error. > Situation: > 1st node was running, the 2nd was rebooted and heartbeat started only on the > 1st node - it was OK,

Re: [Pacemaker] Trouble getting fence_apc working on RHEL 6.1

2011-07-18 Thread Digimer
On 07/18/2011 07:26 PM, Andrew Beekhof wrote: > On Mon, Jul 18, 2011 at 7:07 AM, Uwe Grawert wrote: >> Am 17.07.11 22:49, schrieb Digimer: >>> On 07/17/2011 04:36 PM, Uwe Grawert wrote: Am 17.07.11 20:32, schrieb Digimer: > I've been trying to get my APC switched PDU working with Pacemake

Re: [Pacemaker] Trouble getting fence_apc working on RHEL 6.1

2011-07-18 Thread Andrew Beekhof
On Mon, Jul 18, 2011 at 7:07 AM, Uwe Grawert wrote: > Am 17.07.11 22:49, schrieb Digimer: >> On 07/17/2011 04:36 PM, Uwe Grawert wrote: >>> Am 17.07.11 20:32, schrieb Digimer: I've been trying to get my APC switched PDU working with Pacemaker using Red Hat's fence_apc fence agent. The ag

Re: [Pacemaker] Group not started/stopped in correct order with -INF collocation

2011-07-18 Thread Andrew Beekhof
On Mon, Jul 18, 2011 at 6:38 PM, Kulovits Christian - OS ITSC < christian.kulov...@austrian.com> wrote: > Hi Andrew, > > But what about the -Inf collocation? The groups must not be active on the > same node. This implies an order in shutting down and starting up, doesn't it? > Coloca

Re: [Pacemaker] Upgrading from 1.0 to 1.1

2011-07-18 Thread Andrew Beekhof
On Fri, Jul 15, 2011 at 10:33 PM, Proskurin Kirill wrote: > Hello all. > > I found that I am using corosync with pacemaker "ver:0" with > pacemaker 1.1.5 installed, i.e. without starting pacemakerd. > > Sounds wrong. :-) > So I tried to upgrade. > I shut down one node. Changed 0 to 1 in service.d/pcmk > St

[Pacemaker] crm_cluster_connect: Triggered fatal assert at cluster.c:65 : hb_conn != NULL

2011-07-18 Thread Nikita Michalko
Hi all! I have successfully configured and am running a 2-node cluster. While testing different scenarios I got this error. Situation: 1st node was running, the 2nd was rebooted and heartbeat started only on the 1st node - it was OK, all resources were running on the 1st node. Then I removed on the 2

[Pacemaker] user permissions to start / stop certain resource groups

2011-07-18 Thread Tegtmeier.Martin
Hello Andrew, is it possible to create user based permissions allowing certain OS users to start / stop a resource group? These users should NOT be able to alter the cluster / resource configuration. Background: Different SAP Systems running inside one cluster. The hardware, OS and cluster adm

Re: [Pacemaker] Group not started/stopped in correct order with -INF collocation

2011-07-18 Thread Kulovits Christian - OS ITSC
Hi Andrew, But what about the -Inf collocation? The groups must not be active on the same node. This implies an order in shutting down and starting up, doesn't it? On the other hand, if both groups get started on different nodes there should not be any order dependency since startup of one grou