Re: [Pacemaker] i try to replace current configure with empty file , but failed

2010-06-08 Thread Andrew Beekhof
On Mon, Jun 7, 2010 at 9:35 AM, ch huang wrote: > I made an empty file tmp.xml: [...] and issued the command "cibadmin --replace --xml-file tmp.xml", trying to erase > all configuration in the cluster, but it failed with an error > > how to do this (erase curre
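
A minimal sketch of the usual way to wipe the configuration, assuming a Pacemaker 1.0-era cibadmin (flag spellings can differ between versions, so check cibadmin --help before relying on them):

    # erase the whole CIB configuration in one step, skipping the confirmation
    cibadmin --erase --force

    # or replace only the configuration section with a hand-written skeleton
    cibadmin --replace -o configuration --xml-file tmp.xml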

Re: [Pacemaker] question re. upgrading Lenny

2010-06-08 Thread Andrew Beekhof
On Wed, Jun 9, 2010 at 2:37 AM, Miles Fidelman wrote: > Perhaps someone can offer a suggestion here. > > I'm currently running heartbeat2 crm or haresources? > on a 2-node Debian Lenny cluster.  I just > finished reading the Debian Lenny pacemaker install instructions at > > Those seem to be sta

Re: [Pacemaker] 2 node cluster with clvm, configuration help needed...

2010-06-08 Thread Andrew Beekhof
On Fri, Jun 4, 2010 at 10:03 AM, Dejan Muhamedagic wrote: > On Thu, Jun 03, 2010 at 07:57:59AM +0200, Andrew Beekhof wrote: >> On Wed, Jun 2, 2010 at 1:25 PM, wrote: >> > Hi, thanks for your reply. >> > I installed python-curses and xml, but it didn't help. >> >

Re: [Pacemaker] why crmd can not be finished?

2010-06-08 Thread Andrew Beekhof
On Mon, Jun 7, 2010 at 4:28 AM, ch huang wrote: > OK, now finished; the time was very long... If you have a database that takes an hour to shut down, this is normal. If not, there could be a bug. But you didn't include the config, so there is no way to know. > > Waiting for corosync services to > unload:.

Re: [Pacemaker] use_logd or use_mgmtd kills corosync

2010-06-08 Thread Steven Dake
On 06/08/2010 11:20 PM, Andrew Beekhof wrote: On Wed, Jun 9, 2010 at 7:27 AM, Devin Reade wrote: I was following the instructions for a new installation of corosync and was wanting to make use of hb_gui so, following an installation via yum per the docs, built Pacemaker-Python-GUI-pacemaker-mgm

Re: [Pacemaker] Configuration 1.0 Explained pdf missing

2010-06-08 Thread Andrew Beekhof
Sorry, publican changed the generated file name. I'll fix the link (btw. http://www.clusterlabs.org/doc has a nice auto-generated index which always works :-) On Tue, Jun 8, 2010 at 2:15 AM, David&Donna Livingstone wrote: > > From the http://www.clusterlabs.org/wiki/Documentation page the > Confi

Re: [Pacemaker] use_logd or use_mgmtd kills corosync

2010-06-08 Thread Andrew Beekhof
On Wed, Jun 9, 2010 at 7:27 AM, Devin Reade wrote: > I was following the instructions for a new installation of corosync > and was wanting to make use of hb_gui so, following an installation > via yum per the docs, built Pacemaker-Python-GUI-pacemaker-mgmt-2.0.0 > from source. > > Starting corosyn

[Pacemaker] use_logd or use_mgmtd kills corosync

2010-06-08 Thread Devin Reade
I was following the instructions for a new installation of corosync and was wanting to make use of hb_gui so, following an installation via yum per the docs, built Pacemaker-Python-GUI-pacemaker-mgmt-2.0.0 from source. Starting corosync works normally without mgmtd in the picture, but as soon as *
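
For reference, the plugin options named in the subject normally live in the corosync service stanza that loads the Pacemaker plugin; a sketch assuming the 2010-era plugin syntax (key placement and values should be checked against the pacemaker-mgmt README for your build):

    # in corosync.conf, or a file under /etc/corosync/service.d/
    service {
        name:      pacemaker
        ver:       0
        use_mgmtd: yes    # launch mgmtd so hb_gui can connect
        use_logd:  yes    # send logging through ha_logd
    }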

[Pacemaker] Specifying stop and demote order

2010-06-08 Thread Stefan Foerster
Hello world, I have a number of primitives and a master/slave resource which need to be started in a given order. At first I was thinking about using a "group", but as things go with groups, seemingly independent resources might stop working if one of the group's resources fails (you can't specify
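
One way to express this without a group is a pair of explicit ordering constraints; a sketch in crm shell syntax with placeholder resource names (p_app and ms_db are assumptions, not taken from the original post):

    # promote the master/slave resource before starting the application ...
    order ord_start inf: ms_db:promote p_app:start symmetrical=false
    # ... and spell out the reverse explicitly: stop the app before demoting
    order ord_stop  inf: p_app:stop ms_db:demote symmetrical=false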

Re: [Pacemaker] Cluster frozen after "crm resource cleanup"

2010-06-08 Thread Stefan Foerster
* Dejan Muhamedagic : > > http://www.incertum.net/~cite/messages.mudslide1 > > http://www.incertum.net/~cite/messages.mudslide2 [...] > Please make a hb_report for this incident and open a bugzilla. Unfortunately, I had some time constraints, so I completely forgot to capture that hb_report. Howev
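
For the record, a report for a past incident can usually still be collected after the fact, provided the logs have not rotated away; a sketch with hypothetical times and destination:

    # gather logs and PE inputs from all nodes for the given time window
    hb_report -f "2010-06-08 08:00" -t "2010-06-08 12:00" /tmp/cleanup-freeze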

[Pacemaker] Clones restart on node recovery

2010-06-08 Thread jraditch...@gmail.com
Hi, hopefully someone can help. I have little experience with Pacemaker and possibly I am doing something wrong. I have the following design: two hardware nodes. Part of the services are 100% redundant on both nodes - we use Clones for them; they are redundant and essential for the system to run at least
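
A frequent cause of clone instances restarting when a node rejoins is an ordering dependency on a non-interleaved clone; a sketch of the usual mitigation in crm shell syntax, with placeholder names:

    primitive p_service lsb:myservice op monitor interval="30s"
    clone cl_service p_service \
        meta clone-max="2" interleave="true"   # instances depend only on their local peer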

[Pacemaker] question re. upgrading Lenny

2010-06-08 Thread Miles Fidelman
Perhaps someone can offer a suggestion here. I'm currently running heartbeat2 on a 2-node Debian Lenny cluster. I just finished reading the Debian Lenny pacemaker install instructions at Those seem to be start-from-scratch instructions. Can anyone offer any suggestions re. : 1. migrating (

Re: [Pacemaker] pingd problems

2010-06-08 Thread Dalibor Dukic
On Tue, 2010-06-08 at 19:08 +0200, Dejan Muhamedagic wrote: > Not sure, but I think that the default for the attribute name is > "pingd". Try changing L3_ping to pingd in the constraints. Dejan, thanks a lot for pointing out the error, I really appreciate it. I've changed the attribute name to 'pin
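
The constraint in question would then look roughly like this (a sketch; rsc_ip is a placeholder name):

    location loc_ip_on_connected rsc_ip \
        rule -inf: not_defined pingd or pingd lte 0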

Re: [Pacemaker] Issues with constraints - working for start/stop, being ignored on "failures"

2010-06-08 Thread Dejan Muhamedagic
Hi, On Sun, Jun 06, 2010 at 07:07:32PM -0600, Tim Serong wrote: > On 6/2/2010 at 11:10 AM, Cnut Jansen wrote: > > Am 31.05.2010 05:47, schrieb Tim Serong: > > > On 5/31/2010 at 12:57 PM, Cnut Jansen wrote: > > > > > >> Current constraints: > > >> colocation TEST_colocO2cb inf: cloneO2cb clo

Re: [Pacemaker] Pacemaker resource management

2010-06-08 Thread Dejan Muhamedagic
Hi, On Sun, Jun 06, 2010 at 05:47:35PM +0300, Dan Frincu wrote: > Hello all, > > I have a couple of questions and I haven't found any relevant > documentation about it so I would appreciate any answers on the > matter. > > I'm using drbd 8.3.2-6 with pacemaker 1.0.5-4.2, openais 0.80.5-15.2 > an
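
For context, a typical DRBD master/slave definition for that software combination looks roughly like the following crm shell sketch; resource names and intervals are placeholders, not taken from the poster's configuration:

    primitive p_drbd ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="29s" role="Master" \
        op monitor interval="31s" role="Slave"
    ms ms_drbd p_drbd \
        meta master-max="1" master-node-max="1" \
             clone-max="2" clone-node-max="1" notify="true"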

Re: [Pacemaker] active-active setup with crm clone and load balancing

2010-06-08 Thread Dejan Muhamedagic
Hi, On Sun, Jun 06, 2010 at 03:11:10PM +0200, Tomas Kouba wrote: > Hello all, > > I am running a simple information system that does not need > backend storage. > I would like to run it in two instances on two nodes and have them in a > highly available and load-balancing setup. So the beha
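
One common pattern for this is a globally unique IPaddr2 clone, which uses CLUSTERIP hashing to share a single address across both nodes; a sketch with placeholder names and addresses:

    primitive p_vip ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.100" cidr_netmask="24" clusterip_hash="sourceip" \
        op monitor interval="10s"
    clone cl_vip p_vip \
        meta globally-unique="true" clone-max="2" clone-node-max="2"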

Re: [Pacemaker] pingd problems

2010-06-08 Thread Dejan Muhamedagic
Hi, On Tue, Jun 08, 2010 at 06:43:11PM +0200, Dalibor Dukic wrote: > On Sat, 2010-06-05 at 15:36 +0200, Dalibor Dukic wrote: > > I have a problem with the ping RA not correctly updating the CIB with the appropriate > > attributes on a fresh start, so afterwards IPaddr2 resources won't > > start. > > Have

Re: [Pacemaker] pingd problems

2010-06-08 Thread Dalibor Dukic
On Sat, 2010-06-05 at 15:36 +0200, Dalibor Dukic wrote: > I have a problem with the ping RA not correctly updating the CIB with the appropriate > attributes on a fresh start, so afterwards IPaddr2 resources won't > start. Has anyone had a chance to take a peek at this? My setup consists of two nodes doi
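
For context, the kind of ping clone being discussed typically looks like this crm shell sketch (hosts, multiplier and intervals are placeholders):

    primitive p_ping ocf:pacemaker:ping \
        params host_list="192.168.1.1 192.168.1.2" multiplier="100" dampen="5s" \
        op monitor interval="15s" timeout="20s"
    clone cl_ping p_ping meta globally-unique="false"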

Re: [Pacemaker] Cluster split brain on vmware VSphere

2010-06-08 Thread Dejan Muhamedagic
Hi, On Mon, Jun 07, 2010 at 02:57:57PM +0200, Torresani, Roberto wrote: > Sorry for having chosen the wrong ml... That's no problem. There's just a better chance of getting help on the other list. > Here is the corosync.conf used by one cluster; the other one is > just the same as provided by the epel r

Re: [Pacemaker] Service failback issue with SLES11 and HAE 11

2010-06-08 Thread Dejan Muhamedagic
Hi, On Tue, Jun 08, 2010 at 10:00:37AM +0800, ben180 wrote: > Dear all, > > There are two nodes in my customer's environment. We installed SuSE > Linux Enterprise Server 11 and HAE on the two nodes. The cluster is for > Oracle database service HA purposes. > We have set up a clone resource for pingd, an
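
Failback behaviour is largely governed by resource stickiness: a value of 0 lets resources move back freely once the preferred node returns, while a high value keeps them where they are. A sketch of setting it cluster-wide in crm shell syntax (the value 200 is only illustrative):

    # make running resources prefer to stay on their current node
    rsc_defaults resource-stickiness="200"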

Re: [Pacemaker] Cluster frozen after "crm resource cleanup"

2010-06-08 Thread Dejan Muhamedagic
Hi, On Tue, Jun 08, 2010 at 05:28:20PM +0200, Stefan Foerster wrote: > This morning, I wanted to do a "cleanup" on a "ping" resource (which > at the time was in a "started" state but had a fail-count of 3). After > that operation, the cluster didn't do any monitor operations and > refused to do any

[Pacemaker] Cluster frozen after "crm resource cleanup"

2010-06-08 Thread Stefan Foerster
This morning, I wanted to do a "cleanup" on a "ping" resource (which at the time was in a "started" state but had a fail-count of 3). After that operation, the cluster didn't do any monitor operations and refused to do anything else. Below is a short excerpt of the messages file from the first node,
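
For completeness, the cleanup and a quick follow-up check look roughly like this (the resource name is a placeholder):

    # clear the resource's operation history and fail count, then recheck status
    crm resource cleanup p_ping
    crm_mon -1 -f    # one-shot status; -f also shows fail counts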