[Pacemaker] Stopping only / last stonith resource ... cluster will continue to function ?

2013-02-03 Thread David Morton
Afternoon, We're replacing a production cluster (2 node) with a new, completely separate 2 node cluster later this week. I've removed one node already to make way (physically) for the new machines, and everything is working just fine as a one-node cluster. The issue is that the monit
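Not shown in the snippet, but retiring the departed node and its fencing device usually reduces to a few crm shell commands. A minimal sketch, with hypothetical resource and node names, assuming the old node has already been powered off:

    # Stop the stonith resource that targeted the removed node, then drop the
    # node itself from the configuration (names hypothetical):
    crm resource stop st-oldnode2
    crm configure delete st-oldnode2
    crm node delete oldnode2
    # If that leaves no fencing device at all, recovery after a failure will
    # block unless fencing is explicitly disabled (generally not recommended):
    crm configure property stonith-enabled=false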

Re: [Pacemaker] best/proper way to shut down a node for service

2013-01-23 Thread David Morton
Indeed ... that's the correct behavior, as it was still an active cluster member; it just happens to not be running any resources as it's in standby. If you shut down (gracefully) openais and it's showing happily as 'offline' on the remaining node(s) then all will be well. On 24 January 2013 10:28, Br
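For reference, the 'offline' state mentioned here can be checked from the surviving node with the standard status tools:

    # One-shot status; a cleanly stopped peer shows as OFFLINE rather than
    # UNCLEAN (the latter would trigger fencing where configured):
    crm_mon -1
    crm status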

Re: [Pacemaker] best/proper way to shut down a node for service

2013-01-23 Thread David Morton
I've asked this before; you should be able to search for the question. Essentially, if Pacemaker is shut down gracefully the remaining nodes are happy to leave it be. Generally I standby the node and then stop openais ... I have been caught out once bringing a node back online which was in standby. The
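A minimal sketch of that sequence in crm shell on SLES (node name hypothetical; on other distributions the init script for openais/corosync differs):

    # Drain the node first, then stop the membership layer gracefully:
    crm node standby node1
    rcopenais stop        # SLES init script wrapper for openais/corosync
    # Perform the service work, then bring the stack back:
    rcopenais start
    # Clear standby, otherwise the node rejoins but runs nothing:
    crm node online node1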

Re: [Pacemaker] Storage Locking To Prevent Split Brain

2012-10-24 Thread David Morton
ber 2012 08:40, David Morton wrote: > We're changing our SAN shortly and I'm putting together the procedure / > config now for the shared storage. This will be based on XFS on top of > clustered LVM2 via Pacemaker. > > I've implemented the exclusive=yes directive on t

[Pacemaker] Storage Locking To Prevent Split Brain

2012-10-17 Thread David Morton
We're changing our SAN shortly and I'm putting together the procedure / config now for the shared storage. This will be based on XFS on top of clustered LVM2 via Pacemaker. I've implemented the exclusive=yes directive on the LVM resources (volume groups) but I am still able to mount on both cluste
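For context, exclusive activation is a property of the ocf:heartbeat:LVM resource; the mount itself still has to be cluster-managed (and kept out of /etc/fstab) for a non-cluster filesystem like XFS to stay single-node. A hedged sketch, IDs and paths hypothetical:

    primitive p-vg-data ocf:heartbeat:LVM \
        params volgrpname="vg_data" exclusive="true" \
        op monitor interval="60s" timeout="30s"
    primitive p-fs-data ocf:heartbeat:Filesystem \
        params device="/dev/vg_data/lv_data" directory="/data" fstype="xfs" \
        op monitor interval="60s" timeout="40s"
    # The group keeps activation and mount together on whichever node wins:
    group g-data p-vg-data p-fs-data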

[Pacemaker] STONITH Monitor failure ... what should happen ?

2012-09-19 Thread David Morton
Simple question for today ... If a STONITH monitor fails, what is the designed behavior? In our scenario we are using external/ipmi STONITH resources which interact with the IBM IMM (out-of-band management) controller. From what I can see at the moment it's possible for the STONITH monitor to fa
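The thread does not include the configuration, but a typical external/ipmi device for an IBM IMM looks roughly like the sketch below (addresses and credentials hypothetical). The monitor op is what polls the IMM, and its failure is handled by the normal resource-failure logic rather than by fencing anything:

    primitive st-node2 stonith:external/ipmi \
        params hostname="node2" ipaddr="10.0.0.12" userid="stonith" \
               passwd="secret" interface="lanplus" \
        op monitor interval="300s" timeout="60s"
    # Keep the device off the node it is meant to fence:
    location l-st-node2-placement st-node2 -inf: node2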

Re: [Pacemaker] Clustered LVM in failover cluster

2012-09-10 Thread David Morton
We will have one OCFS2 volume, so DLM and cLVM are requirements there either way. On Mon, Sep 10, 2012 at 8:44 PM, Lars Marowsky-Bree wrote: > On 2012-09-10T11:06:15, David Morton wrote: > > > Not directly Pacemaker related but a simple question for those who manage > cluste
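On SLES-style setups the DLM/cLVM (and, for OCFS2, o2cb) pieces normally run as a cloned group on every node; a hedged sketch with hypothetical IDs (agent provider names vary by distribution):

    primitive p-dlm ocf:pacemaker:controld \
        op monitor interval="60s" timeout="60s"
    primitive p-o2cb ocf:ocfs2:o2cb \
        op monitor interval="60s" timeout="60s"
    primitive p-clvmd ocf:lvm2:clvmd \
        op monitor interval="60s" timeout="60s"
    group g-storage-base p-dlm p-o2cb p-clvmd
    clone cl-storage-base g-storage-base meta interleave="true"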

[Pacemaker] Clustered LVM in failover cluster

2012-09-09 Thread David Morton
Not directly Pacemaker related but a simple question for those who manage clusters ... I'm recreating our failover cluster on a new SAN which will use cLVM and XFS; I am after some clarification that the "--clustered y" directive is the correct and appropriate way to create the volume groups as th
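For what it's worth, creating a clustered volume group is a one-liner once clvmd is running and LVM uses cluster locking; a sketch with a hypothetical device path:

    # Requires locking_type = 3 in /etc/lvm/lvm.conf and clvmd running on the
    # node where this is executed:
    vgcreate --clustered y vg_data /dev/mapper/san_lun01
    lvcreate -n lv_data -L 100G vg_data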

Re: [Pacemaker] Mostly STONITH Questions / Seeking Best Practice

2012-09-05 Thread David Morton
Thanks for the feedback Lars ... more information / questions below: On Wed, Sep 5, 2012 at 9:53 PM, Lars Marowsky-Bree wrote: > On 2012-09-04T16:31:54, David Morton wrote: > > > 1) I'm planning on implementing sfex resources (a small LVM volume on the > > same volume
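The sfex agent referenced here takes a small shared block device as its lock area; a minimal sketch with hypothetical IDs and device, placed ahead of the data resources it is meant to protect:

    primitive p-sfex ocf:heartbeat:sfex \
        params device="/dev/vg_data/lv_sfex" index="1" \
        op monitor interval="10s" timeout="30s"
    # Start the lock holder before the filesystem and database it guards:
    group g-depot p-sfex p-fs-data p-pgsql-depot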

[Pacemaker] Mostly STONITH Questions / Seeking Best Practice

2012-09-03 Thread David Morton
Afternoon all, We have a 2 node failover cluster using IBM IMM for STONITH via the external/ipmi plugin. We have recently moved from OCFS2 to ext3 for our database filesystems due to a bug we discovered; there is only one disk we need to have available to both nodes (shared scripts, logs, etc.) whic
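A two-node external/ipmi setup of that era also leans on a handful of cluster properties; an illustrative sketch (values are the commonly cited ones, not taken from the thread):

    property stonith-enabled="true" \
             stonith-action="reboot" \
             no-quorum-policy="ignore"
    # A little stickiness avoids resources ping-ponging over trivial score changes:
    rsc_defaults resource-stickiness="100"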

Re: [Pacemaker] Fwd: All resources bounce on failback

2011-03-21 Thread David Morton
, Pavel Levshin wrote:
>
> 21.03.2011 1:39, David Morton:
>
>> order DB_SHARE_FIRST_DEPOT inf: CL_OCFS2_SHARED DEPOT
>> order DB_SHARE_FIRST_ESP_AUDIT inf: CL_OCFS2_SHARED ESP_AUDIT
>>
> Hmm, does not this cause the observed behaviour? Infinite score
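For readers skimming the archive: the pattern under discussion is a mandatory (inf:) order against a clone, which can restart the dependent group when clone instances elsewhere change state; the mitigations usually raised in such threads are interleaving the clone and/or dropping the order to an advisory score. A sketch reusing the IDs from the quoted config (the clone child name is hypothetical):

    clone CL_OCFS2_SHARED p-fs-shared meta interleave="true"
    # Advisory (score 0) ordering instead of inf: so the groups are not forced
    # to restart along with the clone:
    order DB_SHARE_FIRST_DEPOT 0: CL_OCFS2_SHARED DEPOT
    order DB_SHARE_FIRST_ESP_AUDIT 0: CL_OCFS2_SHARED ESP_AUDIT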

[Pacemaker] Fwd: All resources bounce on failback

2011-03-20 Thread David Morton
Nobody out there with any experience with the issue below?! I'm sure it's something simple but can't see the wood for the trees!! ---------- Forwarded message ---------- From: David Morton Date: Tue, Mar 15, 2011 at 3:01 PM Subject: All resources bounce on failback To: The Pa

[Pacemaker] All resources bounce on failback

2011-03-14 Thread David Morton
The config below is behaving well and doing what I want it to do; however, there is one situation where it misbehaves after a failover (using standby for testing purposes): when resources fail back to their preferred node, all of the resources in the main group (DEPOT or ESP_AUDIT) bounce on
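Not part of the thread, but the knobs that govern failback are the location preference score versus resource stickiness; one way people sidestep the bounce entirely is to let stickiness outweigh the preference so failback becomes a deliberate manual step. Sketch with hypothetical node name and scores:

    location l-depot-prefers-node1 DEPOT 100: node1
    # Stickiness above the preference score means resources stay put after a
    # failover instead of migrating back automatically:
    rsc_defaults resource-stickiness="200"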

[Pacemaker] Starting openais one legged, STONITH activates

2011-02-27 Thread David Morton
I'm pretty sure the behavior outlined below is by design (and it does make sense logically) but I am wondering if there are additional checks that can be put in place to change the behavior. Situation: - Two node cluster with IPMI STONITH configured - Both servers running but with openais / pacema
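The property governing this is startup-fencing (fence peers whose state is unknown when the cluster starts); it defaults to true, and turning it off trades safety for convenience. Illustrative only:

    crm configure property startup-fencing=true     # the safe default
    # crm configure property startup-fencing=false  # only if the risk of an
    #                                               # unfenced, unknown peer is acceptable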

[Pacemaker] Resource clone as part of a group - errors

2011-02-16 Thread David Morton
Afternoon all ... I'm attempting to use a cloned resource (OCFS2 filesystem) in two groups to give the correct start order and to ensure it's always running before more critical functions happen; each group has a preferred server to run on but may and should fail over to the other. When this happens
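A clone can't be a member of a group; the usual answer to this question is to keep the OCFS2 filesystem as a standalone interleaved clone and tie each group to it with order and colocation constraints. Hedged sketch, with group and clone names reused from the related failback threads and everything else hypothetical:

    primitive p-fs-shared ocf:heartbeat:Filesystem \
        params device="/dev/vg_shared/lv_shared" directory="/shared" fstype="ocfs2" \
        op monitor interval="60s" timeout="40s"
    clone CL_OCFS2_SHARED p-fs-shared meta interleave="true"
    # Each group starts after, and runs alongside, its local clone instance:
    order o-fs-before-depot inf: CL_OCFS2_SHARED DEPOT
    colocation c-depot-with-fs inf: DEPOT CL_OCFS2_SHARED
    order o-fs-before-esp inf: CL_OCFS2_SHARED ESP_AUDIT
    colocation c-esp-with-fs inf: ESP_AUDIT CL_OCFS2_SHARED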

Re: [Pacemaker] Config Sanity Check - Postgres, OCFS2 etc

2011-02-14 Thread David Morton
tes for each of
> your instances:
>
>     primitive PGSQL_DEPOT ocf:heartbeat:pgsql \
>         pgdata='/var/lib/pgsql/depot' \
>         pghost='depot_vip' \
>         pgdb='depot'
>
> and so on.
>
> Check out How-To page for pgsql on clusterlabs web
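As quoted, the fragment drops the params keyword that the crm shell expects before agent parameters; a fuller hedged sketch of one instance (ops and port are illustrative, parameter names per ocf:heartbeat:pgsql):

    primitive PGSQL_DEPOT ocf:heartbeat:pgsql \
        params pgdata="/var/lib/pgsql/depot" \
               pghost="depot_vip" \
               pgport="5432" \
               pgdb="depot" \
        op start timeout="120s" \
        op stop timeout="120s" \
        op monitor interval="30s" timeout="30s"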

[Pacemaker] Config Sanity Check - Postgres, OCFS2 etc

2011-02-10 Thread David Morton
Afternoon all, We're cutting over from OpenSUSE and straight heartbeat based on ext3 (two-node active/passive) to SLES, Pacemaker / Corosync, and OCFS2 in a split-role active/passive configuration (three databases, two on one server and one on the other, which can fail over to each other). As this
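The split-role layout described here generally reduces to one failover group per database role, each headed by its service IP and given a preferred node. A sketch (address, netmask, and node names hypothetical; PGSQL_DEPOT as in the reply above):

    primitive IP_DEPOT ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.50" cidr_netmask="24" \
        op monitor interval="10s" timeout="20s"
    group DEPOT IP_DEPOT PGSQL_DEPOT
    location l-depot-on-node1 DEPOT 100: node1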