On Mon, 2014-10-13 at 12:51 +1100, Andrew Beekhof wrote:
>
> Even the same address can be a problem. That brief window where things were
> getting renewed can screw up corosync.
But as I proved, there was no renewal at all during the period of this
entire pacemaker run, so the use of DHCP here i
On Wed, 2014-10-08 at 12:39 +1100, Andrew Beekhof wrote:
> On 8 Oct 2014, at 2:09 am, Brian J. Murrell (brian)
> wrote:
>
> > Given a 2 node pacemaker-1.1.10-14.el6_5.3 cluster with nodes "node5"
> > and "node6" I saw an "unknown" third nod
Given a 2 node pacemaker-1.1.10-14.el6_5.3 cluster with nodes "node5"
and "node6" I saw an "unknown" third node being added to the cluster,
but only on node5:
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] notice: pcmk_peer_update:
Transitional membership event on ring 12: memb=2, new=0, lost=
Hi,
As was previously discussed, there is a bug in the handling of a STONITH
if a node reboots too quickly. I had a different kind of failure that I
suspect is the same kind of problem, just with a different symptom.
The situation is a two node cluster with two resources plus a fencing
resource. Each n
On Thu, 2014-04-10 at 10:04 +1000, Andrew Beekhof wrote:
>
> Brian: the detective work above is highly appreciated
NP. I feel like I am getting better at reading these logs and can
provide some more detailed dissection of them. And am happy to do so to
help get to the bottom of things. :-)
>
On Tue, 2014-04-08 at 17:29 -0400, Digimer wrote:
> Looks like your fencing (stonith) failed.
Where? If I'm reading the logs correctly, it looks like stonith worked.
Here's the stonith:
Apr 8 09:53:21 lotus-4vm6 stonith-ng[2492]: notice: log_operation: Operation
'reboot' [3306] (call 2 from
more detail is needed, then I'll be happy to provide it.
[1] http://pastebin.com/raw.php?i=3ThD1uM7
[2] http://pastebin.com/raw.php?i=5F9142SF
[3] http://pastebin.com/raw.php?i=LA4E0vUS
[4] http://pastebin.com/raw.php?i=6BpB5L4u
--
-Andrew J. Caines- Unix Systems Engineer a.j.cai...@halpla
On Thu, 2014-02-06 at 10:42 -0500, Brian J. Murrell (brian) wrote:
> On Wed, 2014-01-08 at 13:30 +1100, Andrew Beekhof wrote:
> > What version of pacemaker?
>
> Most recently I have been seeing this in 1.1.10 as shipped by RHEL6.5.
Doh! Somebody did a test run that had not been
On Wed, 2014-01-08 at 13:30 +1100, Andrew Beekhof wrote:
> What version of pacemaker?
Most recently I have been seeing this in 1.1.10 as shipped by RHEL6.5.
> On 10 Dec 2013, at 4:40 am, Brian J. Murrell
> wrote:
>
I didn't seem to get a response to any of the below questio
On Thu, 2014-01-16 at 14:49 +1100, Andrew Beekhof wrote:
>
> What crm_mon are you looking at?
> I see stuff like:
>
> virt-fencing (stonith:fence_xvm): Started rhos4-node3
> Resource Group: mysql-group
>     mysql-vip (ocf::heartbeat:IPaddr2): Started rhos4-node3
>     mysql
On Thu, 2014-01-16 at 08:35 +1100, Andrew Beekhof wrote:
>
> I know, I was giving you another example of when the cib is not completely
> up-to-date with reality.
Yeah, I understood that. I was just countering with why that example is
actually more acceptable.
> It may very well be partially s
On Wed, 2014-01-15 at 17:11 +1100, Andrew Beekhof wrote:
>
> Consider any long running action, such as starting a database.
> We do not update the CIB until after actions have completed, so there can and
> will be times when the status section is out of date to one degree or another.
But that is
On Tue, 2014-01-14 at 16:01 +1100, Andrew Beekhof wrote:
>
> > On Tue, 2014-01-14 at 08:09 +1100, Andrew Beekhof wrote:
> >>
> >> The local cib hasn't caught up yet by the looks of it.
I should have asked in my previous message: is this entirely an artifact
of having just restarted or are there
On Tue, 2014-01-14 at 08:09 +1100, Andrew Beekhof wrote:
>
> The local cib hasn't caught up yet by the looks of it.
Should crm_resource actually be [mis-]reporting as if it were
knowledgeable when it's not though? IOW is this expected behaviour or
should it be considered a bug? Should I open a
Hi,
I found a situation using pacemaker 1.1.10 on RHEL6.5 where the output
of "crm_resource -L" is not trust-able, shortly after a node is booted.
Here is the output from crm_resource -L on one of the nodes in a two
node cluster (the one that was not rebooted):
st-fencing (stonith:fence_foo
On Tue, 2013-12-17 at 16:33 +0100, Florian Crouzat wrote:
>
> Is it possible that lotus-5vm8 (from DNS) and lotus-5vm8-ring1 (from
> /etc/hosts) resolve to the same IP (10.128.0.206), which could maybe
> confuse cman and make it decide that there is only one ring?
No, they do resolve to two d
So, I was reading:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s2-rrp-ccs-CA.html
about adding a second ring to one's CMAN configuration. I tried to add
a second ring to my configuration without success.
Given the example:
# ccs -h
So, trying to create a cluster on a given node with ccs:
# ccs -p xxx -h $(hostname) --createcluster foo2
Validation Failure, unable to modify configuration file (use -i to ignore this
error).
But there shouldn't be any configuration here yet. I've not done
anything with this node:
# ccs -p xx
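As the error message itself suggests, passing -i makes ccs ignore the (not yet valid) local configuration and create the cluster anyway; a minimal sketch, reusing the command above:
# Retry the create, telling ccs to ignore the validation error it reported:
ccs -p xxx -h $(hostname) -i --createcluster foo2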
On Tue, 2013-12-10 at 10:27 +, Christine Caulfield wrote:
>
> Sadly you're not wrong.
That's what I was afraid of.
> But it's actually no worse than updating
> corosync.conf manually,
I think it is...
> in fact it's pretty much the same thing,
Not really. Updating corosync.conf on any
So, I'm trying to wrap my head around this need to migrate to pacemaker
+CMAN. I've been looking at
http://clusterlabs.org/quickstart-redhat.html and
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/
It seems "ccs" is the tool to configure
On Mon, 2013-12-09 at 09:28 +0100, Jan Friesse wrote:
>
> Error 6 means "try again". This happens either when corosync is
> overloaded or when it is creating a new membership. Please take a look at
> /var/log/cluster/corosync.log to see if there is something strange there (+ make
> sure you have the newest corosyn
I seem to have another instance where pacemaker fails to exit at the end
of a shutdown. Here's the log from the start of the "service pacemaker
stop":
Dec 3 13:00:39 wtm-60vm8 crmd[14076]: notice: do_state_transition: State
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCES
[ Hopefully this doesn't cause a duplicate post but my first attempt
returned an error. ]
Using pacemaker 1.1.10 (but I think this issue is more general than that
release), I want to enforce a policy that once a node fails, no
resources can be started/run on it until the user permits it.
I have b
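One common way to implement that kind of gating (offered here only as a sketch, not necessarily what the thread settled on; the attribute name "admin_ok" and the node/resource names are illustrative) is to tie every resource to a node attribute that only an administrator sets:
# crmsh: forbid the resource anywhere the admin has not explicitly set admin_ok=true
crm configure location my_rsc-needs-approval my_rsc \
    rule -inf: not_defined admin_ok or admin_ok ne true
# After a failure has been investigated, the admin re-opens the node by hand:
crm_attribute --node node1 --name admin_ok --update true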
On Tue, 2013-12-03 at 18:26 -0500, David Vossel wrote:
>
> We did away with all of the policy engine logic involved with trying to move
> fencing devices off of the target node before executing the fencing action.
> Behind the scenes all fencing devices are now essentially clones. If the
> t
So, I'm migrating my working pacemaker configuration from 1.1.7 to
1.1.10 and am finding what appears to be a new behavior in 1.1.10.
If a given node is running a fencing resource and that node goes AWOL,
it needs to be fenced (of course). But any other node trying to take
over the fencing resour
On 13-07-08 03:48 AM, Andreas Mock wrote:
> Hi all,
> I'm just wondering what the best way is to
> let an admin know that the cluster (rest of
> a cluster) has stonithed some other nodes?
You could modify or even just wrap the stonith agent. They are usually
just Python or shell scripts anyway (well,
On 13-05-22 07:05 PM, Andrew Beekhof wrote:
>
> Also, 1.1.8-7 was not tested with the plugin _at_all_ (and neither will
> future RHEL builds).
Was 1.1.7-* in EL 6.3 tested with the plugin? Is staying with the most
recent EL 6.3 pacemaker-1.1.7 release really the more stable option for
people not a
Using pacemaker 1.1.8-7 on EL6, I got the following series of events
trying to shut down pacemaker and then corosync. The corosync shutdown
(service corosync stop) ended up spinning/hanging indefinitely (~7hrs
now). The events, including a:
May 21 23:47:18 node1 crmd[17598]:error: do_exit: C
Using Pacemaker 1.1.8 on EL6.4 with the pacemaker plugin, I'm finding
strange behavior with "stonith_admin -B node2". It seems to shut the
node down but not start it back up, and ends up reporting a timer
expired:
# stonith_admin -B node2
Command failed: Timer expired
The pacemaker log for the op
On 13-05-09 09:53 PM, Andrew Beekhof wrote:
>
> May 7 02:36:16 node1 crmd[16836]: info: delete_resource: Removing
> resource testfs-resource1 for 18002_crm_resource (internal) on node1
> May 7 02:36:16 node1 lrmd: [16833]: info: flush_op: process for operation
> monitor[8] on ocf::Target::
Using Pacemaker 1.1.7 on EL6.3, I'm getting an intermittent recurrence
of a situation where I add a resource and start it, and it seems to
start but then fails right away, i.e.
# clean up resource before trying to start, just to make sure we start with a
clean slate
# crm resource cleanup testfs-r
Using 1.1.8 on EL6.4, I am seeing this sort of thing:
pengine[1590]: warning: unpack_rsc_op: Processing failed op monitor for
my_resource on node1: unknown error (1)
The full log from the point of adding the resource until the errors:
Apr 30 11:46:30 node1 cibadmin[3380]: notice: crm_log_arg
On 13-04-30 11:13 AM, Lars Marowsky-Bree wrote:
>
> Pacemaker 1.1.8's stonith/fencing subsystem directly ties into the CIB,
> and will complete the fencing request even if the fencing/stonith
> resource is not instantiated on the node yet.
But clearly that's not happening here.
> (There's a bug
I'm using pacemaker 1.1.8 and I don't see stonith resources moving away
from AWOL hosts like I thought I did with 1.1.7. So I guess the first
thing to do is clear up what is supposed to happen.
If I have a single stonith resource for a cluster and it's running on
node A and then node A goes AWOL,
On 13-04-24 01:16 AM, Andrew Beekhof wrote:
>
> Almost certainly you are hitting:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=951340
Yup. The patch posted there fixed it.
> I am doing my best to convince people that make decisions that this is worthy
> of an update before 6.5.
I've a
Using pacemaker 1.1.8 on RHEL 6.4, I did a test where I just killed
(-KILL) corosync on a peer node. Pacemaker seemed to take a long time
to transition to stonithing it though after noticing it was AWOL:
Apr 23 19:05:20 node2 corosync[1324]: [TOTEM ] A processor failed, forming
new configurati
Given:
host1# crm node attribute host1 show foo
scope=nodes name=foo value=bar
Why doesn't this return anything:
host1# crm_attribute --node host1 --name foo --query
host1# echo $?
0
cibadmin -Q confirms the presence of the attribute:
This is on pac
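For anyone else hitting this, it can help to be explicit about where the attribute is expected to live when querying; a minimal sketch, assuming the attribute really is a permanent node attribute as the cibadmin output suggests:
# Query the permanent (forever) node attribute explicitly:
crm_attribute --node host1 --name foo --query --lifetime forever
# And confirm what is actually stored in the nodes section of the CIB:
cibadmin -Q -o nodes | grep 'name="foo"'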
On 13-04-10 07:02 PM, Andrew Beekhof wrote:
>
> On 11/04/2013, at 6:33 AM, Brian J. Murrell
> wrote:
>>
>> Does crm_resource suffer from this problem
>
> no
Excellent.
I was unable to find any comprehensive documentation on just how to
implement a pacemake
On 13-04-11 06:00 PM, Andrew Beekhof wrote:
>
> Actually, I think the semantics of -C are first-write-wins and any subsequent
> attempts fail with -EEXSIST
Indeed, you are correct. I think my point though was that it didn't
matter in my case which writer wins since they should all be trying to
On 13-04-11 07:37 AM, Brian J. Murrell wrote:
>
> In exploring all options, how about pcs? Does pcs' "resource create
> ..." for example have the same read+modify+replace problem as crm
> configure or does pcs resource create also only send proper fragments to
> u
On 13-04-10 04:33 PM, Brian J. Murrell wrote:
>
> Does crm_resource suffer from this problem or does it properly only send
> exactly the update to the CIB for the operation it's trying to achieve?
In exploring all options, how about pcs? Does pcs' "resource create
..."
On 13-02-21 07:48 PM, Andrew Beekhof wrote:
> On Fri, Feb 22, 2013 at 5:18 AM, Brian J. Murrell
> wrote:
>> I wonder what happens in the case of two racing "crm" commands that want
>> to update the CIB (with non-overlapping/conflicting data). Is there any
>&
so I don't know
where the reference to the old name can be saved except in the cluster.
Regarding the version, here are the details:
- Corosync 1.2.7-1.1.el5
- Pacemaker 1.1.5-1.1.el5
2013/4/1 David Vossel
> - Original Message -
> > From: "Nicolas J."
> > To: p
s.com
INFO: node VMTESTORADG2.it.dbi-services.com not found by crm_node
INFO: node VMTESTORADG2.it.dbi-services.com deleted
Thanks in advance
Best Regards,
Nicolas J.
On 13-03-25 03:50 PM, Jacek Konieczny wrote:
>
> The first node to notice that the other is unreachable will fence (kill)
> the other, making sure it is the only one operating on the shared data.
Right. But with typical two-node clusters ignoring no-quorum, because
quorum is being ignored, as so
On 13-02-25 10:30 AM, Dejan Muhamedagic wrote:
>
> Before doing replace, crmsh queries the CIB and checks if the
> epoch was modified in the meantime.
But it doesn't take out a lock of any sort to prevent an update in the
meantime, right?
> Those operations are not
> atomic, though.
Indeed.
> Pe
On 13-02-24 07:56 PM, Andrew Beekhof wrote:
>
> Basically yes.
> Stonith is the first stage of recovery and supposed to be at least
> vaguely reliable.
> Have you figured out why fencing is so broken?
It wasn't really "broken" but was in the process of being configured
when this situation arose.
I seem to have found a situation where pacemaker (pacemaker-1.1.7-6.el6.x86_64)
refuses to stop (i.e. service pacemaker stop) on EL6.
The status of the 2 node cluster was that the node being asked to stop
(node2) was continually trying to stonith another node (node1) in the
cluster which was not r
I wonder what happens in the case of two racing "crm" commands that want
to update the CIB (with non-overlapping/conflicting data). Is there any
locking to ensure that one crm cannot overwrite the other's change?
(i.e. second one to get there has to read in the new CIB before being
able to apply h
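For what it's worth, the answer that comes up later in the archive is to push only fragments with cibadmin rather than letting a client do a whole-CIB read-modify-replace; a minimal sketch, with /tmp/rsc.xml standing in for whatever fragment is being added:
# Create just this fragment; two clients creating *different* objects this way
# do not clobber each other the way two whole-CIB replaces can.
cibadmin -o resources -C -x /tmp/rsc.xml
# -C is first-write-wins: a second attempt to create the same object fails
# with EEXIST instead of silently overwriting it.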
Is there a way to return an individual property (or all properties)
and/or a rsc_default (or all) back to default values, using crm, or
otherwise?
Cheers,
b.
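In case it helps anyone searching the archives: deleting the property (or rsc_default) puts the built-in default back in effect; a sketch, with the option names purely as examples:
# Remove a cluster property so the built-in default applies again:
crm_attribute --type crm_config --name no-quorum-policy --delete
# Likewise for a resource default:
crm_attribute --type rsc_defaults --name resource-stickiness --delete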
I'm experimenting with asymmetric clusters and resource location
constraints.
My cluster has some resources which have to be restricted to certain
nodes and other resources which can run on any node. Given that, an
"opt-in" cluster seems the most manageable. That is, it seems easier to
create co
On 13-01-23 03:32 AM, Dan Frincu wrote:
> Hi,
Hi,
> I usually put the node in standby, which means it can no longer run
> any resources on it. Both Pacemaker and Corosync continue to run, node
> provides quorum.
But a node in standby will still be STONITHed if it goes AWOL. I put a
node in stan
OK. So you have a corosync cluster of nodes with pacemaker managing
resources on them, including (of course) STONITH.
What's the best/proper way to shut down a node, say for maintenance,
such that pacemaker doesn't go trying to "fix" that situation by
STONITHing it to try to bring it back up, et
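The standby approach mentioned in the follow-ups is the usual answer; roughly, in crmsh (node name illustrative):
# Drain resources off the node while it stays in the cluster (and keeps providing quorum):
crm node standby node1
# Pacemaker and the messaging layer can then be stopped without triggering recovery/fencing:
service pacemaker stop
service corosync stop    # or 'service cman stop' on a CMAN-based stack
# When maintenance is done, bring it back:
crm node online node1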
On Wed, Oct 24, 2012 at 5:59 PM, Andrew Beekhof wrote:
> On Wed, Oct 17, 2012 at 8:30 AM, Lonni J Friedman wrote:
>> Greetings,
>> I'm trying to get an NFS server export to be correctly monitored &
>> managed by pacemaker, along with pre-existing IP, drbd and f
ted a new corosync.conf, and now the nodes are
talking again.
Sorry for the noise.
On Thu, Oct 18, 2012 at 10:25 AM, Lonni J Friedman wrote:
> Both nodes can ssh to each other, selinux is disabled, and there are
> currently no iptables rules in force. So I'm not sure why the
y one would be elected quite quickly, you may have a
> network/filewall issue.
>
> On Thu, Oct 18, 2012 at 10:37 AM, Lonni J Friedman wrote:
>> I'm running Fedora17, with pacemaker-1.1.8. I just tried to make a
>> configuration change with crmsh, and it failed as follows:
I'm running Fedora17, with pacemaker-1.1.8. I just tried to make a
configuration change with crmsh, and it failed as follows:
##
# crm configure edit
Call cib_replace failed (-62): Timer expired
ERROR: could not replace cib
INFO: offending xml:
Greetings,
I'm trying to get an NFS server export to be correctly monitored &
managed by pacemaker, along with pre-existing IP, drbd and filesystem
mounts (which are working correctly). While NFS is up on the primary
node (along with the other services), the monitoring portion keeps
showing up as
On Mon, Oct 15, 2012 at 8:51 PM, Andrew Beekhof wrote:
> On Tue, Oct 16, 2012 at 2:50 PM, Andrew Beekhof wrote:
>> On Tue, Oct 16, 2012 at 9:24 AM, Lonni J Friedman wrote:
>>> On Thu, Sep 27, 2012 at 6:24 AM, David Vossel wrote:
>>>> - Original Message -
On Thu, Sep 27, 2012 at 6:24 AM, David Vossel wrote:
> - Original Message -
>> From: "Lonni J Friedman"
>> To: pacemaker@oss.clusterlabs.org
>> Sent: Wednesday, September 26, 2012 9:44:21 PM
>> Subject: [Pacemaker] setting up NFS resources on s
On Mon, Oct 1, 2012 at 2:14 PM, Jake Smith wrote:
> - Original Message -
>> From: "Lonni J Friedman"
>> To: "The Pacemaker cluster resource manager"
>> Sent: Monday, October 1, 2012 4:31:05 PM
>> Subject: Re: [Pacemaker] failed over
"1.1.7-2.fc16-ee0730e13d124c3d58f00016c3376a1de5323cff" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
##
On Thu, Sep 27, 2012 at 3:10 PM, Lonni
Anyone know where the documentation is for 1.1.8? I'm looking here,
and everything seems to be months old:
http://www.clusterlabs.org/doc/
I keep seeing references to "the shell is gone from 1.1.8", but I
can't find any documentation of the impact to a sysadmin, or what the
new hotness is to rep
On Sun, Sep 30, 2012 at 7:19 AM, Andrew Beekhof wrote:
> On Fri, Sep 28, 2012 at 6:13 AM, Lonni J Friedman wrote:
>> On Thu, Sep 27, 2012 at 6:24 AM, David Vossel wrote:
>>> - Original Message -
>>>> From: "Lonni J Friedman"
>>>> To
Greetings,
I've just started playing with pacemaker/corosync on a two node setup.
At this point I'm just experimenting, and trying to get a good feel
of how things work. Eventually I'd like to start using this in a
production environment. I'm running Fedora16-x86_64 with
pacemaker-1.1.7 & corosy
On Thu, Sep 27, 2012 at 6:24 AM, David Vossel wrote:
> - Original Message -
>> From: "Lonni J Friedman"
>> To: pacemaker@oss.clusterlabs.org
>> Sent: Wednesday, September 26, 2012 9:44:21 PM
>> Subject: [Pacemaker] setting up NFS resources on s
I'm trying to set up NFS resources on Fedora16, and it's not working.
After googling, I stumbled across the following discussion from about
8 months ago:
http://www.gossamer-threads.com/lists/linuxha/pacemaker/77404
Has anything changed since then, or is systemd still not supported?
thanks
On 12-07-04 04:27 AM, Andreas Kurz wrote:
>
> beside increasing the batch limit to a higher value ... did you also
> tune corosync totem timings?
Not yet.
But a closer look at the logs reveals a bunch of these:
Jun 28 14:56:56 node-2 corosync[30497]: [pcmk ] ERROR: send_cluster_msg_raw:
Chi
On 12-07-04 02:12 AM, Andrew Beekhof wrote:
> On Wed, Jul 4, 2012 at 10:06 AM, Brian J. Murrell
> wrote:
>>
>> Just because I reduced the number of nodes doesn't mean that I reduced
>> the parallelism any.
>
> Yes. You did. You reduced the number of "che
On 12-07-03 04:26 PM, David Vossel wrote:
>
> This is not a definite. Perhaps you are experiencing this given the
> pacemaker version you are running
Yes, that is absolutely possible and it certainly has been under
consideration throughout this process. I did also recognize however,
that I am
On 12-07-03 06:17 PM, Andrew Beekhof wrote:
>
> Even adding passive nodes multiplies the number of probe operations
> that need to be performed and loaded into the cib.
So it seems. I just would not have thought they would be such a load since,
from a simplistic perspective, they are not trying t
On 12-06-27 11:30 PM, Andrew Beekhof wrote:
>
> The updates from you aren't the problem. Its the number of resource
> operations (that need to be stored in the CIB) that result from your
> changes that might be causing the problem.
Just to follow this up for anyone currently following or anyone
On 12-06-26 09:54 PM, Andrew Beekhof wrote:
>
> The DC, possibly you didn't have one at that moment in time.
It was the DC in fact. I restarted corosync on that node and the
timeouts went away. But note I "re"started, not started. It was
running at the time, just not properly, apparently.
> W
So, I have an 18 node cluster here (so a small haystack, indeed, but
still a haystack in which to try to find a needle) where a certain
set of (yet unknown, figuring that out is part of this process)
operations are pooching pacemaker. The symptom is that on one or
more nodes I get the following ki
On 12-03-30 02:35 PM, Florian Haas wrote:
>
> crm configure rsc_defaults resource-stickiness=0
>
> ... and then when resources have moved back, set it to 1000 again.
> It's really that simple. :)
That sounds racy. I am changing a parameter which has the potential to
affect the stickiness of all
In my cluster configuration, each resource can be run on one of two nodes,
and I designate a "primary" and a "secondary" using location constraints
such as:
location FOO-primary FOO 20: bar1
location FOO-secondary FOO 10: bar2
And I also set a default stickiness to prevent auto-fail-back (i.e. to
p
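For concreteness, the combination being described looks roughly like this in crmsh (scores as above; the stickiness value of 1000 comes from the earlier reply and is purely illustrative):
# Prefer bar1, fall back to bar2:
location FOO-primary FOO 20: bar1
location FOO-secondary FOO 10: bar2
# A default stickiness larger than the 10-point difference between the two
# location scores keeps FOO where it lands after a failover (no auto-fail-back):
rsc_defaults resource-stickiness=1000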
On 12-03-28 10:39 AM, Florian Haas wrote:
>
> Probably because your resource agent reports OCF_SUCCESS on a probe
> operation
To be clear, is this the "status" $OP in the agent?
Cheers,
b.
We seem to have occasions where crm_resource reports that a
resource is running on more nodes (usually all of them!) than it actually
is when we query right after adding it:
# crm_resource --resource chalkfs-OST_3 --locate
resource chalkfs-OST_3 is running on: chalk02
resource chalkfs-OST_3 is running on
On 11-10-26 10:19 AM, Brian J. Murrell wrote:
>
> # cat /tmp/foo.xml
>
>
^^^
I figured it out. This "integer" has to be quoted. I'm thinking too
much like a programmer. :-/
Cheers,
b.
I want to be able to run a resource on any node in an asymmetric
cluster so I tried creating a rule to run it on any node not named
"foo" since there are no nodes named foo in my cluster:
# cat /tmp/foo.xml
for the resource bar:
primitive bar stonith:fence_virsh \
params ipa
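(The XML from /tmp/foo.xml did not survive the archive; a location rule of the kind being described generally looks like the following, with the ids and score purely illustrative, and the score quoted as noted in the follow-up:)
<rsc_location id="bar-anywhere-but-foo" rsc="bar">
  <rule id="bar-anywhere-but-foo-rule" score="100">
    <expression id="bar-anywhere-but-foo-expr"
                attribute="#uname" operation="ne" value="foo"/>
  </rule>
</rsc_location>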
I want to create a stonith primitive and clone it for each node in my
cluster. I'm using the fence-agents virsh agent as my stonith
primitive. Currently for a single node it looks like:
primitive st-pm-node1 stonith:fence_virsh \
params ipaddr="192.168.122.1" login="xxx" passwd="xxx" por
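A common shape for this (one primitive per node, kept off the node it is meant to fence) is sketched below in crmsh; the port and node names are illustrative and continue the truncated example above:
primitive st-pm-node1 stonith:fence_virsh \
    params ipaddr="192.168.122.1" login="xxx" passwd="xxx" port="pm-node1" \
    op monitor interval="60s"
# Keep each fencing device off the node it is supposed to kill:
location st-pm-node1-placement st-pm-node1 -inf: pm-node1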
On 11-10-18 09:40 AM, Andreas Kurz wrote:
> Hello,
Hi,
> I'd expect this to be the problem ... if you insist on using an
> unsymmetric cluster you must add a location score for each resource you
> want to be up on a node ... so add a location constraint for the fencing
> clone for each node ... o
I have a pacemaker 1.0.10 installation on RHEL5 but I can't seem to
get a working stonith configuration. I have tested my stonith
device manually using the stonith command and it works fine. What
doesn't seem to be happening is pacemaker/stonithd actually asking for a
stonith. In my lo
So, in another thread there was a discussion of using cibadmin to
mitigate possible concurrency issues with the crm shell. I have written a test
program to test that theory, and unfortunately cibadmin also falls down in
the face of heavy concurrency, with errors such as:
Signon to CIB failed: connection f
On 11-09-28 10:20 AM, Dejan Muhamedagic wrote:
> Hi,
Hi,
> I'm really not sure. Need to investigate this area more.
Well, I am experimenting with cibadmin. It's certainly not as nice and
shiny as crm shell though. :-)
> cibadmin talks to the cib (the process) and cib should allow
> only one w
On 11-09-16 11:14 AM, Dejan Muhamedagic wrote:
> On Thu, Sep 08, 2011 at 03:41:42PM +0100, John Spray wrote:
>
>> * Is there another way of adding resources which would be safe when
>> run concurrently?
>
> cibadmin.
But doesn't crm use cibadmin itself and if so, shouldn't whatever
benefits of
On 11-09-25 09:21 PM, Andrew Beekhof wrote:
>
> As the error says, the resource R_10.10.10.101 doesn't exist yet.
> Put it in a tag or use -C instead of -U
Thanks much. I already replied to Tim, but the summary is that the
manpage is incorrect in two places. One is specifying the attributes
ta
On 11-09-26 03:44 AM, Tim Serong wrote:
>
> Because:
>
> 1) You need to run "cibadmin -o resources -C -x test.xml" to create the
>resource (-C creates, -U updates an existing resource).
That's what I thought/wondered but the EXAMPLES section in the manpage
is quite clear that it's asking one
Using pacemaker-1.0.10-1.4.el5 I am trying to add the "R_10.10.10.101"
IPaddr2 example resource
from the cibadmin manpage (under EXAMPLES) and am getting:
# cibadmin -o resources -U -x test.xml
Call cib_modify failed (-22): The object/attribute does not exist
Any ideas why?
Th
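(As the replies above spell out, -U updates an existing object, so the very first add has to be a create:)
# First creation of the resource must use -C:
cibadmin -o resources -C -x test.xml
# Subsequent modifications of the now-existing resource can then use -U:
cibadmin -o resources -U -x test.xml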
On 11-09-19 11:02 PM, Andrew Beekhof wrote:
> On Wed, Aug 24, 2011 at 6:56 AM, Brian J. Murrell
> wrote:
>>
>> 2. preventing the active node from being STONITHed when the resource
>> is moved back to its failed-and-restored node after a failover.
>> IO
I have a need to create single node clusters with pacemaker.
Crazy you might say. It does seem crazy at first but there are two
drivers for this:
The first is testing. I want to write a single code path for
controlling the starting and stopping of resources in larger, real,
multi-node clusters
I've seen both approaches: setting a default-resource-stickiness property (i.e.
http://www.howtoforge.com/installation-and-setup-guide-for-drbd-openais-pacemaker-xen-on-opensuse-11.1)
and setting a rsc_defaults option with resource-stickiness
(http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Sc
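For the record, the two spellings in question look like this (values illustrative); the rsc_defaults form is the one used by the Clusters from Scratch document linked above:
# Older, cluster-property form:
crm configure property default-resource-stickiness=1000
# rsc_defaults form:
crm configure rsc_defaults resource-stickiness=1000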
Hi All,
I am trying to configure pacemaker (1.0.10) to make a single filesystem
highly available by two nodes (please don't be distracted by the dangers
of multiply mounted filesystems and clustering filesystems, etc., as I
am absolutely clear about that -- consider that I am using a filesystem
re
On 06/14/2010 11:01 PM, Vadym Chepkov wrote:
On Mon, Jun 14, 2010 at 4:37 PM, Erich Weiler wrote:
Hi All,
We have this interesting problem I was hoping someone could shed some light
on. Basically, we have 2 servers acting as a pacemaker cluster for DRBD and
VirtualDomain (KVM) resources under
On 05/19/2010 08:59 AM, Andrew Beekhof wrote:
> Which part of
>
> "web_start_0 failed with rc=6: Preventing web from re-starting
> anywhere in the cluster"
>
> Is not clear to you?
>
> Have a look what rc=6 means:
>
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explain
On 04/23/2010 06:04 PM, Dejan Muhamedagic wrote:
...
Apr 23 05:11:11 gamma lrmd: [2663]: info: rsc:apache:4: probe
Apr 23 05:11:11 gamma IPaddr2[2678]: ERROR: Setup problem: Couldn't find
utility ip
Apr 23 05:11:11 gamma crmd: [2666]: info: process_lrm_event: LRM operation
ClusterIP_monitor_0 (
On 03/04/2010 03:37 PM, Andrew Beekhof wrote:
On Thu, Mar 4, 2010 at 2:54 PM, Dennis J. wrote:
Pacemaker pulls in heartbeat and corosync as dependencies. This is what happens
on a freshly installed CentOS 5.4 VM:
Ah, so I just imagined making that change :-(
The next round of packages won't do
On 03/03/2010 08:09 PM, Andrew Beekhof wrote:
On Wed, Mar 3, 2010 at 4:00 PM, Dennis J. wrote:
On 03/03/2010 09:24 AM, Andrew Beekhof wrote:
On Wed, Mar 3, 2010 at 1:16 AM, Angie T. Muhammad
wrote:
Hello list
I have no technical questions at the moment, just a couple of
distribution
On 03/03/2010 09:24 AM, Andrew Beekhof wrote:
On Wed, Mar 3, 2010 at 1:16 AM, Angie T. Muhammad
wrote:
Hello list
I have no technical questions at the moment, just a couple of
distribution-specific and backward compatibility questions..
1- I just wonder will Pacemaker at any time in the near
I haven't been able to find any documentation outside of the man pages to help
troubleshoot this, so I've come to the experts...
I'm attempting to setup the following:
Services: NFS and Samba
Filesystems: /mnt/media | /mnt/datusr