We patched and rebooted one of our clusters this morning - I verified
that pacemaker is the same version as before, and that it matches
another similar cluster.
There is a resource in the cluster defined as:
primitive re-named-reload ocf:heartbeat:anything \
params binfile="/usr/sbin/rndc" cmdli
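For reference, a complete primitive of this type might look like the sketch below; the cmdline_options value and the monitor operation are assumptions for illustration, not the original configuration:

```
primitive re-named-reload ocf:heartbeat:anything \
        params binfile="/usr/sbin/rndc" cmdline_options="reload" \
        op monitor interval="30" timeout="20"
```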
On 6/5/13 2:30 PM, paul wrote:
Hi. I have followed the Clusters from Scratch PDF and have a working
two-node active/passive cluster with ClusterIP, WebDataClone, WebFS and
WebSite working. I am using BIND DNS to direct my websites to the
cluster address. When I perform a failover, which works OK, I
Trying to commit a change to a group that looks like this:
group gr-ns-auth-ip re-auth6-ns1-ip re-auth6-ns2-ip re-auth-ns1-ip
re-auth-ns2-ip re-ns1auth-ip re-ns2auth-ip re-ns3auth-ip \
meta ordered="false" collocated="false"
"/tmp/tmpBQyloG.pcmk" 118L, 4752C written
On 4/25/13 7:43 PM, Andrew Beekhof wrote:
I certainly hope so :)
So I should complain to our sales people about this BZ before we upgrade
our clusters to 6.4?
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailm
On 4/19/13 5:48 AM, T. wrote:
When a server becomes active, it gets the cluster IP "10.20.10.70" and
the default route to "10.20.10.1".
Why can't both your cluster nodes have 10.20.10.1 as their default route
all the time?
Your configuration seems to have way too many moving parts and since
On Apr 15, 2013, at 1:59 PM, "T." wrote:
>
>
> For the access-network I use a different NIC, the nodes are in different
> networks, NodeA has 10.20.11.70, NodeB has 10.20.12.70 and I have
> configured a cluster-ip, the active node gets, (10.20.10.70).
Are they really on different networks? Wh
On 4/9/13 7:18 PM, Andrew Beekhof wrote:
Pacemaker is not supported in 6.3 and all I am allowed to say at this point[1]
is that your configuration isn't supportable for 6.4
Not because you've configured anything wrong/badly, but because a specific
application is not present.
However, if I can
On 4/7/13 10:29 PM, Andrew Beekhof wrote:
Really really weird, I've got nothing :(
We've added SPANs on the switches for the two boxes in the cluster, so
we can hopefully at least identify that the ARP frame didn't come from
them. Of course, we've not had an occurrence of it in almost a month
On 4/8/13 7:37 AM, Vadym Chepkov wrote:
What if a clustered volume group appears only when pacemaker establishes iSCSI
connection?
just make sure you activate the VG before trying to mount anything.
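One way to guarantee that ordering inside the cluster is to put an ocf:heartbeat:LVM resource in front of the Filesystem resource; the resource names, volume group, and paths below are hypothetical:

```
primitive re-vg ocf:heartbeat:LVM \
        params volgrpname="vg_data" \
        op monitor interval="60" timeout="30"
primitive re-fs ocf:heartbeat:Filesystem \
        params device="/dev/vg_data/lv_data" directory="/mnt/data" fstype="ext4"
order o-vg-before-fs inf: re-vg re-fs
colocation c-fs-on-vg inf: re-fs re-vg
```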
On 4/8/13 6:42 AM, Yuriy Demchenko wrote:
The purpose of my cluster is to provide HA VMs and routing/gateway
(thus RHCS isn't an option for me - no IPaddr2 and Route resources).
But I cannot find any documentation on how to use cLVM in a
cman+pacemaker cluster; everything I found requires use of "o
On 3/18/13 5:24 PM, Andrew Beekhof wrote:
So:
1. the IP moved from 01 to 02
2. 01 was then rebooted
3. a long time passes
4. 01 starts arping for the IP
Is that what you're saying?
Is the problem transient or does it persist?
It went like this - the IP movements are all done by the Pacemaker/IPaddr resource
On Mon, Mar 18, 2013 at 3:17 AM, David Coulson wrote:
>> First off, I'm going to preface this with the realization that what I am
>> explaining makes no sense, doesn't follow normal logic and I'm not a
>> complete idiot. I've beaten my head against a wall with
On Mar 11, 2013, at 7:32 PM, Andrew Beekhof wrote:
>
>
> In fact prior to 6.4, Pacemaker only had Tech Preview status - using
> the CMAN plugin instead of our home grown one was key to that
> changing.
Is Pacemaker not tech preview in 6.4 anymore? What is the support status of
Pacemaker on 6.4?
What is the specific error you get from Apache? Does it not start, or does it
just not work properly?
How are you ensuring your two nodes have the same apache configuration?
David
On Mar 17, 2013, at 8:13 PM, Luis Daniel Lucio Quiroz wrote:
> strange,
>
> i have 2 hosts in a cluster
>
> clus
First off, I'm going to preface this with the realization that what I am
explaining makes no sense, doesn't follow normal logic and I'm not a
complete idiot. I've beaten my head against a wall with this issue for
two days, and have made no progress, yet we've had a couple of
production system o
On 3/3/13 1:00 PM, Lars Marowsky-Bree wrote:
My memory may be very faulty, but I thought this didn't lead to the
failure actually being cleaned up automatically, but "merely" ignored
post-timeout.
Perhaps 'clean up' is the wrong phrase. But I've absolutely seen it
remove the failure out of 'crm_mo
On 3/2/13 8:22 AM, Lars Marowsky-Bree wrote:
Unless it annoys you, this is actually harmless.
Otherwise, params first is what I tend to use.
Regards,
Lars
We've seen instances where failure-timeout is set, but Pacemaker never
seems to clean up the failure. First thought was it didn't a
Running Pacemaker 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
I noticed we have inconsistent ordering of meta/params in our
configuration - for some resources, meta comes before params, in
others after. In the case below, both. I am assuming meta before params
is the correct way t
On 12/19/12 5:06 AM, James Harper wrote:
What is the best way on bootup in the above situation to ensure time
synchronisation? Is it as simple as having a cron job to reset the hardware
clock every so often so that on reboot things are reasonable?
At least RHEL and SuSE can do an explicit ntp
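On RHEL 6, for example, a one-shot clock step at boot can be enabled through the stock init scripts; a rough sketch (the ntpdate script reads its servers from /etc/ntp/step-tickers):

```
chkconfig ntpdate on    # step the clock once at boot
chkconfig ntpd on       # keep it in sync afterwards
hwclock --systohc       # write the corrected system time back to the RTC
```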
to other server etc... Is it something wrong with my configuration or that's
> the way it's working?
>
> Thanks and regards
>
>
> On Fri, Nov 30, 2012 at 1:36 PM, David Coulson wrote:
> I would add HA to your existing HA config - The primary issue you have right
grow to 3000 in the
next two months. The servers have Tomcat installed on them, so basically I
need to load-balance connections from outside to Tomcat.
Regards
On Fri, Nov 30, 2012 at 1:19 PM, David Coulson wrote:
All the connections, from how many clients?
You might be better off using LVS for this.
David
On 11/30/12 7:15 AM, Ratko Dodevski wrote:
Hi guys, I need some help on configuring NLB for my application
servers. I've installed tomcat on 4 servers and I've decided to use
Linux HA for Network Loa
On 9/13/12 7:33 AM, ecfgijn wrote:
Hi All ,
I have configured active/active clustering on CentOS 6.2. But when I
try to mount the GFS2 file system I am getting an error, which is
mentioned below
[root@node1 ~]# mount /dev/sdb1 /mnt/
gfs_controld join connect error: Connection refused
error moun
On 8/13/12 8:01 PM, Andrew Beekhof wrote:
You might be experiencing:
+ David Vossel (5 months ago) 9263480: Low: pengine: cl#5025 -
Automatically clear failures when resource configuration changes.
But if you send us a crm_report tarball covering the period during which
you had problems, we can
I'm running RHEL6 with the tech preview of pacemaker it ships with. I've
a number of resources which have a failure-timeout="60", which most of
the time does what it is supposed to.
Last night a resource failed, which was part of a clone - While the
resource recovered, the fail-count log never
We run many RHEL6 clusters using cman/corosync/pacemaker with SELinux
enabled. Doubt that is the problem.
The original poster wasn't using cman, but I'm not sure that makes a
substantial difference.
On 7/23/12 7:15 AM, Vladislav Bogdanov wrote:
23.07.2012 08:06, David Barchas wrote:
Hello.
Use ping to set an attribute, then add a location.
primitive re-ping-core ocf:pacemaker:ping \
meta failure-timeout="60" \
params name="at-ping-core" host_list="10.250.52.1" multiplier="100" attempts="5" \
op monitor interval="20" timeout="15" \
op start interval
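The matching location constraint would key off the at-ping-core attribute; the clone wrapper and the re-whatever resource below are hypothetical:

```
clone cl-ping-core re-ping-core \
        meta globally-unique="false"
location lo-need-ping re-whatever \
        rule -inf: not_defined at-ping-core or at-ping-core lt 100
```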
I've a couple of cloned resources which need to be restarted one at a
time as part of a batch process.
If I do a 'crm -w resource restart cl-whatever', it restarts the whole
lot at once. I can do a 'service appname stop' on each box, wait for
pacemaker to notice it is down, then let it restart
On 6/14/12 8:28 PM, Andrew Beekhof wrote:
Just use a colocation set instead.
Is there a better option than a non-ordered, non-collocated group when
you need an order dependency? We have a couple of clone resources which
are dependent on a non-collocated group (the resources in the group are
di
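A constraint with a parallel resource set can express the order dependency without a group; the resource names below are placeholders. In crm shell, parentheses mark a set whose members have no ordering among themselves:

```
order o-deps-before-app inf: ( re-a re-b re-c ) cl-app
```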
Pacemaker needs to be able to monitor on all nodes. Maybe if you install
drbd on the third node but don't configure anything, the monitor will
correctly report that it is not running over there, and your location
rules will stop it from even trying.
Or just change the RA for DRBD to report not running i
If you are running two nodes, you need to tell pacemaker you don't care
if it can't get quorum by only having 1 of 2 nodes available. Neither
node taking over in this event knows whether there is a split brain or not,
so you will need to make sure you have sufficient infrastructure between
the two
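In crm shell terms, the quorum part is a one-liner (and fencing should stay enabled so a surviving node can shoot the other):

```
property no-quorum-policy="ignore"
property stonith-enabled="true"
```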
Can you post your pacemaker config on pastebin?
David
On Jun 4, 2012, at 3:51 PM, Cliff Massey wrote:
>
> I am trying to setup a cluster consisting of KVM DRBD and pacemaker. Without
> pacemaker DRBD and KVM are working. I can even stop everything on one node,
> promote the other to drbd pri
Cloning IPaddr2 resources utilizes the iptables CLUSTERIP rule. Probably a good
idea to start looking at it w/ tcpdump and seeing if either box gets the icmp
echo-request packet (from a ping) and determining if it just doesn't respond
properly, doesn't get it at all, or something else.
I'd say
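A rough diagnostic sketch; the interface name and address are placeholders:

```
# does this node even receive the echo-request for the clustered IP?
tcpdump -ni eth0 icmp and host 192.0.2.10
# confirm the CLUSTERIP rule the clone installed is present
iptables -L INPUT -n | grep -i CLUSTERIP
```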
If clvmd hangs, you probably don't have fencing configured properly - It will
block IO until a node is fenced correctly.
On May 12, 2012, at 12:16 PM, Frank Van Damme wrote:
> Hi list,
>
> I'm assembling a cluster for A/P NFS on Debian Squeeze (6.0). For
> flexibility I want to go with LVM. So the
What application is running on the nodes?
Sent from my iPad
On May 9, 2012, at 3:10 PM, Paul Damken wrote:
> Hello,
>
> I wonder if someone can enlighten me on how to handle the following cluster scenario:
>
> 2 Nodes Cluster (Active/Active)
> 1 Cluster managed VIP - RoundRobin ?
> SAN Shared Stor
Why not run two separate clusters - One for VMs, one for DRBD.
You can create a group containing the resources and have the location
constraint reference the group - You probably want to set the group to
'ordered=false' and 'collocated=false'. That said, if you split your
environment into two c
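A minimal sketch of that arrangement, with placeholder resource and node names:

```
group gr-vm-stack re-ip re-fs re-vm \
        meta ordered="false" collocated="false"
location lo-prefer-node1 gr-vm-stack 200: node1
```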
Shut down pacemaker and fix your drbd disks first. Get them both
UpToDate/UpToDate and make sure you can manually switch them to primary
on each node.
Node2 can't become primary when it's not connected to something with an
UpToDate disk.
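A rough sequence for the manual check, assuming a DRBD resource named r0:

```
# with pacemaker stopped on both nodes
drbdadm connect r0            # reconnect the peers
cat /proc/drbd                # wait until ds: shows UpToDate/UpToDate
drbdadm primary r0            # then try promotion on one node at a time
```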
On 3/24/12 3:15 PM, Andrew Martin wrote:
Hi Andreas,
M
Did dnsmasq log that it is listening on the cluster address? You could try
adding an iptables NAT rule to the box and see if that works. NAT the cluster
address for port 53 to the local server IP.
Sent from my iPad
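Such a rule might look like the following, with placeholder addresses (192.0.2.10 as the cluster IP, 192.0.2.11 as the local server):

```
iptables -t nat -A PREROUTING -d 192.0.2.10 -p udp --dport 53 \
        -j DNAT --to-destination 192.0.2.11
iptables -t nat -A PREROUTING -d 192.0.2.10 -p tcp --dport 53 \
        -j DNAT --to-destination 192.0.2.11
```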
On Mar 23, 2012, at 9:35 PM, Gregg Stock wrote:
> I'm having some "interesting"
On 3/22/12 5:09 AM, Ante Karamatic wrote:
Hi
I've come across an odd behavior, which might be considered
inconsistent. As we know, pacemaker doesn't allow deleting a resource
that's running, but this doesn't produce the same behavior every time.
Let's take a VM with a default stop timeout (90
Are you running 'real' RHEL6?
If so, cman + clvmd + gfs2 is the way to go. You can run pacemaker on
top of all of that (without openais) to manage your resources if you
don't want to use rgmanager.
I've never tried to run clvmd out of pacemaker, but there is an init.d
script for it in RHEL6,
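On RHEL6 the boot ordering for that stack would be roughly (assuming the stock init scripts are installed):

```
chkconfig cman on
chkconfig clvmd on
chkconfig gfs2 on
chkconfig pacemaker on
```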
On 2/27/12 5:44 PM, Andrew Beekhof wrote:
On Tue, Feb 28, 2012 at 7:45 AM, David Coulson wrote:
Yep.
Is that not from the EPEL repos though?
I didn't think we shipped it (since I've had people complain to me about that)
Oops. You are right. It was added to our RHN Satellite s
Yep.
# rpm -qi ldirectord
Name        : ldirectord          Relocations: (not relocatable)
Version     : 3.0.7               Vendor: Red Hat, Inc.
Release     : 5.el6               Build Date: Thu 25 Feb 2010 04:19:10 AM EST
Install Date: Sat 22 Oct 2011 10:17:39
On 2/18/12 4:33 PM, Florian Haas wrote:
Is setting "meta collocated=false" not working for your group?
Along similar lines, if I have default-resource-stickiness="200" set,
what is the best way to 'rebalance' resources following a node failure?
In general, if I lose a node I don't want resource
On 2/18/12 4:33 PM, Florian Haas wrote:
Is setting "meta collocated=false" not working for your group?
Yep, I found that option shortly after posting my email question. Need
to try it in production tomorrow morning, but it worked in my dev
environment with dummy resources.
Thanks-
David
I have an active/active LVS cluster, which uses pacemaker for managing
IP resources. Currently I have one environment running on it which
utilizes ~30 IP addresses, so a group was created so all resources could
be stopped/started together. Downside of that is that all resources have
to run on t
On 8/10/11 11:43 AM, Marco van Putten wrote:
Thanks Andreas. But our managers insist on using Red Hat.
I think the idea would be to take the HA packages distributed with
Scientific Linux 6.x and run them on RHEL.
Note that even when you do subscribe to the HA add-on in RHEL6,
pacemaker is
On 5/11/11 8:07 AM, Karl Rößmann wrote:
we have a three node cluster with a Cluster Volume Group vgsmet.
After powering off one Node, the Volume Group is stuck.
One of the ERROR messages is:
May 11 10:50:32 multix244 crmd: [8086]: ERROR: process_lrm_event: LRM
operation vgsmet:0_monitor_600
Pretty simple configuration - two nodes running cman-backed pacemaker. I
have three resources which are grouped together to support an application:
Filesystem, IP, and the app itself. My app is currently
misconfigured, so I expect it to blow up when it tries to start.
In crm_mon, I have a con