On Wed, Jan 7, 2015 at 10:21 AM, Gianluca Cecchi
wrote:
[snip]
>
> Is there any parameter inside configuration of cluster and/or resources
> that could control if "last-failure" information will be shown or not?
>
>
>
Ok, solved.
On the cluster where timestamp is sho
Hello,
I have two old SLES 11 SP2 clusters, each composed of two nodes, that
I'm comparing.
They should have the same package versions (I'm going to verify this
point, and the configuration files, more carefully once access is granted to me), but on the one
where there is a group configured with mysqld, when the
Hello,
using pacemaker on CentOS 6.5 I would like to test the agents in the subject,
but I don't find them in /usr/lib/ocf/resource.d/heartbeat/ as expected.
I have resource-agents-3.9.2-40.el6_5.7.x86_64 and I have enabled the standard
CentOS repos and the EPEL ones.
# yum whatprovides /usr/lib/ocf/resource.d/
On Sun, Jun 22, 2014 at 1:51 AM, Digimer wrote:
> Excellent.
>
> Please note: with IPMI-only fencing, you may find that killing all power
> to the node will cause fencing to fail, as the IPMI BMC will lose power
> as well (unless it has its own battery, but most don't).
>
> If you find thi
On Wed, Jun 25, 2014 at 10:57 AM, Gianluca Cecchi wrote:
>
> I tried to select the "feedback" button at the bottom but it doesn't work (at
> least on my Chrome browser on Fedora 20) for either the Italian one or the
> English one...
>
>
Actually the Italian feedback lin
On Wed, Jun 25, 2014 at 1:28 AM, Andrew Beekhof wrote:
>
> > So it seems that at midnight the resource already had a failcount of 2
> (perhaps caused by problems that happened weeks ago...?) and then at 03:38 it got
> a timeout on monitoring its state and was relocated...
> >
> > pacemaker is at 1.1.6-1.2
Hi Gianluca,
>
> I'm not sure of the CIB XML syntax, but here is how it's done using pcs:
>
OK, thanks Digimer.
It seems it worked using your suggestions:
[root@srvmgmt01 ~]# pcs stonith show
Fencing (stonith:fence_intelmodular): Started
# pcs cluster cib stonith_separate_cfg
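(For context, this is the usual pcs workflow for building the fencing config in a scratch CIB file and pushing it atomically; the device parameters below are hypothetical, check "pcs stonith describe fence_intelmodular" for the real ones, and srvmgmt02 is an assumed second node name:

# pcs cluster cib stonith_cfg
# pcs -f stonith_cfg stonith create Fencing fence_intelmodular \
    ipaddr=10.0.0.10 community=private \
    pcmk_host_list="srvmgmt01 srvmgmt02"
# pcs -f stonith_cfg property set stonith-enabled=true
# pcs cluster cib-push stonith_cfg
)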
Hello,
I have a CentOS 6.5 based cluster with
pacemaker-1.1.10-14.el6_5.3.x86_64
cman-3.0.12.1-59.el6_5.2.x86_64
and configured pacemaker with cman integration.
The nodes are two blades inside an Intel enclosure.
At the moment my configuration has this in cluster.conf
and this if I r
Hello,
when the monitor action for a resource times out, I think its failcount is
incremented by 1, correct?
If so, and the next monitor action succeeds, does the failcount value
automatically reset to zero or does it stay at 1?
In the latter case, is there any way to configure the cluster to
aut
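(The behaviour asked about here is governed by Pacemaker's failure-timeout meta attribute: a successful monitor does not reset the failcount, but failures can be made to expire after a given interval, and expired failures are only cleared on the next cluster recheck. A minimal sketch, assuming pcs and a hypothetical resource named myresource:

# pcs resource update myresource meta failure-timeout=600s
)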
On Mon, Mar 17, 2014 at 12:19 AM, Alex Samad - Yieldbroker wrote:
> Hi
>
>
>
> I am in the process of migrating away from the pcmk plugin for corosync and
> converting to cman.
>
> So from what I gather it's
>
> Pacemaker -> cman -> corosync
>
> Can I still configure corosync with /etc/corosync/cor
Hello,
I have some init-based scripts that I configure as lsb resources.
They are Java-based (in this case ovirt-engine and
ovirt-websocket-proxy from the oVirt project) and they are started through
the RHEL "daemon" function.
Basically they need a few seconds before the scripts exit and the
status opti
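(When an lsb script is slow to settle, the usual knob is the per-operation timeout on the resource. A minimal crm sketch using the init script name from the post; the timeout values are illustrative only:

primitive ovirt-engine lsb:ovirt-engine \
    op start timeout=120s interval=0 \
    op stop timeout=120s interval=0 \
    op monitor interval=30s timeout=60s
)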
On Tue, Mar 11, 2014 at 11:52 PM, Andrew Beekhof wrote:
>
> On 8 Mar 2014, at 11:31 am, Gianluca Cecchi wrote:
>
>> I provoke a power off of ovirteng01. The fencing agent works OK on
>> ovirteng02 and reboots it.
>> I stop the boot of ovirteng01 at the grub prompt to simulate prob
On Wed, Mar 12, 2014 at 12:37 AM, Andrew Beekhof wrote:
> It was put in when drbd called:
>
> fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>
> When and why it called that is not my area of expertise though.
>
The constraint put by crm-fence-peer.sh was
and I think it was g
So I fixed the problem regarding the hostname in drbd.conf and the name
from the cluster's point of view.
Also configured and verified the fence_vmware agent and enabled stonith.
Changed this in the drbd resource configuration:
resource ovirt {
    disk {
        disk-flushes no;
        md-flushes no;
        fencing resource-and-stonith;
    }
    device
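(The resource-and-stonith fencing policy above is normally paired with the crm-fence-peer handlers mentioned earlier in the thread; a sketch of the matching handlers section in drbd 8.4 style:

handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}
)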
On Wed, Mar 5, 2014 at 9:28 AM, Anne Nicolas wrote:
> Hi
>
> I'm having trouble setting up a very simple cluster with 2 nodes. After every
> reboot I get a split brain that I then have to resolve by hand.
> Looking for a solution to that one...
>
> Both nodes have 4 network interfaces. We use 3 of t
On Mon, Mar 3, 2014 at 9:29 PM, Digimer wrote:
> Two possible problems:
>
> 1. cman's cluster.conf needs the ''.
>
> 2. You don't have fencing setup. The 'fence_pcmk' script only works if
> pacemaker's stonith is enabled and configured properly. Likewise, you will
> need to configure DRBD to use t
Hello,
I'm testing pacemaker with cman on CentOS 6.5, where I have a drbd
resource in a classic primary/secondary setup with a master/slave config.
Relevant packages:
cman-3.0.12.1-59.el6_5.1.x86_64
pacemaker-1.1.10-14.el6_5.2.x86_64
kmod-drbd84-8.4.4-1.el6.elrepo.x86_64
drbd84-utils-8.4.4-2.el6.elrepo.x8
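(For reference, the classic primary/secondary wiring in crm syntax, assuming a drbd resource named ovirt as above; the resource names are hypothetical and the monitor/meta values are the usual ones from the drbd user's guide:

primitive p_drbd_ovirt ocf:linbit:drbd \
    params drbd_resource=ovirt \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave
ms ms_drbd_ovirt p_drbd_ovirt \
    meta master-max=1 master-node-max=1 \
    clone-max=2 clone-node-max=1 notify=true
)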
On Sun, Nov 24, 2013 at 4:47 PM, Steven Dake wrote:
> Using a real-world example
> token: 10000
> retrans_before_loss_const: 10
>
> token will be retransmitted roughly every 1000 msec and the token will be
> determined lost after 10000 msec.
>
OK, thank you very much for clarifying this.
I also to
On Thu, Nov 21, 2013 at 9:09 AM, Lars Marowsky-Bree wrote:
> On 2013-11-20T16:58:01, Gianluca Cecchi wrote:
>
>> Based on docs I thought that the timeout should be
>>
>> token x token_retransmits_before_loss_const
>
> No, the comments in the corosync.conf.example
On Thu, Nov 21, 2013 at 9:09 AM, Lars Marowsky-Bree wrote:
> On 2013-11-20T16:58:01, Gianluca Cecchi wrote:
>
>> Based on docs I thought that the timeout should be
>>
>> token x token_retransmits_before_loss_const
So in particular I have to ask for a correction to this:
h
Hello,
trying to relax the timeout because of backups that run using NetBackup
and the VMware storage API, causing cluster reconfigurations.
Based on the docs I thought that the timeout should be
token x token_retransmits_before_loss_const
but actually through my tests I see that setting or not setting th
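(Both knobs live in the totem section of /etc/corosync/corosync.conf; a fragment with illustrative values:

totem {
    version: 2
    token: 10000
    token_retransmits_before_loss_const: 10
}
)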
On Tue, Feb 28, 2012 at 11:23 PM, Andrew Beekhof wrote:
> On Wed, Feb 29, 2012 at 2:19 AM, Gianluca Cecchi
> wrote:
>> Hello,
>> for example at http://www.clusterlabs.org/wiki/Install
>> http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm
>&
Hello,
for example at http://www.clusterlabs.org/wiki/Install
http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm
now has to be changed to
http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
(note the 5.3 -> 5.4 too)
see also:
http://lists.fedoraproj
On Wed, Oct 6, 2010 at 4:25 PM, Shravan Mishra wrote:
> That is what I heard too, that's the reason for this question.
>
In June, inside a complex thread regarding "colocation -inf", Andrew
posted the link and also several clarifications after some questions
of mine...
See in particular:
http:
On Fri, Jun 18, 2010 at 5:25 PM, Eliot Gable wrote:
> I am trying to set up Corosync + Pacemaker on a new CentOS 5.5 x86_64
> install, but when I try to start corosync, it just says [FAILED] and does
> not provide any further information. I created the authkey using
> corosync-keygen and created
On Thu, Jun 17, 2010 at 12:28 PM, Erich Weiler wrote:
> this is a problem ... without libvirtd running all management attempts via
>> "virsh" will fail (including the VirtualDomain RA) ... why is libvirtd
>> stopping before corosync?
>>
>
> Ah, because I forgot that I have libvirtd starting and s
On Tue, Jun 15, 2010 at 4:43 PM, Andrew Beekhof wrote:
>
> >
> > But that is for the 1.1 branch, which is not considered "stable"...
>
> No, existing functionality is very stable.
> It's just the new features that might have some extra corner cases
> we've not seen exercised yet.
>
> Put it this way
On Tue, Jun 15, 2010 at 2:48 PM, Vadym Chepkov wrote:
>
> On Jun 15, 2010, at 8:11 AM, Gianluca Cecchi wrote:
>
> On Tue, Jun 15, 2010 at 1:50 PM, Andrew Beekhof wrote:
>
>> [snip]
>>
>> Score = -inf, plus the patch, plus sequential = true (or unset).
>>
On Tue, Jun 15, 2010 at 3:36 PM, Lars Marowsky-Bree wrote:
> On 2010-06-15T16:32:12, Aleksey Zholdak wrote:
>
> [snip]
>
> > > Why is the MPIO scenario so slow?
> > These questions need to be asked of the mptsas developers (Novell + HP)
>
> You should really file a service request then with
On Tue, Jun 15, 2010 at 1:50 PM, Andrew Beekhof wrote:
> [snip]
>
> Score = -inf, plus the patch, plus sequential = true (or unset).
> Not sure how that looks in shell syntax though.
>
>
Which patch?
2010/6/12 Julio Gómez
>
> There is the error. Thanks.
>
>
Marco meant uncommenting these lines in your
/etc/apache2/apache2.conf:
<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
and having this uncommented (default for RH EL 5-based ht
Hello,
I tried this approach and it seems to work out of the box.
But I would like to know if there could be silent drawbacks or potential
problems from a technical point of view.
Under /usr/lib/ocf/resource.d/ I create a directory named myocfdir and
inside it I put a copy of an existing RA (e.g. apache)
/usr
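(A sketch of the mechanics with the names from the post; the OCF provider is simply the directory name under resource.d, so the copy becomes addressable as ocf:myocfdir:apache:

# mkdir /usr/lib/ocf/resource.d/myocfdir
# cp /usr/lib/ocf/resource.d/heartbeat/apache /usr/lib/ocf/resource.d/myocfdir/
# crm ra list ocf myocfdir
)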
On Fri, May 28, 2010 at 8:00 AM, Andrew Beekhof wrote:
>
> > I don't know if it is the same bug...
> > In the sense that in your bug case the problem is related to:
> > - the stop operation not enforcing the order constraint
> > - the constraint is between a clone of a group and a clone of a reso
On Thu, May 27, 2010 at 5:50 PM, Steven Dake wrote:
> On 05/27/2010 08:40 AM, Diego Remolina wrote:
>
>> Is there any workaround for this? Perhaps a slightly older version of
>> the rpms? If so where do I find those?
>>
>>
> Corosync 1.2.1 doesn't have this issue apparently. With corosync 1.2.1,
On Tue, May 25, 2010 at 3:39 PM, Dejan Muhamedagic wrote:
> Hi,
> [snip]
> > So I presume the problem could be caused by the fact that the second part
> is
> > a clone and not a resource? or a bug?
> > I can eventually send the whole config.
>
> Looks like a bug to me. Clone or resource, constrain
Hello,
manual for 1.0 (and 1.1) reports this for Advisory Ordering:
On the other hand, when score="0" is specified for a constraint, the
constraint is considered optional and only has an effect when both resources
are stopping and/or starting. Any change in state by the first resource will
have no
On Thu, May 20, 2010 at 2:40 PM, Koch, Sebastian wrote:
> Thanks for your hints. I had the same issue and your tips nearly resolved
> it for me. But I've got a question. I set the default timeout and afterwards
> the pingd resource started to work as expected. I had an IPTABLES rule
> dropping ic
Hello,
based on pacemaker 1.0.8 + corosync 1.2.2, having two network interfaces to
dedicate to cluster communication, what is better/safer at this moment:
a) only one corosync ring on top of a bond interface
b) two different rings, each one associated with one interface
?
Question based also on c
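(Option b) is corosync's redundant ring protocol (RRP); a sketch of the relevant corosync.conf fragment for 1.2.x, with illustrative networks and multicast addresses:

totem {
    rrp_mode: passive
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 192.168.2.0
        mcastaddr: 226.94.1.2
        mcastport: 5405
    }
}
)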
On Fri, May 14, 2010 at 11:40 PM, Ruiyuan Jiang wrote:
> Hi,
>
> I just created a testing cluster (Corosync 1.2.1, OpenAIS 1.1.2 and
> Pacemaker 1.0.8) on RHEL 5.5, x86_64. When I shutdown the primary node, the
> secondary node does not bind the cluster IP address which it supposed to.
> Why? Than
On Wed, May 12, 2010 at 10:22 PM, Gianluca Cecchi wrote:
> On Wed, May 12, 2010 at 2:27 PM, Andrew Beekhof
> wrote:
>
>> Attach the output from cibadmin -Ql when the cluster is in this state
>> and I'll take a look.
>>
>>
>>
> Ok. here we are with
On Thu, May 13, 2010 at 8:27 AM, Tim Serong wrote:
> Hi,
>
> On 5/13/2010 at 03:56 PM, Aleksey Zholdak wrote:
> > > The firewall should let through the UDP multicast traffic on
> > > ports mcastport and mcastport+1.
> >
> > As I wrote above: all interfaces in SuSEfirewall2 are set to "Internal
>
On Tue, May 11, 2010 at 5:47 PM, Vadym Chepkov wrote:
> pingd is a daemon which is running all the time and does its job.
> You still need to define a monitor operation though; what if the daemon dies?
> op monitor just has a different meaning for ping and pingd:
> with pingd - monitor the daemon
> with
Hello,
I'm using pacemaker 1.0.8 on RHEL 5.5 x86 with the clusterlabs repo.
Based on other posts on linux-ha I'm trying to configure a 2-node cluster
where one of the nodes is the nfs-server and the other one is the nfs-client of the
resource exported by the first one.
The main parts were borrowed from the relat
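(A sketch of the server side in crm syntax, with hypothetical resource names, paths and addresses; the group name matches the one used later in this thread:

primitive p_nfs lsb:nfs
primitive p_exportfs ocf:heartbeat:exportfs \
    params directory=/srv/share clientspec=10.0.0.0/24 \
    options=rw fsid=1
primitive p_ip ocf:heartbeat:IPaddr2 params ip=10.0.0.100
group nfs-group p_nfs p_exportfs p_ip
)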
On Tue, May 11, 2010 at 1:13 PM, Vadym Chepkov wrote:
> First of all, none of the monitor operations are on by default in pacemaker;
> this is something that you have to turn on.
> For the ping RA the start and stop op parameters don't do much, so you can
> safely drop them.
>
>
>
Yes, but for the pace
On Tue, May 11, 2010 at 12:50 PM, Vadym Chepkov wrote:
> You forgot to turn on the monitor operation for ping (the actual job)
>
>
>
I saw this from the
[r...@ha1 ~]# crm ra meta ping
command:
Operations' defaults (advisory minimum):
start timeout=60
stop timeout=20
reload
On Tue, May 11, 2010 at 11:58 AM, Dejan Muhamedagic
wrote:
> Do you see the attribute set in the status section (cibadmin -Ql
> | grep -w pingd)? If not, then the problem is with the resource.
[r...@ha1 ~]# cibadmin -Ql | grep -w pingd
Tried to ch
nst pinggw
>
> I suggest you do not change the name though, but adjust your location
> constraint to use pingd instead.
> crm_mon only notices "pingd" at the moment when you pass the -f argument: it's
> hardcoded
>
>
> On Mon, May 10, 2010 at 9:34 AM, Gianluca Cecch
Hello,
using pacemaker 1.0.8 on RHEL 5 I have some problems understanding the way
the ping clone works to set up monitoring of the gw... even after reading the docs...
As soon as I run:
crm configure location nfs-group-with-pinggw nfs-group rule -inf:
not_defined pinggw or pinggw lte 0
the resources go stopp
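(For the rule above to ever match, a ping clone must be publishing a node attribute with the same name; a sketch with a hypothetical gateway address. The name parameter of ocf:pacemaker:ping defaults to "pingd", so it has to be set explicitly here to match the constraint:

primitive pinggw ocf:pacemaker:ping \
    params host_list=192.168.1.254 multiplier=100 name=pinggw \
    op monitor interval=15s timeout=20s
clone cl-pinggw pinggw
)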