Re: [Pacemaker] [corosync] active/active with Radius

2015-02-16 Thread Jan Friesse
This is really a question for the Pacemaker list, so CCing. Regards, Honza > Hi, > > I would like Corosync to manage Radius in an active/active > configuration but I don't know how I should add this, so was wondering > if somebody could point me in the right direction. > > Thanks and kind re

Re: [Pacemaker] [Openais] Issues with a squid cluster.

2015-02-10 Thread Jan Friesse
This is really a question for the Pacemaker list, so CCing. Regards, Honza Redeye wrote: > I am not certain where I should post this, hopefully someone will point me in > the right direction. > > I have a two node cluster on Ubuntu 12.04, corosync, pacemaker, and squid. > Squid is not startin

Re: [Pacemaker] [Openais] problem to delete resource

2015-02-04 Thread Jan Friesse
This is really a question for the Pacemaker list, so CCing. Regards, Honza Vladimir Berezovski (vberezov) wrote: Hi, I added a new resource like crm(live)configure# primitive p_drbd_ora ocf:linbit:drbd params drbd_resource="clusterdb_res_ora" op monitor interval="60s" but its status is F

Re: [Pacemaker] Corosync fails to start when NIC is absent

2015-01-20 Thread Jan Friesse
cally :) Regards, Honza > > Thank you, > Kostya > > On Wed, Jan 14, 2015 at 1:31 PM, Kostiantyn Ponomarenko < > konstantin.ponomare...@gmail.com> wrote: > >> Thank you. Now I am aware of it. >> >> Thank you, >> Kostya >> >>

Re: [Pacemaker] [corosync] CoroSync's UDPu transport for public IP addresses?

2015-01-19 Thread Jan Friesse
, that Pacemaker must then be configured in a way that quorum is not required. Regards, Honza It would help to install and launch corosync instantly by novices. On Fri, Jan 16, 2015 at 7:31 PM, Jan Friesse wrote: Dmitry Koterov wrote: such messages (for now). But, anyway, DNS

Re: [Pacemaker] [corosync] CoroSync's UDPu transport for public IP addresses?

2015-01-16 Thread Jan Friesse
o.p:80) was formed. Members joined: 1760315215 Jan 14 10:48:28 node1 corosync[15156]: [QUORUM] Members[1]: 1760315215 Jan 14 10:48:28 node1 corosync[15156]: [MAIN ] Completed service synchronization, ready to provide service. On Mon, Jan 5, 2015 at 6:45 PM, Jan Friesse wrote: Dmitry, Sure, in

Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?

2015-01-14 Thread Jan Friesse
5156]: [QB] server name: quorum > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member > {a.b.c.d} > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member > {e.f.g.h} > Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] adding new UDPU member >

Re: [Pacemaker] Corosync fails to start when NIC is absent

2015-01-14 Thread Jan Friesse
Kostiantyn, > Honza, > > Thank you for helping me. > So, there is no defined behavior in case one of the interfaces is not in > the system? You are right. There is no defined behavior. Regards, Honza > > > Thank you, > Kostya > > On Tue, Jan 13, 201

Re: [Pacemaker] Corosync fails to start when NIC is absent

2015-01-13 Thread Jan Friesse
Kostiantyn, > According to the https://access.redhat.com/solutions/638843 , the > interface, that is defined in the corosync.conf, must be present in the > system (see at the bottom of the article, section "ROOT CAUSE"). > To confirm that I made a couple of tests. > > Here is a part of the coros

Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?

2015-01-05 Thread Jan Friesse
. Because as long as DNS is resolved, corosync works only with IP. This means the code path is exactly the same with an IP or with DNS. Do you have logs from corosync? Honza > > On Fri, Jan 2, 2015 at 2:49 PM, Jan Friesse wrote: > >> Dmitry, >> >> >> No, I mean

Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?

2015-01-02 Thread Jan Friesse
Dmitry, No, I meant that if you pass a domain name in ring0_addr, there are no errors in logs, corosync even seems to find nodes (based on its logs), And crm_node -l shows them, but in practice nothing really works. A verbose error message would be very helpful in such case. This sounds weird

Re: [Pacemaker] CMAN and Pacemaker with IPv6

2014-07-16 Thread Jan Friesse
>> Great, Thank you very much. >> >> But the terrible thing for me is I'm using the package from OpenSUSE repo. >> When i turn back to CentOS repo, which store lower version, the >> Dependency problem has occurred. >> >> Anyway, thank you for your hel

Re: [Pacemaker] CMAN and Pacemaker with IPv6

2014-07-14 Thread Jan Friesse
Honza, How do I include the patch with my CentOS package? Do I need to compile them manually? Yes. Also official CentOS version was never 1.4.5. If you are using CentOS, just use stock 1.4.1-17.1. Patch is included there. Honza TeEniGMa On Mon, Jul 14, 2014 at 3:21 PM, Jan Friesse

Re: [Pacemaker] CMAN and Pacemaker with IPv6

2014-07-14 Thread Jan Friesse
gure the Multicast address as manual. Could you advise me the solution? Many thanks in advance. Te On Thu, Jul 10, 2014 at 6:14 PM, Jan Friesse wrote: Teerapatr, Hi Honza, As you said I use the nodename identify by hostname (which be accessed via IPv6) and the node also has the altname (whi

Re: [Pacemaker] CMAN and Pacemaker with IPv6

2014-07-10 Thread Jan Friesse
here. > > Regards, > Te > > On Thu, Jul 10, 2014 at 2:50 PM, Jan Friesse wrote: >> Teerapatr, >> >>> OK, some problems are solved. >>> I use the incorrect hostname. >>> >>> For now, the new problem has occured. >>> >

Re: [Pacemaker] CMAN and Pacemaker with IPv6

2014-07-10 Thread Jan Friesse
Teerapatr, > OK, some problems are solved. > I use the incorrect hostname. > > For now, the new problem has occurred. > > Starting cman... Node address family does not match multicast address family > Unable to get the configuration > Node address family does not match multicast address family

Re: [Pacemaker] [Openais] unmanaged resource failed - how to get back?

2014-06-30 Thread Jan Friesse
Stefan, sending to Pacemaker list because your question does not seem to be Corosync related. Regards, Honza Senftleben, Stefan (itsc) wrote: Hello, I set the cluster in maintenance mode with: crm configure property maintenance-mode=true . Afterwards I did stop one resource manually, b

Re: [Pacemaker] [Openais] Filesystem vs. Master-Slave MySQL resource

2014-06-03 Thread Jan Friesse
Matej, this is really a question for the Pacemaker mailing list. > Hello, > > I have the following setup: > > 2 nodes: db-01, db-02 > Groups of resources: > fs-01: iscsi+lvm+fs at db-01 > fs-02: iscsi+lvm+fs at db-02 > > fs-01 is for mounting data files for MySQL at db-01, fs-02 for db-02 > > MySQL

Re: [Pacemaker] auto_tie_breaker in two node cluster

2014-05-21 Thread Jan Friesse
> I do not quite understand how auto_tie_breaker works. > Say we have a cluster with 2 nodes and enabled auto_tie_breaker feature. > Each node has 2 NICs. One NIC is used for cluster communication and another > one is used for providing some services from the cluster. > So the question is how the n

Re: [Pacemaker] pacemaker not started by corosync on ubuntu 14.04

2014-05-12 Thread Jan Friesse
Vladimir, Vladimir wrote: Hello everyone, I'm trying to get corosync/pacemaker running on Ubuntu 14.04. In my Ubuntu 12.04 setups pacemaker was started by corosync. Actually I thought the Yes. 12.04 used corosync 1.x with pacemaker plugin. "service {...}" section in the corosync.conf is spe

Re: [Pacemaker] corosync [TOTEM ] Process pause detected for 577 ms

2014-05-05 Thread Jan Friesse
, Honza > > Thanks > Emmanuel > > > 2014-04-30 17:07 GMT+02:00 Jan Friesse : > >> Emmanuel, >> >> emmanuel segura wrote: >

Re: [Pacemaker] corosync [TOTEM ] Process pause detected for 577 ms

2014-04-30 Thread Jan Friesse
west 1.4.6 (if you are using cman) or 2.3.3 (if you are not using cman). Also please change your pacemaker to not use plugin (upgrade to 2.3.3 will solve it automatically, because plugins in corosync 2.x are no longer supported). Regards, Honza > Thanks > > > 2014-04-30 9:

Re: [Pacemaker] corosync [TOTEM ] Process pause detected for 577 ms

2014-04-30 Thread Jan Friesse
hp blade system and the strange thing is the > fencing was not triggered :(, but it's enabled > > > 2014-04-25 18:36 GMT+02:00 emmanuel segura : > >> Hello Jan, >> >> I found this problem in two hp blade system and the strange thing is the >>

Re: [Pacemaker] corosync [TOTEM ] Process pause detected for 577 ms

2014-04-25 Thread Jan Friesse
Emmanuel, emmanuel segura wrote: Hello List, I have these two lines in my cluster logs; can somebody help me understand what they mean? :: corosync [TOTEM ] Process

Re: [Pacemaker] corosync does not reflect the node status correctly

2014-03-31 Thread Jan Friesse
Michael, Michael Schwartzkopff wrote: Hi, we just upgraded to corosync-1.4.5-2.5 from the suse build server. On one cluster we have the problem, that corosync-objctl does not reflect the status So if I understand it correctly, you have multiple clusters and all of them were upgraded and o

Re: [Pacemaker] Trouble getting two node cluster to failover when network lost

2014-03-20 Thread Jan Friesse
Aaron Wilson wrote: > Stefan, thanks for the reply. > > Having two nics is not for redundancy in my case. Resources on the primary > server are being accessed from both subnets at the same time. The secondary > server is to be a failover if the server goes down or if any of the > Ethernet por

Re: [Pacemaker] Errors while compiling

2014-03-19 Thread Jan Friesse
Stephan Buchner wrote: > Hm, i tried recompiling all three packages (libqb, corosync and > pacemaker), using versions which have been marked stable by the gentoo > project. > > I used the following versions: libqb = 0.14.4 > corosyn

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-14 Thread Jan Friesse
> Sent: Thursday, March 13, 2014 2:27 PM >> To: The Pacemaker cluster resource manager >> Subject: Re: [Pacemaker] Pacemaker/corosync freeze >> >> >>> -Original Message- >>> From: Attila Megyeri [mailto:amegy...@minerva-soft.com] >>> Sent:

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-14 Thread Jan Friesse
essage- >>> From: Attila Megyeri [mailto:amegy...@minerva-soft.com] >>> Sent: Thursday, March 13, 2014 1:45 PM >>> To: The Pacemaker cluster resource manager; Andrew Beekhof >>> Subject: Re: [Pacemaker] Pacemaker/corosync freeze >>> >>> Hello,

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-13 Thread Jan Friesse
... Also can you please try to set debug: on in corosync.conf and paste full corosync.log then? >>> >>> I set debug to on, and did a few restarts but could not reproduce the issue >> yet - will post the logs as soon as I manage to reproduce. >>> >> >> Perfect. >> >> Another option y

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-12 Thread Jan Friesse
Attila Megyeri wrote: >> -Original Message- >> From: Jan Friesse [mailto:jfrie...@redhat.com] >> Sent: Wednesday, March 12, 2014 2:27 PM >> To: The Pacemaker cluster resource manager >> Subject: Re: [Pacemaker] Pacemaker/corosync freeze >> >>

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-12 Thread Jan Friesse
Attila Megyeri wrote: > Hello Jan, > > Thank you very much for your help so far. > >> -Original Message- >> From: Jan Friesse [mailto:jfrie...@redhat.com] >> Sent: Wednesday, March 12, 2014 9:51 AM >> To: The Pacemaker cluster resource manager

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-12 Thread Jan Friesse
Attila Megyeri wrote: > >> -Original Message- >> From: Andrew Beekhof [mailto:and...@beekhof.net] >> Sent: Tuesday, March 11, 2014 10:27 PM >> To: The Pacemaker cluster resource manager >> Subject: Re: [Pacemaker] Pacemaker/corosync freeze >> >> >> On 12 Mar 2014, at 1:54 am, Attila Me

Re: [Pacemaker] [corosync] corosync Segmentation fault.

2014-02-26 Thread Jan Friesse
Andrey Groshev wrote: > > > 26.02.2014, 16:11, "Jan Friesse" : >> Andrey, >> can you please give a try to patch "[PATCH] votequorum: Properly >> initialize atb and atb_string" which I've sent to ML (it should be there >> soon)? >

Re: [Pacemaker] [corosync] corosync Segmentation fault.

2014-02-26 Thread Jan Friesse
Andrey, can you please give a try to patch "[PATCH] votequorum: Properly initialize atb and atb_string" which I've sent to ML (it should be there soon)? Thanks, Honza Andrey Groshev wrote: > > > 26.02.2014, 12:11, "Jan Friesse" : >> Andrey, >

Re: [Pacemaker] [corosync] corosync Segmentation fault.

2014-02-26 Thread Jan Friesse
Andrey, what version of corosync and libqb are you using? Can you please attach output from valgrind (and gdb backtrace)? Thanks, Honza Andrey Groshev wrote: > Hi, ALL. > Something I already confused, or after updating any package or myself > something broke, > but call corosync killed b

Re: [Pacemaker] Multicast pitfalls? corosync [TOTEM ] Retransmit List:

2014-02-17 Thread Jan Friesse
ge/multicast_snooping 2014-02-14 9:28 GMT+01:00 Beo Banks : @jan and stefan must i set it for both bridges eth1 (br1) eth0 (br0) on the host or guest ? 2014-02-14 9:06 GMT+01:00 Jan Friesse : Beo, are you experiencing a cluster split? If the answer is no, then you don't need to do anything.

Re: [Pacemaker] Multicast pitfalls? corosync [TOTEM ] Retransmit List:

2014-02-14 Thread Jan Friesse
Beo, are you experiencing a cluster split? If the answer is no, then you don't need to do anything. Maybe the network buffer is just filled. But if the answer is yes, try reducing the MTU size (netmtu in configuration) to a value like 1000. Regards, Honza Beo Banks wrote: Hi, i have a fresh 2 node cluster

Re: [Pacemaker] error: send_cpg_message: Sending message via cpg FAILED: (rc=6) Try again

2013-12-09 Thread Jan Friesse
Brian J. Murrell (brian) wrote: > I seem to have another instance where pacemaker fails to exit at the end > of a shutdown. Here's the log from the start of the "service pacemaker > stop": > > Dec 3 13:00:39 wtm-60vm8 crmd[14076]: notice: do_state_transition: State > transition S_POLICY_E

Re: [Pacemaker] Network outage debugging

2013-11-13 Thread Jan Friesse
Sean Lutner wrote: > > On Nov 13, 2013, at 3:15 AM, Jan Friesse wrote: > >> Andrew Beekhof wrote: >>> >>> On 13 Nov 2013, at 11:49 am, Sean Lutner wrote: >>> >>>> >>>> >>>>> On Nov 12, 2013, at 7:33 PM,

Re: [Pacemaker] Network outage debugging

2013-11-13 Thread Jan Friesse
Andrew Beekhof wrote: > > On 13 Nov 2013, at 11:49 am, Sean Lutner wrote: > >> >> >>> On Nov 12, 2013, at 7:33 PM, Andrew Beekhof >>> wrote: >>> >>> On 13 Nov 2013, at 11:22 am, Sean Lutner wrote: > On Nov 12, 2013, at 6:01 PM, Andrew Beekhof > wrot

Re: [Pacemaker] Simple installation Pacemaker + CMAN + fence-agents

2013-11-10 Thread Jan Friesse
Andrew Beekhof wrote: > Something seems very wrong with this at the corosync level. > Even fenced and the dlm are having issues. > > Jan: Could this be firewall related? Yes. This can be either a firewall or an mcast issue. I would recommend turning the firewall off completely (for testing). If this do

Re: [Pacemaker] Could not initialize corosync configuration API error 2

2013-10-31 Thread Jan Friesse
Andrew, this problem was already discussed on corosync-ml. Andrew Beekhof wrote: > Jan: not sure if you're on the pacemaker list > > On 29 Oct 2013, at 6:43 pm, Bauer, Stefan (IZLBW Extern) > wrote: > >> Dear Developers/Users, >> >> we’re using Pacemaker 1.1.7 and Corosync Cluster Engine

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Jan Friesse
it's internally converted to recommended version with nodelist (so that's what you've sent). Regards, Honza Mike Edwards wrote: > Yep. The config I pasted has the bindnetaddr set to 10.10.23.50, which > also happens to be defined as node 1. > > > On Wed, Ma

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Jan Friesse
Mike Edwards wrote: > Which would be the recommended transport? I'm not tied to any > particular method. > As long as UDP (multicast) works for you, it's the better solution (better tested, faster, ...). UDPU is targeted at deployments where multicast is a problem. Regards, Honza > > On Wed

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Jan Friesse
Mike, did you enter the local node in the nodelist? Because this may explain the behavior you were describing. Honza Mike Edwards wrote: > On Tue, May 21, 2013 at 11:15:56AM +1000, Andrew Beekhof babbled thus: >> cpg_join() is returning CS_ERR_TRY_AGAIN here. >> >> Jan: Any idea why this might happen?

Re: [Pacemaker] [Openais] Hawk 0.5.2 Debian packages

2013-02-26 Thread Jan Friesse
Great news! Regards, Honza Charles Williams wrote: > Hey all, > > I recently got a chance to finally build Debian packages for the > 0.5.2 version of ClusterLabs Hawk GUI. These are Squeeze packages > ATM (Wheezy to come next week dependent upon testing of the > current packages) and I am

Re: [Pacemaker] [corosync] Corosync memory usage rising

2013-02-04 Thread Jan Friesse
Andrew Beekhof wrote: > On Thu, Jan 31, 2013 at 8:10 AM, Yves Trudeau wrote: >> Hi, >> Is there any known memory leak issue in corosync 1.4.1? I have a setup here >> where corosync eats memory at a few kB a minute: 1.4.1 for sure. But it looks like you are using 1.4.1-7 (EL 6.3.z), and I must say

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-08 Thread Jan Friesse
m /dev/shm and cluster should work. There are basically two problems: - ipc_shm is leaking memory - if there is no memory, libqb mmaps non-allocated memory and receives SIGBUS. Angus is working on both issues. Regards, Honza Jan Friesse wrote: > Andrew, > thanks for valgrind report (even

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-08 Thread Jan Friesse
covered tonight was that the 127.0.1.1 entry in /etc/hosts > (on both storage0 and storage1) was the source of the extra "localhost" entry > in the cluster. I have removed this extraneous node so now only the 3 real > nodes remain and commented out this line in /etc/hosts on all n

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-07 Thread Jan Friesse
b this time. I >>>> compiled libqb 0.14.2 for use with the cluster. This time when corosync >>>> died I noticed the following in dmesg: >>>> Nov 1 13:21:01 storage1 kernel: [31036.617236] corosync[13305] trap divide >>>> error ip:7f657a52e517 sp

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-07 Thread Jan Friesse
>>>> >>>> I did find something else interesting with libqb this time. I >>>> compiled libqb 0.14.2 for use with the cluster. This time when corosync >>>> died I noticed the following in dmesg: >>>> Nov 1 13:21:01 storage1 kernel: [310

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-05 Thread Jan Friesse
>>> corosync died I noticed the following in dmesg: >>>> Nov 1 13:21:01 storage1 kernel: [31036.617236] corosync[13305] trap >>>> divide error ip:7f657a52e517 sp:7fffd5068858 error:0 in >>>> libqb.so.0.14.2[7f657a525000+1f000] >>>> This error w

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-01 Thread Jan Friesse
Andrew, I was not able to find anything interesting (from corosync point of view) in configuration/logs (corosync related). What would be helpful: - if corosync died, there should be /var/lib/corosync/fdata-DATETIME-PID of dead corosync. Can you please xz them and store somewhere (they are quite