Re: [Pacemaker] very slow pacemaker/corosync shutdown

2013-09-19 Thread Vladislav Bogdanov
20.09.2013 02:52, Andrew Beekhof wrote: > > On 19/09/2013, at 7:45 PM, David Lang wrote: > >> On Thu, 19 Sep 2013, Florian Crouzat wrote: >> >>> On 19/09/2013 00:25, David Lang wrote: I'm frequently running into a problem that shutting down pacemaker/corosync takes a very long time

Re: [Pacemaker] corosync service start giving segmentation fault on centos

2013-09-19 Thread Aarti Sawant
when i try to install corosync i get seg fault error when i am trying to start corosync i get error of segfault -bash-4.1# service corosync start Starting Corosync Cluster Engine (corosync): /etc/init.d/corosync: line 85: 610 Segmentation fault (core dumped) $prog > /dev/null 2>&1

Re: [Pacemaker] [Openais] very slow pacemaker/corosync shutdown

2013-09-19 Thread Andrew Beekhof
On 20/09/2013, at 10:46 AM, Lists wrote: > On 09/19/2013 04:50 PM, Andrew Beekhof wrote: >> From this we can infer that corosync has gotten horribly confused and, as a >> consequence, pacemaker can't talk to its peers anymore. >> >>> >this is a test cluster and not being monitored by a netmon.

Re: [Pacemaker] [Openais] very slow pacemaker/corosync shutdown

2013-09-19 Thread Lists
On 09/19/2013 04:50 PM, Andrew Beekhof wrote: From this we can infer that corosync has gotten horribly confused and, as a consequence, pacemaker can't talk to its peers anymore. >this is a test cluster and not being monitored by a netmon. Any other details I could provide that would be usefu

Re: [Pacemaker] Monitoring on master node not running after standby is connected

2013-09-19 Thread Juraj Fabo
Juraj Fabo writes: > > Dear all > > Attached is my 2-nodes, master slave cluster configuration with master-slave > postgresql resource and some IP resources. > I've modified pgsql resource agent to log its "main" entry with the > parameter to see what operation is called. > My problem is that w

[Pacemaker] Monitoring - pacemaker

2013-09-19 Thread Denise Cosso
Hi, I have a cluster (2 machines) email using the pacemaker / corosync as Active / Passive. Already configured filesystem (ocf:heartbeat:Filesystem), SFEX, ping (ocf:pacemaker:ping) and start / stop imap and saslauthd. I put everything in one group. What is missing is monitor

[Pacemaker] Monitoring on master node not running after standby is connected

2013-09-19 Thread Juraj Fabo
Dear all Attached is my 2-nodes, master slave cluster configuration with master-slave postgresql resource and some IP resources. I've modified pgsql resource agent to log its "main" entry with the parameter to see what operation is called. My problem is that while the single node is running, the m

Re: [Pacemaker] [Openais] very slow pacemaker/corosync shutdown

2013-09-19 Thread Lists
On 09/18/2013 06:49 PM, Andrew Beekhof wrote: On 19/09/2013, at 8:25 AM, David Lang wrote: What's the best way to see what it's getting stuck doing? Log files. Is there a good way to tell if this is a pacemaker or corosync problem (so I can drop one of the lists from the thread)? Not with

Re: [Pacemaker] corosync service start giving segmentation fault on centos

2013-09-19 Thread Andrew Beekhof
On 19/09/2013, at 11:07 PM, Aarti Sawant wrote: > hello, > > i am running corosync plugin and not using cman. then contrary to what digimer said, you do need to start corosync yourself. > > > On Thu, Sep 19, 2013 at 4:22 PM, Andrew Beekhof wrote: > > On 19/09/2013, at 8:17 PM, Aarti Sawan

Re: [Pacemaker] very slow pacemaker/corosync shutdown

2013-09-19 Thread Andrew Beekhof
On 19/09/2013, at 7:45 PM, David Lang wrote: > On Thu, 19 Sep 2013, Florian Crouzat wrote: > >> On 19/09/2013 00:25, David Lang wrote: >>> I'm frequently running into a problem that shutting down >>> pacemaker/corosync takes a very long time (several minutes) >> >> Just to be 100% sure, you

Re: [Pacemaker] [Openais] very slow pacemaker/corosync shutdown

2013-09-19 Thread Andrew Beekhof
On 20/09/2013, at 8:19 AM, Lists wrote: > On 09/18/2013 06:49 PM, Andrew Beekhof wrote: >> On 19/09/2013, at 8:25 AM, David Lang wrote: >> >>> What's the best way to see what it's getting stuck doing? >> Log files. >> >>> Is there a good way to tell if this is a pacemaker or corosync problem

Re: [Pacemaker] monitor on disabled nodes

2013-09-19 Thread Radoslaw Garbacz
On Wed, Sep 18, 2013 at 8:58 PM, Andrew Beekhof wrote: > > On 19/09/2013, at 2:13 AM, Radoslaw Garbacz > wrote: > >> Hi, >> >> I have a question regarding the "monitor" operation on disabled nodes. >> >> I noticed that this operation is called even, when an agent is disabled for >> a node. Is i

Re: [Pacemaker] create 2-node Active/Passive firewall cluster

2013-09-19 Thread David Lang
On Thu, 19 Sep 2013, Florian Crouzat wrote: On 19/09/2013 11:43, David Lang wrote: I've been running active/failover firewall clusters with heartbeat since about 2000, and one suggestion that I would make. If you can leave all the daemons running all the time, the failover process is far mo

Re: [Pacemaker] create 2-node Active/Passive firewall cluster

2013-09-19 Thread Florian Crouzat
On 19/09/2013 11:43, David Lang wrote: On Thu, 19 Sep 2013, Florian Crouzat wrote: On 18/09/2013 20:34, Jeff Weber wrote: I am looking to create a 2-node Active/Passive firewall cluster. I am an experienced Linux user, but new to HA clusters. I have scanned "Clusters From Scratch" and

Re: [Pacemaker] corosync service start giving segmentation fault on centos

2013-09-19 Thread Aarti Sawant
hello, i am running corosync plugin and not using cman. On Thu, Sep 19, 2013 at 4:22 PM, Andrew Beekhof wrote: > > On 19/09/2013, at 8:17 PM, Aarti Sawant wrote: > > > hello, > > > > pacemaker starts but it corosync service is not started > > Are you using cman (cluster.conf) or the corosync

Re: [Pacemaker] Mysql multiple slaves, slaves restarting occasionally without a reason

2013-09-19 Thread Andrew Beekhof
On 10/09/2013, at 4:07 PM, Attila Megyeri wrote: > Hi, > > We have a Mysql cluster which works fine when I have a single master (“A”) > and slave (“B”). Failover is almost immediate and I am happy with this > approach. > When we configured two additional slaves, strange things start to happe

Re: [Pacemaker] Howto test/simulate the reaction of the cluster to node up and down

2013-09-19 Thread Andrew Beekhof
On 19/09/2013, at 5:57 PM, Andreas Mock wrote: > Hi Lars, hi Andrew, > > thank you for your answers. > But I'm still stuck. > > When I do have both nodes online and the resources > are spread over these nodes and I do a > crm_simulate -Ls -R -d node1 > I do see nicely what would happen to the c

Re: [Pacemaker] Strange "bad permissions" issue when starting pacemaker

2013-09-19 Thread Andrew Beekhof
On 19/09/2013, at 5:10 PM, Саша Александров wrote: > Hi! > > I had to add additional logging to sources and build a custom version to > figure out what is the problem. There are also ways to get tons of trace logging without recompiling: http://blog.clusterlabs.org/blog/2013/pacemaker-logg

Re: [Pacemaker] corosync service start giving segmentation fault on centos

2013-09-19 Thread Andrew Beekhof
On 19/09/2013, at 8:17 PM, Aarti Sawant wrote: > hello, > > pacemaker starts but it corosync service is not started Are you using cman (cluster.conf) or the corosync plugin (corosync.conf) Pacemaker only starts corosync if cman (which is a particular way of running corosync) is being used.

Re: [Pacemaker] Solving a resource allocation problem

2013-09-19 Thread Andreas Mock
Hi Lars, that's why I wrote: The interested reader of that list does now know why I tried crm_simulate... :-) Thank you Andreas Mock -Original Message- From: Lars Marowsky-Bree [mailto:l...@suse.com] Sent: Thursday, 19 September 2013 12:18 To: The Pacemaker cluster resour

Re: [Pacemaker] Solving a resource allocation problem

2013-09-19 Thread Lars Marowsky-Bree
On 2013-09-19T12:12:31, Andreas Mock wrote: > For a solution where I like to push a certain resource > to the new node (this service interruption doesn't > hurt too much) while being sure that the other gets > started on the newly upcoming node I have to balance > the stickiness and negative cons

Re: [Pacemaker] corosync service start giving segmentation fault on centos

2013-09-19 Thread Aarti Sawant
hello, pacemaker starts but the corosync service is not started below is a output of service pacemaker start ps -ax command to check the processes on lxc container -bash-4.1# service pacemaker start Starting Pacemaker Cluster Manager:[ OK ] -bash-4.1# ps -ax Warning: bad

Re: [Pacemaker] Solving a resource allocation problem

2013-09-19 Thread Andreas Mock
Hi Lars, no you're not missing something. I just intermixed two acceptable solutions and the way I asked for it. So, for letting the resources stay where they are, you're absolutely right. For a solution where I like to push a certain resource to the new node (this service interruption doesn't h

Re: [Pacemaker] very slow pacemaker/corosync shutdown

2013-09-19 Thread David Lang
On Thu, 19 Sep 2013, Florian Crouzat wrote: On 19/09/2013 00:25, David Lang wrote: I'm frequently running into a problem that shutting down pacemaker/corosync takes a very long time (several minutes) Just to be 100% sure, you always respect the stop order ? Pacemaker *then* CMAN/corosync

Re: [Pacemaker] create 2-node Active/Passive firewall cluster

2013-09-19 Thread David Lang
On Thu, 19 Sep 2013, Florian Crouzat wrote: On 18/09/2013 20:34, Jeff Weber wrote: I am looking to create a 2-node Active/Passive firewall cluster. I am an experienced Linux user, but new to HA clusters. I have scanned "Clusters From Scratch" and "Pacemaker Explained". I found these docs

Re: [Pacemaker] Solving a resource allocation problem

2013-09-19 Thread Lars Marowsky-Bree
On 2013-09-19T10:20:07, Andreas Mock wrote: > Hi all, > > I need a hint how to solve a resource allocation problem > on a two node cluster (pmck 1.1.11). > > I have two resource blocks (some stacked resources colocation inf) > which shall run on separate nodes. I did this with a small negative >

Re: [Pacemaker] Error when managing network with ping/pingd.

2013-09-19 Thread Francis SOUYRI
Hi Andreas, You are right and I am blind, the fail-over works now when the network fails, thank you very much !! I just have to set now the "best" parameters for the ping monitor. With your help I just finished migrating with the same "services" our test cluster from heartbeat(crm 1)/drbd (v

[Pacemaker] Solving a resource allocation problem

2013-09-19 Thread Andreas Mock
Hi all, I need a hint how to solve a resource allocation problem on a two node cluster (pmck 1.1.11). I have two resource blocks (some stacked resources colocation inf) which shall run on separate nodes. I did this with a small negative colocation constraint. This works so far. But now I want to

Re: [Pacemaker] very slow pacemaker/corosync shutdown

2013-09-19 Thread Florian Crouzat
On 19/09/2013 00:25, David Lang wrote: I'm frequently running into a problem that shutting down pacemaker/corosync takes a very long time (several minutes) Just to be 100% sure, you always respect the stop order ? Pacemaker *then* CMAN/corosync ? -- Cheers, Florian Crouzat __

Re: [Pacemaker] Howto test/simulate the reaction of the cluster to node up and down

2013-09-19 Thread Andreas Mock
Hi Lars, hi Andrew, thank you for your answers. But I'm still stuck. When I do have both nodes online and the resources are spread over these nodes and I do a crm_simulate -Ls -R -d node1 I do see nicely what would happen to the cluster when the node goes down. Allocation scores and a transition s

Re: [Pacemaker] create 2-node Active/Passive firewall cluster

2013-09-19 Thread Florian Crouzat
On 18/09/2013 20:34, Jeff Weber wrote: I am looking to create a 2-node Active/Passive firewall cluster. I am an experienced Linux user, but new to HA clusters. I have scanned "Clusters From Scratch" and "Pacemaker Explained". I found these docs helpful, but a bit overwhelming, being new to

Re: [Pacemaker] Howto test/simulate the reaction of the cluster to node up and down

2013-09-19 Thread Lars Marowsky-Bree
On 2013-09-17T13:37:54, Andreas Mock wrote: > I have the problem that after a node rejoins the cluster some > resources are moved back to that node. > Now I want to see the calculated scores to see where I do > have to adjust the stickiness to get the behaviour I like. > > I'm not sure how to us

Re: [Pacemaker] monitor on disabled nodes

2013-09-19 Thread Lars Marowsky-Bree
On 2013-09-18T12:20:08, Radoslaw Garbacz wrote: > Sorry for not being specific. > > The agent is meant to run only on a specific node (the head), and by > constraints is disabled on all other nodes. > > 'pcs constraint' reports: > Location Constraints: > Resource: dbx_nfs_head > Enabled

Re: [Pacemaker] corosync service start giving segmentation fault on centos

2013-09-19 Thread Digimer
On 18/09/13 23:59, Aarti Sawant wrote: > Hello, > i am using corosync - corosync-1.4.1-15.el6_4.1.x86_64 > i am using crmsh +pacemaker+corosync > my pacemaker service starts but when i try to start corosync i get > segmentation fault error. > Thanks for replying. > > Thanks, > Aarti Sawant, > NTTD

Re: [Pacemaker] Strange "bad permissions" issue when starting pacemaker

2013-09-19 Thread Саша Александров
Hi! I had to add additional logging to sources and build a custom version to figure out what is the problem. It turned out that on the problem node root user had GID not 0 but 501. Very odd. :-) 2013/9/19 Andrew Beekhof > > On 17/09/2013, at 4:30 PM, Саша Александров wrote: > > > Andrew, > >

Re: [Pacemaker] corosync service start giving segmentation fault on centos

2013-09-19 Thread Aarti Sawant
Hello, i am using corosync - corosync-1.4.1-15.el6_4.1.x86_64 i am using crmsh +pacemaker+corosync my pacemaker service starts but when i try to start corosync i get segmentation fault error. Thanks for replying. Thanks, Aarti Sawant, NTTDATA OSS Center Pune On Thu, Sep 19, 2013 at 12:11 PM, Di