Re: [Pacemaker] Problem on creating CIB entry in CRM - shadow cannot be created

2013-03-26 Thread Andrew Beekhof
On Wed, Mar 27, 2013 at 8:00 AM, Donna Livingstone wrote: > We are attempting to move our RHEL 6.3 pacemaker/DRBD environment to a RHEL 6.4 pacemaker environment, and as you can see below we cannot create a shadow CIB. crm_shadow -w also core dumps. On 6.3 everything works. > Versions

Re: [Pacemaker] Linking lib/cib and lib/pengine to each other?

2013-03-26 Thread Andrew Beekhof
Give https://github.com/beekhof/pacemaker/commit/53c9122 a try On Wed, Mar 27, 2013 at 7:43 AM, Viacheslav Dubrovskyi wrote: > 26.03.2013 19:41, Andrew Beekhof writes: Hi. I'm building a package for my distribution. Everything is built, but the package does not pass our interna

[Pacemaker] Problem on creating CIB entry in CRM - shadow cannot be created

2013-03-26 Thread Donna Livingstone
We are attempting to move our RHEL 6.3 pacemaker/DRBD environment to a RHEL 6.4 pacemaker environment, and as you can see below we cannot create a shadow CIB. crm_shadow -w also core dumps. On 6.3 everything works. Versions are given below. [root@vccstest1 ~]# crm crm(live)# cib new ills INFO
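
For reference, a minimal sketch of the shadow CIB workflow being attempted (the shadow name "ills" is taken from the post; the crm_shadow calls are the lower-level equivalents of the crm shell commands):

    # inside the crm shell, as in the post
    crm(live)# cib new ills        # create a shadow copy of the live CIB
    crm(ills)# configure edit      # edit the shadow without touching the live cluster
    crm(ills)# cib commit ills     # push the shadow back to the live CIB

    # low-level equivalents
    crm_shadow --create ills       # create and switch to the shadow
    crm_shadow --which             # -w: print the active shadow name (the call reported to core dump)
    crm_shadow --commit ills       # apply the shadow to the live CIB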

Re: [Pacemaker] Linking lib/cib and lib/pengine to each other?

2013-03-26 Thread Viacheslav Dubrovskyi
26.03.2013 19:41, Andrew Beekhof writes: >>> Hi. I'm building a package for my distribution. Everything is built, but the package does not pass our internal tests. I get errors like this: verify-elf: ERROR: ./usr/lib/libpe_status.so.4.1.0: undefined symbol: get_object_root > Was
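
The verify-elf failure can be checked by hand; a sketch, assuming the library path from the error message:

    # list symbols the library references but does not define
    nm -D --undefined-only ./usr/lib/libpe_status.so.4.1.0 | grep get_object_root

    # or let the dynamic linker attempt all relocations and report anything missing
    ldd -r ./usr/lib/libpe_status.so.4.1.0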

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-26 Thread Vladislav Bogdanov
Dennis Jacobfeuerborn wrote: > On 26.03.2013 06:14, Vladislav Bogdanov wrote: >> 26.03.2013 04:23, Dennis Jacobfeuerborn wrote: >>> I have now reduced the configuration further and removed LVM from the picture. Still the cluster fails when I set the master node to standby. What's interes

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-26 Thread Andrew Beekhof
On Tue, Mar 26, 2013 at 6:30 PM, Angel L. Mateo wrote: > On 25/03/13 20:50, Jacek Konieczny wrote: >> On Mon, 25 Mar 2013 20:01:28 +0100 "Angel L. Mateo" wrote: quorum { provider: corosync_votequorum expected_votes: 2 two_node: 1 }

Re: [Pacemaker] Linking lib/cib and lib/pengine to each other?

2013-03-26 Thread Andrew Beekhof
On Mon, Mar 25, 2013 at 10:55 PM, Viacheslav Dubrovskyi wrote: > 23.03.2013 08:27, Viacheslav Dubrovskyi writes: >> Hi. I'm building a package for my distribution. Everything is built, but the package does not pass our internal tests. I get errors like this: verify-elf: ERROR: ./usr/lib

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-26 Thread Dennis Jacobfeuerborn
On 26.03.2013 06:14, Vladislav Bogdanov wrote: 26.03.2013 04:23, Dennis Jacobfeuerborn wrote: I have now reduced the configuration further and removed LVM from the picture. Still the cluster fails when I set the master node to standby. What's interesting is that things get fixed when I issue a s

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello Lars, what timeout do you recommend? Thanks a lot 2013/3/26 Lars Marowsky-Bree > On 2013-03-26T17:13:34, emmanuel segura wrote: > Hello Lars > Because we have a VM (SUSE 11) cluster on an ESX cluster, as datastore we are using a NetApp in cluster; last night we had a net

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread Lars Marowsky-Bree
On 2013-03-26T17:13:34, emmanuel segura wrote: > Hello Lars > Because we have a VM (SUSE 11) cluster on an ESX cluster, as datastore we are using a NetApp in cluster; last night we had a NetApp failover, no problem with other VM servers, but all VMs in the cluster with pacemaker+sbd get has

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello Lars, why do you think the long timeout is wrong? Do I need to change the stonith-timeout on Pacemaker? Thanks 2013/3/26 Lars Marowsky-Bree > On 2013-03-26T16:48:30, emmanuel segura wrote: > Hello Lars > So the procedure should be: > crm resource stop stonith_sbd > sbd
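
The relationship behind the stonith-timeout question: Pacemaker has to wait at least as long as sbd's msgwait before it may consider a fencing attempt complete, so stonith-timeout should exceed msgwait with some margin. A sketch using the 180s msgwait from this thread (the roughly 20% margin is a common rule of thumb, not from the thread):

    # cluster-wide property, set via the crm shell
    crm configure property stonith-timeout=220s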

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello Lars, because we have a VM (SUSE 11) cluster on an ESX cluster, as datastore we are using a NetApp in cluster; last night we had a NetApp failover, no problem with other VM servers, but all VMs in the cluster with pacemaker+sbd got rebooted. This is because the watchdog timeout is 5 seconds. Thank

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread Lars Marowsky-Bree
On 2013-03-26T16:48:30, emmanuel segura wrote: > Hello Lars > So the procedure should be: > crm resource stop stonith_sbd > sbd -d /dev/sda1 message exit = (on every node) > sbd -d /dev/sda1 -1 90 -4 180 create > crm resource start stonith_sbd Yes. But I wonder why you need such a long tim

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello Lars So the procedure should be: crm resource stop stonith_sbd sbd -d /dev/sda1 message exit = (on every node) sbd -d /dev/sda1 -1 90 -4 180 create crm resource start stonith_sbd Thanks 2013/3/26 Lars Marowsky-Bree > On 2013-03-26T15:56:48, emmanuel segura wrote: > Hello List
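
The same procedure, annotated (commands as posted; /dev/sda1 and the 90/180 values come from the thread, and the node argument to "message" is elided in the original, shown here as a placeholder):

    # 1. stop the fencing resource so nothing tries to fence mid-change
    crm resource stop stonith_sbd

    # 2. on every node, tell the local sbd daemon to exit
    sbd -d /dev/sda1 message <node> exit

    # 3. re-initialize the device: watchdog timeout 90s (-1), msgwait 180s (-4)
    sbd -d /dev/sda1 -1 90 -4 180 create

    # 4. re-enable fencing once sbd is running again on all nodes
    crm resource start stonith_sbd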

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread Lars Marowsky-Bree
On 2013-03-26T15:56:48, emmanuel segura wrote: > Hello List > How can I change the sbd watchdog timeout without stopping the cluster? Very, very carefully. Stop the external/sbd resource, so that fencing blocks while you're doing this. You can then manually stop the sbd daemon on all nodes

[Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello List, how can I change the sbd watchdog timeout without stopping the cluster? Thanks -- this is my life and I live it as long as God wills

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Steven Bambling
Excellent, thanks so much for the clarification. I'll drop this new RA in and see if I can get things working. STEVE On Mar 26, 2013, at 7:38 AM, Rainer Brestan wrote: Hi Steve, pgsql RA does the same, it compares the last_xlog_replay_location of all nodes f

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Rainer Brestan
Hi Steve, pgsql RA does the same: it compares the last_xlog_replay_location of all nodes for master promotion. Doing a promote as a restart instead of a promote command, to conserve the timeline ID, is also a configurable option (restart_on_promote) of the current RA. And the RA is definitely capabl
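
What "compares the last_xlog_replay_location" looks like in practice, for the PostgreSQL 9.x releases this RA targets (a sketch; the psql invocation and connection details are assumptions):

    # on each standby: how far WAL replay has progressed
    psql -Atc "SELECT pg_last_xlog_replay_location();"

    # on the current master: the current WAL write position, for comparison
    psql -Atc "SELECT pg_current_xlog_location();"

The node whose replay location is furthest ahead has lost the least data and is the preferred promotion candidate.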

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Steven Bambling
On Mar 26, 2013, at 6:32 AM, Rainer Brestan wrote: Hi Steve, when Pacemaker does promotion, it has already selected a specific node to become master. It is far too late in this state to try to update master scores. But there is another problem with xlog in Post

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Steven Bambling
I'm guessing that you are referring to this RA https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/pgsql with additions by T. Matsuo. From reading the "wiki" (hopefully I have misinterpreted this :) ) on his GitHub page it looks like this RA was written to work in an Active/P

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Rainer Brestan
Hi Steve, when Pacemaker does promotion, it has already selected a specific node to become master. It is far too late in this state to try to update master scores. But there is another problem with xlog in PostgreSQL. According to some discussion on PostgreSQL mailing lists, not releva

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-26 Thread emmanuel segura
Hello Dennis, this constraint is wrong: colocation c_web1_on_drbd inf: ms_drbd_web1:Master p_fs_web1. It should be: colocation c_web1_on_drbd inf: p_fs_web1 ms_drbd_web1:Master Thanks 2013/3/26 Dennis Jacobfeuerborn > I have now reduced the configuration further and removed LVM from the pictur
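
Why the argument order matters: in crmsh colocation syntax the dependent resource is listed first and is placed relative to what follows it. A sketch of the corrected constraint together with the order constraint that usually accompanies it (the order rule is an assumption about the rest of the config, not from the thread):

    # mount the filesystem only where the DRBD master runs...
    colocation c_web1_on_drbd inf: p_fs_web1 ms_drbd_web1:Master
    # ...and only after DRBD has actually been promoted there
    order o_drbd_before_fs inf: ms_drbd_web1:promote p_fs_web1:start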

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-26 Thread Angel L. Mateo
On 25/03/13 20:50, Jacek Konieczny wrote: On Mon, 25 Mar 2013 20:01:28 +0100 "Angel L. Mateo" wrote: quorum { provider: corosync_votequorum expected_votes: 2 two_node: 1 } Corosync will then manage quorum for the two-node cluster and Pacemaker I'm using corosync
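
The quorum settings quoted above, restored to the shape they would have in /etc/corosync/corosync.conf (corosync 2.x votequorum):

    quorum {
        provider: corosync_votequorum
        expected_votes: 2
        two_node: 1
    }

Note that two_node: 1 implicitly enables wait_for_all, so the cluster only gains quorum once both nodes have been seen at least once.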