Re: [Pacemaker] Stonithd segfaulting and causing unclean?

2014-03-20 Thread Andrew Beekhof
On 21 Mar 2014, at 12:40 am, Michał Margula wrote: > Hello, > > We had many unresolved issues some time ago with Pacemaker. I think almost all of them got solved by fixing link between clusters (removed media converters, replaced them with NIC with SFP+, upgraded to 10Gbps). > > Now it see

Re: [Pacemaker] 2-node cluster with shared storage: what is current solution

2014-03-20 Thread Саша Александров
Hi! Well, since I needed just one thing (only one node may start the database on shared storage), I made an ugly, dirty hack :-) that seems to work for me. I wrote a custom RA that relies on frequent 'monitor' actions and simply writes timestamp+hostname to a physical partition. In case it detects that s

[Pacemaker] Stonithd segfaulting and causing unclean?

2014-03-20 Thread Michał Margula
Hello, We had many unresolved issues some time ago with Pacemaker. I think almost all of them got solved by fixing the link between clusters (removed media converters, replaced them with NICs with SFP+, upgraded to 10Gbps). Now it seems to be working fine with a few exceptions: - if I kill one node man

Re: [Pacemaker] Trouble getting two node cluster to failover when network lost

2014-03-20 Thread Aaron Wilson
Sorry... I sent the last message early by accident. L syslog line: Mar 20 09:59:53 baymaster-67 cib: [1846]: debug: cib_process_xpath: cib_query: //cib/status//node_state[@id='baymaster-67']//transient_attributes//nvpair[@name='pingd'] does not exist On Thu, Mar 20, 2014 at 10:15 AM, Aaron Wils

Re: [Pacemaker] Trouble getting two node cluster to failover when network lost

2014-03-20 Thread Aaron Wilson
OK, I tried the ping RA, but my VIPs do not migrate when the ping connection is lost. I placed my two VIPs in a group, and I believe I must have something wrong with the scoring or location rules. Should I be using a clone for the ping RA? What is a good way to check if ping is failing or succeeding and

[Pacemaker] crmd internal error during failover

2014-03-20 Thread Drapeau, Mathieu
Hello, With pacemaker 1.1.8-7 from EL6, crmd died unexpectedly, generating these logs during a failover: crmd[10419]:error: crmd_node_update_complete: Node update 79 failed: Timer expired (-62) crmd[10419]:error: do_log: FSA: Input I_ERROR from crmd_node_update_complete() received in sta

Re: [Pacemaker] Trouble getting two node cluster to failover when network lost

2014-03-20 Thread Aaron Wilson
Thanks! On Thu, Mar 20, 2014 at 12:29 AM, Jan Friesse wrote: > Aaron Wilson wrote: > > Stefan, thanks for the reply. > > > > Having two nics is not for redundancy in my case. Resources on the primary server are being accessed from both subnets at the same time. The secondary ser

Re: [Pacemaker] 2-node cluster with shared storage: what is current solution

2014-03-20 Thread Саша Александров
Hi! I removed all cluster-related stuff and installed from http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/ However, stonith-ng uses fence_* agents here... So I cannot put into crmsh primitive stonith_sbd stonith:external/sbd :-( 2014-03-19 20:14

[Pacemaker] Enabling pacemaker debug logging while running

2014-03-20 Thread Andreas Mock
Hi all, today I faced a problem which I couldn't solve by reading several man pages and other hints found on the web. I have a RHEL 6.5 clone with a cman-based cluster and pacemaker 1.1.10+. I was able to change the value debug="on" in cluster.conf as described in the man page. I was able to propagate

Re: [Pacemaker] Trouble getting two node cluster to failover when network lost

2014-03-20 Thread Jan Friesse
Aaron Wilson wrote: > Stefan, thanks for the reply. > > Having two nics is not for redundancy in my case. Resources on the primary server are being accessed from both subnets at the same time. The secondary server is to be a failover if the server goes down or if any of the Ethernet por

Re: [Pacemaker] pcs available on debian wheezy?

2014-03-20 Thread Vladimir
On Wed, 19 Mar 2014 16:09:12 -0500 Chris Feist wrote: > On 03/19/2014 09:17 AM, Vladimir wrote: > > Hey everyone, > > > > does anybody know if there is pcs already available on debian > > wheezy? > > > > I first tried to ask on debian-ha-maintainers (subject: crmsh and > > pcs on wheezy) but mayb