Re: [Pacemaker] fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) not working with pacemaker (pacemaker-1.1.8-7.el6.x86_64) on RHEL6.4

2013-05-22 Thread Andrew Beekhof
On 22/05/2013, at 9:00 PM, John McCabe wrote: > No joy with ipport sadly > > [...] value="10"/> > > Can you share the changes you made to fence_rhevm for the API change? I've > got what *should* be the latest packages from the HA channel on both systems. > > > On Wed, May 22, 2013 at 11:34

Re: [Pacemaker] stonith-ng: error: remote_op_done: Operation reboot of node2 by node1 for stonith_admin: Timer expired

2013-05-22 Thread Andrew Beekhof
On 17/05/2013, at 12:23 AM, Brian J. Murrell wrote: > Using Pacemaker 1.1.8 on EL6.4 with the pacemaker plugin, I'm finding > strange behavior with "stonith_admin -B node2". It seems to shut the > node down but not start it back up and ends up reporting a timer > expired: > > # stonith_admin -

Re: [Pacemaker] pacemaker-1.1.10 results in Failed to sign on to the LRM 7

2013-05-22 Thread Andrew Beekhof
On 17/05/2013, at 1:15 PM, Andrew Widdersheim wrote: > I'm attaching 3 patches I made fairly quickly to fix the installation issues > and also an issue I noticed with the ping ocf from the latest pacemaker. > > One is for cluster-glue to prevent lrmd from building and later installing. > May

[Pacemaker] Release candidate: 1.1.10-rc3

2013-05-22 Thread Andrew Beekhof
Announcing the third release candidate for Pacemaker 1.1.10. This RC is a result of work in several problem areas reported by users, some of which date back to 1.1.8: * manual fencing confirmations * potential problems reported by Coverity * the way anonymous clones are displayed * handling of r

Re: [Pacemaker] error: do_exit: Could not recover from internal error

2013-05-22 Thread Andrew Beekhof
On 22/05/2013, at 9:44 PM, Brian J. Murrell wrote: > Using pacemaker 1.1.8-7 on EL6, I got the following series of events > trying to shut down pacemaker and then corosync. The corosync shutdown > (service corosync stop) ended up spinning/hanging indefinitely (~7hrs > now). The events, includi

Re: [Pacemaker] trouble with quorum

2013-05-22 Thread Andrew Beekhof
On 22/05/2013, at 10:25 PM, Groshev Andrey wrote: > Hello, > > I am trying to build a cluster with 2 nodes + one quorum node (without pacemaker). This is the root of your problem. Your config has: > service { > name: pacemaker > ver: 1 > } So even though you thought you only started co

Re: [Pacemaker] trouble with rebuilding pacemaker rpm package for CentOS

2013-05-22 Thread Andrew Beekhof
On 23/05/2013, at 1:04 AM, Халезов Иван wrote: > Hello everyone! > > I decided to update my pacemaker installation to the latest version in > CentOS 6.4 repository. > > For some reason we need to use corosync 2.3 in our system. So I had to > rebuild pacemaker with corosync 2.3 support. I t

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Mike Edwards
My apologies for being unclear on that - I'm using the corosync 1.4.1 rpm provided by CentOS/RHEL 6.4. I'll try using the member objects within the interface block to see if that makes my setup behave any better. Thanks! On Wed, May 22, 2013 at 05:12:43PM +0200, Jan Friesse babbled thus: > Actual

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Mike Edwards
It doesn't appear that either multicast or udpu works for me - I'm just using straight ip-to-ip udp. On Wed, May 22, 2013 at 05:10:05PM +0200, Jan Friesse babbled thus: > As long as UDP (multicast) works for you, it's the better solution (better > tested, faster, ...). UDPU is targeted for deployment

[Pacemaker] can someone post their 2-node mysql/drbd cluster config?

2013-05-22 Thread christopher barry
Hi all, I'm trying to put together a 2 node mysql cluster using drbd as the db backing store. I have 4 interfaces in each node: * 2 nics bonded and x-over cabled between each node for drbd data sync and heartbeats * 1 'public' nic, and * 1 'private' nic The public and private nics each have a non

Re: [Pacemaker] fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) not working with pacemaker (pacemaker-1.1.8-7.el6.x86_64) on RHEL6.4

2013-05-22 Thread John McCabe
FYI - I've opened a ticket on the RH bugzilla (https://bugzilla.redhat.com/show_bug.cgi?id=966150) against the fence_agents component. On Wed, May 22, 2013 at 12:00 PM, John McCabe wrote: > No joy with ipport sadly > > [...] value="443"/> > [...] name="shell_timeout" value="10"/> > > Can you share the

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Jan Friesse
Actually, I've reviewed that config file again and it looks like you are using corosync 1.x. There, nodelist is not supported; what is supported is the member object inside the interface object (see corosync.conf.example.udpu). For corosync 2.x, the member object inside the interface object also works, but it's inte
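
For reference, a minimal sketch of what the member-object syntax mentioned above might look like in a corosync 1.x corosync.conf, loosely following the layout of corosync.conf.example.udpu. The bindnetaddr value is the one reported elsewhere in this thread; the second member address, the ring number, and the port are illustrative assumptions, not values taken from the poster's config.

```
totem {
    version: 2
    # Unicast UDP instead of multicast
    transport: udpu

    interface {
        ringnumber: 0                  # assumed: single ring
        bindnetaddr: 10.10.23.50       # address reported in this thread
        mcastport: 5405                # assumed: common default port

        # One member block per cluster node, including the local node
        member {
            memberaddr: 10.10.23.50
        }
        member {
            memberaddr: 10.10.23.51    # hypothetical second node
        }
    }
}
```

With corosync 2.x the same node list would more usually be expressed through a nodelist section, so this fragment should only be read as an illustration of the 1.x form.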

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Jan Friesse
Mike Edwards wrote: > Which would be the recommended transport? I'm not tied to any > particular method. > As long as UDP (multicast) works for you, it's the better solution (better tested, faster, ...). UDPU is targeted for deployments where multicast is a problem. Regards, Honza > > On Wed

[Pacemaker] trouble with rebuilding pacemaker rpm package for CentOS

2013-05-22 Thread Халезов Иван
Hello everyone! I decided to update my pacemaker installation to the latest version in CentOS 6.4 repository. For some reason we need to use corosync 2.3 in our system. So I had to rebuild pacemaker with corosync 2.3 support. I took pacemaker-1.1.8-7.el6.src.rpm package from CentOS vault r

Re: [Pacemaker] Post script question

2013-05-22 Thread Florian Crouzat
On 22/05/2013 15:02, Daniel Gullin wrote: Hi, is there any possibility to use a “post-script” when a failover has happened? I have a corosync/pacemaker installation with two services, filesystem and IP and two nodes. The system should be in passive/active mode. When a failover has happe

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Mike Edwards
Yep. The config I pasted has the bindnetaddr set to 10.10.23.50, which also happens to be defined as node 1. On Wed, May 22, 2013 at 09:28:13AM +0200, Jan Friesse babbled thus: > Mike, > did you enter the local node in the nodelist? Because this may explain > the behavior you were describing. > > Honza

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Mike Edwards
Which would be the recommended transport? I'm not tied to any particular method. On Wed, May 22, 2013 at 10:01:37AM +1000, Andrew Beekhof babbled thus: > I think nodelist only works for corosync 2.x > So if you want to use udpu you might need to look up the corosync 1.x syntax. --

[Pacemaker] Post script question

2013-05-22 Thread Daniel Gullin
Hi, is there any possibility to use a "post-script" when a failover has happened? I have a corosync/pacemaker installation with two services, filesystem and IP and two nodes. The system should be in passive/active mode. When a failover has happened the passive node should mount the shared disk

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Mike Edwards
Emmanuel, this bug appears to refer to functionality in cman. We're using pcs to manage corosync/pacemaker. Thanks. On Tue, May 21, 2013 at 10:55:19PM +0200, emmanuel segura babbled thus: > https://bugzilla.redhat.com/show_bug.cgi?id=657041 --

[Pacemaker] trouble with quorum

2013-05-22 Thread Groshev Andrey
Hello, I am trying to build a cluster with 2 nodes + one quorum node (without pacemaker). The sequence of actions is as follows: 1. setup/start corosync on THREE nodes - all right. # corosync-quorumtool -l|sed 's/\..*$//' Nodeid    Votes  Name 295521290    1  dev-cluster2-node2 312298506    1  dev-clust

[Pacemaker] error: do_exit: Could not recover from internal error

2013-05-22 Thread Brian J. Murrell
Using pacemaker 1.1.8-7 on EL6, I got the following series of events trying to shut down pacemaker and then corosync. The corosync shutdown (service corosync stop) ended up spinning/hanging indefinitely (~7hrs now). The events, including a: May 21 23:47:18 node1 crmd[17598]: error: do_exit: C

Re: [Pacemaker] fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) not working with pacemaker (pacemaker-1.1.8-7.el6.x86_64) on RHEL6.4

2013-05-22 Thread John McCabe
No joy with ipport sadly Can you share the changes you made to fence_rhevm for the API change? I've got what *should* be the latest packages from the HA channel on both systems. On Wed, May 22, 2013 at 11:34 AM, Andrew Beekhof wrote: > > On 22/05/2013, at 7:31 PM, John McCabe wrote: > > >

Re: [Pacemaker] fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) not working with pacemaker (pacemaker-1.1.8-7.el6.x86_64) on RHEL6.4

2013-05-22 Thread Andrew Beekhof
On 22/05/2013, at 7:31 PM, John McCabe wrote: > Hi, > I've been trying to get fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) > working within pacemaker (pacemaker-1.1.8-7.el6.x86_64) but am unable to get > it to work as intended, using fence_rhevm on the command line works as > expected,

[Pacemaker] fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) not working with pacemaker (pacemaker-1.1.8-7.el6.x86_64) on RHEL6.4

2013-05-22 Thread John McCabe
Hi, I've been trying to get fence_rhevm (fence-agents-3.1.5-25.el6_4.2.x86_64) working within pacemaker (pacemaker-1.1.8-7.el6.x86_64) but am unable to get it to work as intended, using fence_rhevm on the command line works as expected, as does stonith_admin but from within pacemaker (triggered by

[Pacemaker] Patrik Rapposch is out of the office

2013-05-22 Thread Patrik . Rapposch
I will be out of the office starting 22.05.2013. I will return on 25.05.2013. Dear Sir or Madam, I am on a business trip until 24.05 inclusive. I will nevertheless try to answer your request as quickly as possible. For network-related matters, please always contact ksi.ne

Re: [Pacemaker] Does "stonith_admin --confirm" work?

2013-05-22 Thread Raoul Bhatia [IPAX]
Hello Andrew! On 2013-05-20 06:43, Andrew Beekhof wrote: [...] Well, that's not nothing, but it certainly doesn't look right either. I will investigate. Which version is this? I'm running Debian GNU/Linux 6.0 Squeeze 64bit latest patch level with the current backports packages: pacemaker 1.1.

Re: [Pacemaker] Pacemaker 1.1.8 and corosync's cpg service?

2013-05-22 Thread Jan Friesse
Mike, did you enter the local node in the nodelist? Because this may explain the behavior you were describing. Honza Mike Edwards wrote: > On Tue, May 21, 2013 at 11:15:56AM +1000, Andrew Beekhof babbled thus: >> cpg_join() is returning CS_ERR_TRY_AGAIN here. >> >> Jan: Any idea why this might happen?