Re: [Pacemaker] heartbeat:anything resource not stop/monitoring after reboot

2013-09-05 Thread Andrew Beekhof
On 06/09/2013, at 1:23 AM, David Coulson wrote:
> We patched and rebooted one of our clusters this morning - I verified that
> pacemaker is the same as previous, plus it matches another similar cluster.
>
> There is a resource in the cluster defined as:
>
> primitive re-named-reload ocf:heart

[Pacemaker] heartbeat:anything resource not stop/monitoring after reboot

2013-09-05 Thread David Coulson
We patched and rebooted one of our clusters this morning - I verified that pacemaker is the same as previous, plus it matches another similar cluster.

There is a resource in the cluster defined as:

primitive re-named-reload ocf:heartbeat:anything \
    params binfile="/usr/sbin/rndc" cmdli

Re: [Pacemaker] Corosync quorum not updating on split node

2013-09-05 Thread Mark Round
Just a quick follow up - I had this answered on the Corosync mailing list (which I guess should have been the place for this anyway). As I was blocking all traffic with iptables, it was also blocking lo, which caused all sorts of things to break. As soon as I only blocked on eth0, things started

Re: [Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)

2013-09-05 Thread Andreas Mock
Hi Heikki,

just some comments to help you help yourself.

1) The second output of crm_mon shows a resource IP_database which is not shown in the initial crm_mon output and also not in the config. => Reduce your problem/config to the minimum that is still reproducible.
2) Enable logging and look out which node

[Pacemaker] Resource ordering/colocating question (DRBD + LVM + FS)

2013-09-05 Thread Heikki Manninen
Hello, I'm having a bit of a problem understanding what's going on with my simple two-node demo cluster here. My resources come up correctly after restarting the whole cluster but the LVM and Filesystem resources fail to start after a single node restart or standby/unstandby (after node comes b

Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-09-05 Thread Christine Caulfield
On 05/09/13 11:33, Andrew Beekhof wrote:
On 05/09/2013, at 6:37 PM, Christine Caulfield wrote:
On 03/09/13 22:03, Andrew Beekhof wrote:
On 03/09/2013, at 11:49 PM, Christine Caulfield wrote:
On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev wrote:

[Pacemaker] Corosync quorum not updating on split node

2013-09-05 Thread Mark Round
Hi all, I have a problem whereby when I create a network split/partition (by dropping traffic with iptables), the victim node for some reason does not realise it has split from the network. It seems to recognise that it can't form a cluster due to network issues, but the status is not reflecte

Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-09-05 Thread Andrew Beekhof
On 05/09/2013, at 6:37 PM, Christine Caulfield wrote:
> On 03/09/13 22:03, Andrew Beekhof wrote:
>>
>> On 03/09/2013, at 11:49 PM, Christine Caulfield wrote:
>>
>>> On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev wrote: > > >

Re: [Pacemaker] Howto recover from node state UNCLEAN (online)

2013-09-05 Thread Lars Marowsky-Bree
On 2013-09-05T12:23:23, Andreas Mock wrote:
> - resource monitoring failed on node 1
> => stop of resource on node 1 failed
> => stonith off node 1 worked
> - more or less parallel as resource is clone resource
> resource monitoring failed on node 2
> => stop of resource on node 2 failed

[Pacemaker] Howto recover from node state UNCLEAN (online)

2013-09-05 Thread Andreas Mock
Hi all,

is there a way to recover from node state UNCLEAN (online) without rebooting?

Background:
- RHEL6.4
- cman-cluster with pacemaker
- stonith enabled and working
- resource monitoring failed on node 1
  => stop of resource on node 1 failed
  => stonith off node 1 worked
- more or less pa

Re: [Pacemaker] different behavior cibadmin -Ql with cman and corosync2

2013-09-05 Thread Christine Caulfield
On 03/09/13 22:03, Andrew Beekhof wrote:
On 03/09/2013, at 11:49 PM, Christine Caulfield wrote:
On 03/09/13 05:20, Andrew Beekhof wrote:
On 02/09/2013, at 5:27 PM, Andrey Groshev wrote:
30.08.2013, 07:18, "Andrew Beekhof" :
On 29/08/2013, at 7:31 PM, Andrey Groshev wrote:
29.08.2