Re: [Pacemaker] pacemaker service start failed.

2012-11-01 Thread Yuusuke Iida
Hi, Andrew (2012/10/30 13:51), Andrew Beekhof wrote: On Mon, Oct 29, 2012 at 7:10 PM, Yuusuke Iida wrote: Hi, Andrew (2012/10/26 9:31), Andrew Beekhof wrote: When I added the IP address that I use for ring0 to /etc/hosts, I confirmed that pacemaker started successfully. [moved first questio

Re: [Pacemaker] Build dlm_controld for pacemaker stack (dlm_controld.pcmk)

2012-11-01 Thread Andrew Beekhof
On Thu, Nov 1, 2012 at 5:09 PM, Vladislav Bogdanov wrote: > 01.11.2012 02:47, Andrew Beekhof wrote: > ... >>> >>> One remark about that - it requires that gfs2 communicates with dlm in >>> the kernel space - so gfs_controld is no longer required. I think >>> Fedora 17 is the first version with th

Re: [Pacemaker] MySQL/PostgreSQL HA cluster with Pacemaker

2012-11-01 Thread Andrew Beekhof
On Fri, Nov 2, 2012 at 2:34 AM, Andrew wrote: > On 01/11/12 03:02, Andrew Beekhof wrote: >> >> On Thu, Nov 1, 2012 at 5:27 AM, Andrew wrote: >>> >>> Hi all. >>> I am trying to build a 1+1 MySQL HA cluster, and I'm currently looking at Percona >>> replication manager; the other option is to write my own OCF for

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-01 Thread Angus Salkeld
On 01/11/12 17:27 -0500, Andrew Martin wrote: Hi Angus, I'll try upgrading to the latest libqb tomorrow and see if I can reproduce this behavior with it. I was able to get a coredump by running corosync manually in the foreground (corosync -f): http://sources.xes-inc.com/downloads/corosync.co

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-01 Thread Andrew Martin
Hi Angus, I'll try upgrading to the latest libqb tomorrow and see if I can reproduce this behavior with it. I was able to get a coredump by running corosync manually in the foreground (corosync -f): http://sources.xes-inc.com/downloads/corosync.coredump There still isn't anything added to /va

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-01 Thread Angus Salkeld
On 01/11/12 14:32 -0500, Andrew Martin wrote: Hi Honza, Thanks for the help. I enabled core dumps in /etc/security/limits.conf but didn't have a chance to reboot and apply the changes so I don't have a core dump this time. Do core dumps need to be enabled for the fdata-DATETIME-PID file to b

Re: [Pacemaker] MySQL/PostgreSQL HA cluster with Pacemaker

2012-11-01 Thread Denny Schierz
hi, one example config from my testcases: = primitive drbd-mysql ocf:linbit:drbd \ params drbd_resource="mysql" \ operations $id="drbd-mysql-operations" \ op monitor start-delay="0" interval="31" \ meta is-managed="true" primitive drbd-postgres ocf:

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-01 Thread Andrew Martin
Hi Honza, Thanks for the help. I enabled core dumps in /etc/security/limits.conf but didn't have a chance to reboot and apply the changes so I don't have a core dump this time. Do core dumps need to be enabled for the fdata-DATETIME-PID file to be generated? Right now all that is in /var/lib/c

Re: [Pacemaker] MySQL/PostgreSQL HA cluster with Pacemaker

2012-11-01 Thread Andrew
On 01/11/12 03:02, Andrew Beekhof wrote: On Thu, Nov 1, 2012 at 5:27 AM, Andrew wrote: Hi all. I am trying to build a 1+1 MySQL HA cluster, and I'm currently looking at Percona replication manager; the other option is to write my own OCF for semi-synchronous master-slave replication. Also I want to run here P

Re: [Pacemaker] Pacemaker moving a sticky multi-state resource when another node is brought online

2012-11-01 Thread Tupja, Ravik
We're running version 1.1.5. So I guess this would be fixed in 1.1.8 then. -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Wednesday, October 31, 2012 9:04 PM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Pacemaker moving a sticky multi-sta

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-01 Thread Jan Friesse
Andrew, I was not able to find anything interesting (from corosync's point of view) in the configuration/logs (corosync related). What would be helpful: - if corosync died, there should be /var/lib/corosync/fdata-DATETIME-PID of dead corosync. Can you please xz them and store somewhere (they are quite

Re: [Pacemaker] [corosync] Corosync 2.1.0 dies on both nodes in cluster

2012-11-01 Thread Andrew Martin
Corosync died an additional 3 times during the night on storage1. I wrote a daemon to attempt to start it as soon as it fails, so only one of those times resulted in a STONITH of storage1. I enabled debug in the corosync config, so I was able to capture a period when corosync died with debug o

Re: [Pacemaker] stonithd crash on exit

2012-11-01 Thread Jacek Konieczny
On Thu, Nov 01, 2012 at 11:05:04AM +1100, Andrew Beekhof wrote: > On Thu, Nov 1, 2012 at 7:40 AM, Jacek Konieczny wrote: > > On Wed, Oct 31, 2012 at 05:33:03PM +1100, Andrew Beekhof wrote: > >> I haven't seen that before. What version? > > > > Pacemaker 1.1.8, corosync 2.1.0, cluster-glue 1.0.11 >