Re: [Pacemaker] Time out issue while stopping resource in pacemaker

2014-10-13 Thread Lax
Andrew Beekhof writes: > > One does not imply the other. Stonith is arguably even more important for 2-node clusters. Ok, will try it out. > > > > > One more thing, on another setup with same configuration while running > > pacemaker I keep getting 'gfs_controld[10744]: daemon cpg_join error

Re: [Pacemaker] Raid RA Changes to Enable ms configuration -- need some assistance plz.

2014-10-13 Thread Andrew Beekhof
On 14 Oct 2014, at 12:58 am, Errol Neal wrote: > Andrew Beekhof writes: > >>> >>> Here is my full pacemaker config: >>> >>> http://pastebin.com/jw6WTpZz >>> >>> My understanding is that in order for N to start, N+1 must already > be >>> running. So my configuration (to me) reads that the

Re: [Pacemaker] [Problem]When Pacemaker uses a new version of glib, g_source_remove fails.

2014-10-13 Thread renayama19661014
Hi Andrew, Your patch resolved the problem. Please merge the patch into master. If possible, please also check whether other code that uses g_timeout_add() and g_source_remove() has the same problem. Many Thanks! Hideo Yamauchi. - Original Message - > From: "renayam

Re: [Pacemaker] Time out issue while stopping resource in pacemaker

2014-10-13 Thread Andrew Beekhof
On 14 Oct 2014, at 5:11 am, Lax wrote: > Andrew Beekhof writes: > > >> I'm guessing you don't have stonith? >> >> The underlying philosophy is that the services pacemaker manages need to > exit before pacemaker can. >> If the service can't stop, it would be dishonest of pacemaker to do so. >

Re: [Pacemaker] Bandwidth Requirement

2014-10-13 Thread Andrew Beekhof
On 13 Oct 2014, at 11:49 pm, Sahil Aggarwal wrote: > Hello Andrew, > > Thanks for the response. > > Could you please answer one more query? > > We generally use 2 to 10 nodes in a cluster with multicast, and also use Postgres database replication under the cluster. > > c

Re: [Pacemaker] Time out issue while stopping resource in pacemaker

2014-10-13 Thread Lax
Andrew Beekhof writes: > I'm guessing you don't have stonith? > > The underlying philosophy is that the services pacemaker manages need to exit before pacemaker can. > If the service can't stop, it would be dishonest of pacemaker to do so. > > If you had fencing, it would have been able to cle

Re: [Pacemaker] communications problems in cluster

2014-10-13 Thread Саша Александров
Hi! Most likely related... I have node vm-vmwww with remote-node vmwww. Both are reported online (vmwww:vm-vmwww) and vm-vmwww is reported as 'started on wings1'. However, when I try to clean up the failed action "vmwww_start_0 on wings1 'unknown error' (1): call=100, status=Timed Out", here i

Re: [Pacemaker] Raid RA Changes to Enable ms configuration -- need some assistance plz.

2014-10-13 Thread Errol Neal
Andrew Beekhof writes: > > > > Here is my full pacemaker config: > > > > http://pastebin.com/jw6WTpZz > > > > My understanding is that in order for N to start, N+1 must already be > > running. So my configuration (to me) reads that the ms_md0 master > > resource must be started and running

[Pacemaker] communications problems in cluster

2014-10-13 Thread Саша Александров
Hi! I was building a cluster with pacemaker+pacemaker-remote (CentOS 6.5, everything from the official repo). While I had several resources, everything was fine. However, when I added more VMs (2 nodes and 10 VMs currently) I started to run into problems (see below). Strange thing is that when I

Re: [Pacemaker] Fencing of movable VirtualDomains

2014-10-13 Thread Daniel Dehennin
Andrew Beekhof writes: [...] > Is the ipaddr for each device really the same? If so, why not use a > single 'resource'? No, sorry, the IP addr was not the same. > Also, 1.1.7 wasn't as smart as 1.1.12 when it came to deciding which fencing > device to use. > > Likely you'll get the behaviou