[Pacemaker] pacemaker dies without logs

2013-09-22 Thread Alessandro Bono
Hi, I have a problem with a cluster where pacemaker dies without logs or anything. The problem started when I switched to CentOS 6.4 and converted the cluster from corosync to cman. This typically happens when the system is under high load. Tonight I received a notification of a DRBD split brain and found on primar

Re: [Pacemaker] pacemaker dies without logs

2013-09-22 Thread Alessandro Bono
Found the logs in the corosync(!?) log directory. These are for primary node ga1-ext: Sep 22 00:45:29 corosync [TOTEM ] A processor failed, forming new configuration. Sep 22 00:45:31 corosync [CMAN ] quorum lost, blocking activity Sep 22 00:45:31 corosync [QUORUM] This node is within the non-primary compo

[Pacemaker] Resources not configured in CIB

2013-09-22 Thread FDS | Forensik Data Services
Hi there, while analyzing a CIB configuration to gain more experience for my own projects, I did not find a resource definition for the OCFS2 file system in use. I know from the cluster specs that the cluster was using a DRBD block device (primary/primary) via iSCSI and that the file system is OCFS2. The

[Pacemaker] pacemaker dies with full log (was: Re: pacemaker dies without logs)

2013-09-22 Thread Alessandro Bono
OK, Sunday morning is not a good time to collect information. The logs are in the /var/log/cluster directory on both nodes. These are for secondary node ga2-ext: Sep 22 00:45:39 corosync [TOTEM ] Process pause detected for 20442 ms, flushing membership messages. Sep 22 00:45:45 corosync [CMAN ] quorum lost, bloc

Re: [Pacemaker] Pacemaker basic installation in CentOS 6.4

2013-09-22 Thread Andrew Beekhof
On 21/09/2013, at 3:34 AM, Gopalakrishnan N wrote: > Thought of sharing my experience with basic installation > http://gopalstech.blogspot.com/2013/09/pacemaker-basic-setup-with-cent-os-64.html > > But long way to go Since you're copying the structure and whole sentences from it, it woul

Re: [Pacemaker] exit code crm_attibute

2013-09-22 Thread Andrew Beekhof
On 20/09/2013, at 5:53 PM, Andrey Groshev wrote: > Hi again! > > Today again met a strange behavior. > I asked for a non-existent attribute of an existing node. > > # crm_attribute --type nodes --node-uname exist.node.domain.com --attr-name > notexistattibute --query ; echo $? > Could not ma

Re: [Pacemaker] Monitoring - pacemaker

2013-09-22 Thread Andrew Beekhof
On 20/09/2013, at 4:52 AM, Denise Cosso wrote: > Hi, > > > I have a two-machine email cluster using pacemaker / corosync as > Active / Passive. > > Already configured: filesystem (ocf:heartbeat:Filesystem), SFEX, ping > (ocf:pacemaker:ping) and start / stop imap and saslaut

Re: [Pacemaker] pacemaker dies with full log (was: Re: pacemaker dies without logs)

2013-09-22 Thread Andrew Beekhof
I see: > Sep 22 00:45:48 [4412] ga1-ext pacemakerd:error: pcmk_cpg_dispatch: > Connection to the CPG API failed: Library error (2) > Sep 22 00:45:48 [4419] ga1-ext stonith-ng:error: pcmk_cpg_dispatch: > Connection to the CPG API failed: Library error (2) > Sep 22 00:45:48

Re: [Pacemaker] Monitoring on master node not running after standby is connected

2013-09-22 Thread Andrew Beekhof
On 20/09/2013, at 1:39 AM, Juraj Fabo wrote: > diff -urp pacemaker-Pacemaker-1.1.10.z0/crmd/lrm.c > pacemaker-Pacemaker-1.1.10/crmd/lrm.c > --- pacemaker-Pacemaker-1.1.10.z0/crmd/lrm.c2013-07-26 > 00:02:31.0 + > +++ pacemaker-Pacemaker-1.1.10/crmd/lrm.c 2013-08-27 > 10:10:5

Re: [Pacemaker] Monitoring on master node not running after standby is connected

2013-09-22 Thread Andrew Beekhof
On 20/09/2013, at 5:45 AM, Juraj Fabo wrote: > Juraj Fabo writes: > >> >> Dear all >> >> Attached is my 2-nodes, master slave cluster configuration with master-slave >> postgresql resource and some IP resources. >> I've modified pgsql resource agent to log its "main" entry with the >> parame