Re: [Pacemaker] SmartOS / illumos

2013-05-10 Thread Dalho PARK
Hello, For the previous error, I changed ipc.c to cast both void and struct, and that got rid of it. But I'm still getting an error saying a file is missing. I don't have signalfd.h under /usr/include/sys/, and it seems signalfd.h is Linux-only. Any help? gmake[2]: Entering direct

[Pacemaker] detecting resource failures after maintenance

2013-05-10 Thread Jeffrey Lewis
It seems Pacemaker is not properly detecting resource failures after maintenance. Example follows. Pacemaker is managing two IPaddr2 resources. Both resources are online, and all is well. jlewis@qa3db22:~$ sudo crm resource show shard0_ip (ocf::heartbeat:IPaddr2) Started shard1_ip (ocf::heart

Re: [Pacemaker] SmartOS / illumos

2013-05-10 Thread Dalho PARK
Hello Andrew, Thank you for the advice. I applied the patches you mentioned, and when I ran make some errors disappeared, but I still get the following error. Do you have any clue? Also, will the patches be applied to the current or next version? Regards, Dalho gmake[2]: Entering directory `/

Re: [Pacemaker] resource starts but then fails right away

2013-05-10 Thread Brian J. Murrell
On 13-05-09 09:53 PM, Andrew Beekhof wrote: > > May 7 02:36:16 node1 crmd[16836]: info: delete_resource: Removing > resource testfs-resource1 for 18002_crm_resource (internal) on node1 > May 7 02:36:16 node1 lrmd: [16833]: info: flush_op: process for operation > monitor[8] on ocf::Target::

Re: [Pacemaker] ClusterMon Resource starting multiple instances of crm_mon

2013-05-10 Thread Steven Bambling
On May 10, 2013, at 5:35 AM, Steven Bambling wrote: > > On May 9, 2013, at 8:05 PM, Andrew Beekhof wrote: > >> >> On 10/05/2013, at 12:40 AM, Steven Bambling wrote: >> >>> I'm having some issues with getting some cluster monitoring setup and >>> configured on a 3 node multi-state cluster

Re: [Pacemaker] ClusterMon Resource starting multiple instances of crm_mon

2013-05-10 Thread Steven Bambling
On May 9, 2013, at 8:05 PM, Andrew Beekhof wrote: > > On 10/05/2013, at 12:40 AM, Steven Bambling wrote: > >> I'm having some issues with getting some cluster monitoring setup and >> configured on a 3 node multi-state cluster. I'm using Florian's blog as an >> example >> http://florianc

Re: [Pacemaker] crmd restart due to internal error - pacemaker 1.1.8

2013-05-10 Thread pavan tc
> > > > Well, and also Pacemaker's crmd process. > > My guess... the node is overloaded which is causing the cib queries to > time out. > > > > > > Is there a cib query timeout value that I can set? > > No. You can set the batch-limit property though, this reduces the rate at > which CIB operation