[Pacemaker] bug in monitor timeout?

James Harper Wed, 03 Oct 2012 15:09:17 -0700

It seems like every time I modify a resource, things start timing out. Just now 
I changed where a ping resource is allowed to run, and this happened:


Oct  4 07:07:07 bitvs5 lrmd: [3681]: WARN: perform_ra_op: the operation monitor[52] on p_lvm_iscsi:0 for client 3686 stayed in operation list for 22000 ms (longer than 10000 ms)

Another oddity is that the p_lvm_iscsi resource is defined as:

primitive p_lvm_iscsi ocf:heartbeat:LVM \
        params volgrpname="vg-drbd" \
        op start interval="0" timeout="30s" \
        op stop interval="0" timeout="30s" \
        op monitor interval="10s" timeout="30s"

so I don't know where the 10000 ms limit in that warning is coming from.

When I change something with crm configure, the cib process shoots up to 100% 
CPU and stays there for a while, and the node becomes more or less 
unresponsive, which may go some way towards explaining why things time out. Is 
this normal? It doesn't explain why lrmd complains that something took longer 
than 10s when I set the timeout to 30s, though, unless the monitor interval 
(also 10s) somehow interacts with that?
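
To see how often these warnings show up and how long the operations waited, 
the fields can be pulled out of syslog with a small script. This is only a 
minimal sketch: it assumes every warning has the same format as the lrmd 
message quoted above, taken as a single line. The names LRMD_WARN and 
parse_lrmd_warning are made up for illustration and are not part of lrmd or 
Pacemaker.

```python
import re

# Pattern for the lrmd "stayed in operation list" warning. Assumption: the
# message format is exactly that of the warning quoted in this mail.
LRMD_WARN = re.compile(
    r"operation\s+(?P<op>\w+)\[(?P<call_id>\d+)\]\s+on\s+(?P<resource>\S+)"
    r"\s+for\s+client\s+(?P<client>\d+)\s+stayed\s+in\s+operation\s+list"
    r"\s+for\s+(?P<waited_ms>\d+)\s+ms\s+\(longer\s+than\s+(?P<limit_ms>\d+)\s+ms\)"
)


def parse_lrmd_warning(line):
    """Return the fields of one warning as a dict, or None if no match."""
    m = LRMD_WARN.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the numeric fields so they can be compared or summed.
    for key in ("call_id", "client", "waited_ms", "limit_ms"):
        fields[key] = int(fields[key])
    return fields


# The warning from this mail, joined back onto one line.
sample = (
    "Oct  4 07:07:07 bitvs5 lrmd: [3681]: WARN: perform_ra_op: the operation "
    "monitor[52] on p_lvm_iscsi:0 for client 3686 stayed in operation list "
    "for 22000 ms (longer than 10000 ms)"
)

print(parse_lrmd_warning(sample))
```

Running this over the whole log and grouping by resource would show whether 
the long waits line up with the times I ran crm configure.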

Versions of software are all from Debian Wheezy:
corosync 1.4.2-3
pacemaker 1.1.7-1

thanks

James

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
