Hello again,

My Pacemaker 1.1.12 cluster is quite large: 22 nodes, ~300 resources.

When I gracefully shut down the current DC (in other words: move its resources elsewhere, put the node in standby, stop pacemaker, then stop corosync), the CIB load increases - on the slowest nodes to close to 100% - until the new DC gets elected. What explains this phenomenon? (And what could I do to limit/circumvent it?)
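For reference, the shutdown sequence I run is sketched below. This is a minimal sketch assuming the stock Pacemaker 1.1 command-line tools and sysvinit-style service scripts; "dc-node" is a placeholder for the current DC's node name:

  # Put the DC in standby so that its resources migrate elsewhere
  # ("dc-node" is a placeholder for the current DC's uname):
  crm_attribute --node dc-node --name standby --update on --lifetime forever

  # ... wait until "crm_mon -1" shows no resources left on dc-node ...

  # Then stop the cluster stack on that node:
  service pacemaker stop
  service corosync stop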
In parallel, when this happens, my "ping" (network connectivity) RA times out without obvious explanation on those nodes that log the "throttle_mode: High CIB load detected" message. The RA timeout is conservative enough, compared to the ping timeout/attempts, that it should never kick in.

Looking at the code of ".../resource.d/pacemaker/ping", I suspect - though I may be wrong - that the culprit is "attrd_updater". Hypothesis: "attrd_updater" does not return immediately, as it is supposed to, because of the high CIB load. Does this hypothesis make sense?
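To make the hypothesis concrete, the check I have in mind - should I ever get a safe window to run it - is to time a standalone attrd_updater call on an affected node while the DC election is in progress, and compare it with the network probe itself. A minimal sketch, assuming the RA's default attribute name "pingd" (the value and host are arbitrary placeholders):

  # Time the attribute update the ping RA performs; if this blocks for
  # seconds while "High CIB load detected" is being logged, it would
  # explain the RA timeout ("pingd" is the RA's default attribute name,
  # 100 an arbitrary test value):
  time attrd_updater --name pingd --update 100

  # For comparison, the network probe itself, which I rule out as the
  # cause (<some-host> is a placeholder from the RA's host_list):
  time ping -n -q -W 5 -c 3 <some-host>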
(PS: this issue shows up on my production cluster, so it is very difficult for me to reproduce/debug it without risking havoc with my services.)

Thank you very much for your response(s)

Best,

Cédric

--
Cédric Dufour @ Idiap Research Institute