Re: [Pacemaker] Pengine behavior

Виталий Давудов Wed, 11 Jul 2012 23:52:41 -0700

David, thanks for your answer!

I'll try to migrate to corosync.


11.07.2012 22:40, David Vossel wrote:


----- Original Message -----

From: "Виталий Давудов" <vitaliy.davu...@vts24.ru>
To: pacemaker@oss.clusterlabs.org
Sent: Wednesday, July 11, 2012 7:34:08 AM
Subject: [Pacemaker] Pengine behavior


Hi, list!

I have configured a cluster for a VoIP application.
Here is my configuration:

# crm configure show
node $id="552f91eb-e70a-40a5-ac43-cb16e063fdba" freeswitch1 \
attributes standby="off"

Ah... right here is your problem. You are using freeswitch instead of Asterisk :P

node $id="c86ab64d-26c4-4595-aa32-bf9d18f714e7" freeswitch2 \
attributes standby="off"
primitive FailoverIP1 ocf:heartbeat:IPaddr2 \
params iflabel="FoIP1" ip="91.211.219.142" cidr_netmask="30" nic="eth1.50" \
op monitor interval="1s"
primitive FailoverIP2 ocf:heartbeat:IPaddr2 \
params iflabel="FoIP2" ip="172.30.0.1" cidr_netmask="16" nic="eth1.554" \
op monitor interval="1s"
primitive FailoverIP3 ocf:heartbeat:IPaddr2 \
params iflabel="FoIP3" ip="10.18.1.1" cidr_netmask="24" nic="eth1.552" \
op monitor interval="1s"
primitive fs lsb:FSSofia \
op monitor interval="1s" enabled="false" timeout="2s" on-fail="standby" \
meta target-role="Started"
group HAServices FailoverIP1 FailoverIP2 FailoverIP3 \
meta target-role="Started"
order FS-after-IP inf: HAServices fs
property $id="cib-bootstrap-options" \
dc-version="1.0.12-unknown" \
cluster-infrastructure="Heartbeat" \
stonith-enabled="false" \
expected-quorum-votes="1" \
no-quorum-policy="ignore" \
last-lrm-refresh="1299964019"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"

When the first node crashed, the second node became active. During this
process I found the following lines in the ha-debug file:

...
Jul 06 17:16:42 freeswitch1 crmd: [3385]: info: start_subsystem: Starting sub-system "pengine"
Jul 06 17:16:42 freeswitch1 pengine: [3675]: info: Invoked: /usr/lib64/heartbeat/pengine
Jul 06 17:16:42 freeswitch1 pengine: [3675]: info: main: Starting pengine
Jul 06 17:16:46 freeswitch1 crmd: [3385]: info: do_dc_takeover: Taking over DC status for this partition
Jul 06 17:16:46 freeswitch1 cib: [3381]: info: cib_process_readwrite: We are now in R/W mode
Jul 06 17:16:46 freeswitch1 cib: [3381]: info: cib_process_request: Operation complete: op cib_master for section 'all' (origin=local/crmd/11, version=0.391.20): ok (rc=0)
Jul 06 17:16:46 freeswitch1 cib: [3381]: info: cib_process_request: Operation complete: op cib_modify for section cib (origin=local/crmd/12, version=0.391.20): ok (rc=0)
Jul 06 17:16:46 freeswitch1 cib: [3381]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/14, version=0.391.20): ok (rc=0)
...

After "Starting pengine", the next action occurs only 4 seconds later.
What happens during this time? Is it possible to reduce it?
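
A gap like this can be located mechanically rather than by eye. The following is a minimal sketch, not part of the original thread: it embeds a few of the log lines quoted above (joined onto one line each) and reports consecutive entries that are at least one second apart. The helper names `parse_timestamp` and `find_gaps` are hypothetical, and the year 2012 is an assumption, since the log timestamps carry no year.

```python
from datetime import datetime

# Log lines taken from the excerpt above, each joined onto a single line.
LOG_LINES = [
    'Jul 06 17:16:42 freeswitch1 crmd: [3385]: info: start_subsystem: Starting sub-system "pengine"',
    "Jul 06 17:16:42 freeswitch1 pengine: [3675]: info: Invoked: /usr/lib64/heartbeat/pengine",
    "Jul 06 17:16:42 freeswitch1 pengine: [3675]: info: main: Starting pengine",
    "Jul 06 17:16:46 freeswitch1 crmd: [3385]: info: do_dc_takeover: Taking over DC status for this partition",
    "Jul 06 17:16:46 freeswitch1 cib: [3381]: info: cib_process_readwrite: We are now in R/W mode",
]


def parse_timestamp(line, year=2012):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a log line.

    The year is an assumption (the log does not record it). Month names
    are parsed with %b, which expects an English/C locale.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, min_gap_seconds=1.0):
    """Return (seconds, line_before, line_after) for each large gap."""
    gaps = []
    previous_time = None
    previous_line = None
    for line in lines:
        current_time = parse_timestamp(line)
        if previous_time is not None:
            delta = (current_time - previous_time).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((delta, previous_line, line))
        previous_time, previous_line = current_time, line
    return gaps


if __name__ == "__main__":
    for seconds, before, after in find_gaps(LOG_LINES):
        print(f"{seconds:.0f}s gap")
        print(f"  before: {before}")
        print(f"  after:  {after}")
```

Because the log has one-second resolution, a reported gap of N seconds means the real delay was somewhere between N-1 and N+1 seconds.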

I seem to remember seeing something related to this in the code at one point.
I believe it is limited to the use of heartbeat as the messaging layer.
After starting the pengine, the crmd sleeps before contacting it, to give the
pengine time to start. The sleep is just a guess at how long it will take
before the pengine is up and ready to accept a connection, though. That's
why it is so long: the gap will hopefully be large enough that no one ever
runs into any problems with it (I am not a big fan of this type of logic at
all). I'd recommend moving to corosync and seeing if this delay goes away.

-- Vossel
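
The difference between a guessed fixed sleep and waiting for actual readiness can be illustrated with a small sketch. This is not Pacemaker or Heartbeat code; the function names `wait_fixed` and `wait_until_ready`, the polling interval, and the 4-second default (chosen only to mirror the gap seen in the log) are all assumptions made for illustration.

```python
import time


def wait_fixed(delay_seconds):
    """Fixed-delay approach: always pay the full guessed delay,
    even if the child process was ready much earlier."""
    time.sleep(delay_seconds)


def wait_until_ready(is_ready, timeout_seconds=4.0, poll_interval=0.05):
    """Polling approach: return True as soon as is_ready() reports True,
    or False once timeout_seconds have elapsed without success."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Simulate a child process that becomes ready 0.2 seconds after launch.
    started = time.monotonic()
    ready_at = started + 0.2
    ok = wait_until_ready(lambda: time.monotonic() >= ready_at)
    elapsed = time.monotonic() - started
    print(f"ready={ok}, waited about {elapsed:.2f}s instead of a fixed 4s")
```

With polling, the worst case is still bounded by the timeout, but the common case costs only as long as the child actually needs to start.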

Thanks in advance.
--
Best regards,
Vitaly
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org



--
Best regards,
Давудов Виталий Федорович
ООО "ВИП-ТЕЛЕКОМ-СЕРВИС"
("ETERIA" group of companies)
http://www.vts24.ru
Tel: (495) 989-47-00



