Hi,

We tested the behavior of the cluster when no-quorum-policy is set to freeze.
In a cluster where the freeze setting had taken effect, we stopped the service.
However, stopping the service took a very long time.
To shorten the test, we set "shutdown-escalation" to five minutes.
Even so, stopping the service on one node takes more than five minutes.

I confirmed this with the following procedure.

Step1) Start four nodes and send cib.xml.

Step2) Block Heartbeat communication so that the cluster is split into two groups of two nodes.

Step3) The nodes freeze.

Step4) On the two nodes of one partition, we stop Heartbeat at the same time.

[r...@srv03 ~]# service heartbeat stop
Stopping High-Availability services:

[r...@srv04 ~]# service heartbeat stop
Stopping High-Availability services:

Step5) Heartbeat on one node stops within a few minutes.

[r...@srv04 ~]# service heartbeat stop
Stopping High-Availability services:
[ OK ]

Step6) However, Heartbeat on the other node does not stop until considerably more time has passed.
 * The shutdown-escalation timer starts, but the value we set (5min) does not seem to take effect.

[r...@srv03 ~]# service heartbeat stop
Stopping High-Availability services:
[ OK ]

Oct 21 16:46:57 srv03 crmd: [4432]: info: do_shutdown_req: Sending shutdown request to DC: srv03
Oct 21 16:46:57 srv03 crmd: [4432]: info: handle_shutdown_request: Creating shutdown request for srv03 (state=S_IDLE)
Oct 21 16:53:07 srv03 cib: [4428]: info: cib_stats: Processed 805 operations (38149.00us average, 5% utilization) in the last 10min
Oct 21 16:57:20 srv03 crmd: [4432]: ERROR: crm_timer_popped: Shutdown Escalation (I_STOP) just popped!
Oct 21 16:57:20 srv03 crmd: [4432]: ERROR: do_log: FSA: Input I_STOP from crm_timer_popped() received in state S_IDLE
Oct 21 16:57:20 srv03 crmd: [4432]: info: do_state_transition: State transition S_IDLE -> S_STOPPING [ input=I_STOP cause=C_TIMER_POPPED origin=crm_timer_popped ]
Oct 21 16:57:20 srv03 crmd: [4432]: info: do_dc_release: DC role released
Oct 21 16:57:20 srv03 crmd: [4432]: info: stop_subsystem: Sent -TERM to pengine: [5007]

Is it the correct behavior for this service stop to take so long?
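
For reference, the delay can be checked from the two crmd timestamps in the log excerpt above. The following is only a minimal sketch: the variable and function names are made up for illustration, and because syslog timestamps carry no year, the year used for parsing is an arbitrary placeholder (only the difference between the two times matters). It also assumes an English/C locale so that "Oct" parses as a month name.

```python
from datetime import datetime

# Timestamps copied from the crmd log lines above (syslog format, no year).
SHUTDOWN_REQUESTED = "Oct 21 16:46:57"   # do_shutdown_req
ESCALATION_POPPED = "Oct 21 16:57:20"    # crm_timer_popped

# shutdown-escalation as configured for the test: five minutes, in seconds.
CONFIGURED_ESCALATION_S = 5 * 60

# Assumption: placeholder year, needed only so the timestamps can be parsed.
PLACEHOLDER_YEAR = 2000


def parse_syslog_time(stamp: str) -> datetime:
    """Parse a 'Mon DD HH:MM:SS' syslog timestamp using the placeholder year."""
    return datetime.strptime(f"{PLACEHOLDER_YEAR} {stamp}", "%Y %b %d %H:%M:%S")


def elapsed_seconds(start: str, end: str) -> int:
    """Return the whole number of seconds between two syslog timestamps."""
    delta = parse_syslog_time(end) - parse_syslog_time(start)
    return int(delta.total_seconds())


elapsed = elapsed_seconds(SHUTDOWN_REQUESTED, ESCALATION_POPPED)
overshoot = elapsed - CONFIGURED_ESCALATION_S

print(f"Elapsed before escalation: {elapsed} s")
print(f"Longer than configured shutdown-escalation by: {overshoot} s")
```

With these two log lines, the escalation timer popped 623 seconds (10 min 23 s) after the shutdown request, which is 323 seconds longer than the configured five minutes.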

 * The log is very large, so I did not attach it.
 * If the log is needed, I will submit it to Bugzilla.

Best Regards,
Hideo Yamauchi.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker