Folks -

Consider a running cluster with all resources managed. We want to stop and quickly restart a particular resource without impacting other resources. The software stack running on the system can deal with this sort of temporary outage. We perform the following actions:
  * unmanage the resource
  * stop the resource
  * start the resource
  * manage the resource
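
The four steps above can be sketched as a small script. This is only an illustration under assumptions: it supposes the crm shell is installed (its `crm resource unmanage` and `crm resource manage` subcommands), and the stop and start commands shown (`/etc/init.d/tmf ...`) are hypothetical placeholders for however the service is actually stopped and started outside the cluster. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def quick_restart_plan(resource, stop_cmd, start_cmd):
    """Return the four steps, in order, as argument lists."""
    return [
        ["crm", "resource", "unmanage", resource],  # assumes the crm shell
        shlex.split(stop_cmd),                      # site-specific, hypothetical
        shlex.split(start_cmd),                     # site-specific, hypothetical
        ["crm", "resource", "manage", resource],
    ]


def run_plan(plan, dry_run=True):
    """Print each step, or execute it when dry_run is False."""
    for argv in plan:
        if dry_run:
            print("would run:", " ".join(argv))
        else:
            # check=True aborts at the first failing step, which leaves
            # the resource unmanaged until someone intervenes.
            subprocess.run(argv, check=True)


plan = quick_restart_plan("TMF", "/etc/init.d/tmf stop", "/etc/init.d/tmf start")
run_plan(plan)  # dry run: prints the four commands without touching the cluster
```

Note that nothing in this sequence pauses the recurring monitor, which is the gap the questions below are about.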

This procedure sometimes succeeds, but at other times a resource monitor failure is reported after we stop the resource. Evidently, unmanaging the resource did not stop its monitor operation, at least not immediately.

Will the monitor operation get stopped when a resource is unmanaged?
If so, how long will it take for this to occur?
What determines this length of time?
Is there a better way to do a quick restart of a resource without impacting other resources?
(In our case the resource is a member of a resource group)

Example:
 Resource Group: dmfGroup
     CXFS       (ocf::sgi:cxfs):        Started genesis
     VirtualIP  (ocf::heartbeat:IPaddr2):       Started genesis
     TMF        (ocf::sgi:tmf): Started genesis (unmanaged)
     DMF        (ocf::sgi:dmf): Started genesis
     DMFMAN     (ocf::sgi:dmfman):      Started genesis
     DMFSOAP    (ocf::sgi:dmfsoap):     Started genesis

Log messages ...
genesis:~ # tail -f /var/log/messages | grep TMF
Apr  1 09:26:28 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:27:09 genesis root: unmanage TMF
Apr  1 09:27:09 genesis cib: [5740]: info: log_data_element: cib:diff: +         <primitive id="TMF" >
Apr  1 09:27:09 genesis cib: [5740]: info: log_data_element: cib:diff: + <meta_attributes id="TMF-meta_attributes" >
Apr  1 09:27:09 genesis cib: [5740]: info: log_data_element: cib:diff: + <nvpair id="TMF-meta_attributes-is-managed" name="is-managed" value="false" __crm_diff_marker__="added:top" />
Apr  1 09:27:09 genesis pengine: [5743]: info: native_add_running: resource TMF isnt managed
Apr  1 09:27:09 genesis pengine: [5743]: notice: native_print: TMF (ocf::sgi:tmf): Started genesis (unmanaged)
Apr  1 09:27:09 genesis pengine: [5743]: info: native_color: Unmanaged resource TMF allocated to genesis: active
Apr  1 09:27:09 genesis pengine: [5743]: notice: LogActions: Leave resource TMF (Started unmanaged)
Apr  1 09:27:12 genesis crm_resource: [5219]: info: native_add_running: resource TMF isnt managed
Apr  1 09:27:12 genesis crm_resource: [5222]: info: native_add_running: resource TMF isnt managed
Apr  1 09:28:28 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:30:29 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:30:32 genesis crm_resource: [6428]: info: native_add_running: resource TMF isnt managed
Apr  1 09:30:32 genesis crm_resource: [6431]: info: native_add_running: resource TMF isnt managed
Apr  1 09:32:29 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
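
To make the timing in the excerpt above explicit, here is a minimal sketch that reads the monitor and unmanage timestamps out of those log lines. It assumes all entries fall on the same day (so only the clock field is parsed), and it only describes this excerpt; reading the spacing as the configured monitor interval is an inference, not something the log states.

```python
from datetime import datetime

# Monitor and unmanage entries copied from the log excerpt above.
LOG = """\
Apr  1 09:26:28 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:27:09 genesis root: unmanage TMF
Apr  1 09:28:28 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:30:29 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
Apr  1 09:32:29 genesis lrmd: [5741]: debug: rsc:TMF:18: monitor
"""


def clock(line):
    """Parse the HH:MM:SS field of a syslog line (same-day assumption)."""
    return datetime.strptime(line.split()[2], "%H:%M:%S")


lines = LOG.splitlines()
monitors = [clock(l) for l in lines if l.endswith("rsc:TMF:18: monitor")]
unmanaged_at = next(clock(l) for l in lines if l.endswith("unmanage TMF"))

# Seconds between consecutive monitor runs.
intervals = [int((b - a).total_seconds()) for a, b in zip(monitors, monitors[1:])]
# Monitor runs logged after the resource was unmanaged.
after_unmanage = [t for t in monitors if t > unmanaged_at]

print("seconds between monitors:", intervals)
print("monitors after unmanage:", len(after_unmanage))
```

In this excerpt the monitor keeps firing about every two minutes, and three runs are logged after the resource was unmanaged.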

--

Ron Kerry         rke...@sgi.com
Global Product Support - SGI Federal

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
