On Wed, Nov 30, 2011 at 1:26 PM, James Harper <james.har...@bendigoit.com.au> wrote:
>> >
>> > That thread goes around in circles and completely contradicts what I'm
>> > seeing. What I'm seeing is that unmanaged resources are never monitored.
>>
>> would be strange and how do you verify this? A look at your config may also
>> help to shed some light on this ...
>>
>
> The relevant portions of the config are:
>
> primitive p_xen_smtp2 ocf:heartbeat:Xen \
>         params name="smtp2" xmfile="/configs/xen/smtp2" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="300s" \
>         op migrate_from interval="0" timeout="300s" \
>         op migrate_to interval="0" timeout="300s" \
>         op monitor interval="10s" timeout="30s" \
>         meta allow-migrate="true"
>
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.11-6e010d6b0d49a6b929d17c0114e9d2d934dc8e04" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1322100376"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="200"
>
> I just tested the following (it actually contradicts some of my previous
> statements... but I'm including it anyway as it wasn't what I expected):
>
> . VM is running on node bitvs6 as a managed resource
> . I type "crm resource unmanage p_xen_smtp2"
> . crm status is "Started bitvs6 (unmanaged)"
> . I manually stop the VM outside crm
> . A few seconds later, the status is "Started bitvs6 (unmanaged)
> FAILED" with a failed action "p_xen_smtp2_monitor_10000 (node=bitvs6,
> call=70, rc=7, status=complete): not running"... so okay... it did
> monitor a managed and _running_ resource, even though it resulted in an
> error

So far so good.

> . I type "crm resource cleanup p_xen_smtp2"

What for? This has the side effect of stopping any recurring monitor
action that was running.

> . hangs for ages at "Waiting for 3 replies from the CRMd.No messages
> received in 60 seconds.." then finally says "aborting"
> . I type "crm resource stop p_xen_smtp2"
> . hangs for a bit then says "Call cib_replace failed (-41): Remote node
> did not respond"

That doesn't look good at all. At a guess, it seems like something
crashed. If you want to file a bug and attach a crm_report, I'll take a
look.

>
> Any further attempt to do anything with this resource just hangs...
> maybe the Xen RA monitor script is broken? I can only fix it by starting
> the VM manually so that the actual status matches crm's expected
> resource status.
>
> So starting again to demonstrate the problem:
> . VM is running on node bitvs6 as a managed resource
> . I type "crm resource stop p_xen_smtp2"
> . VM shuts down as expected
> . I type "crm resource unmanage p_xen_smtp2"
> . I manually start the VM outside of crm
> . crm _never_ notices that the resource is started unless I do something
> like "crm resource cleanup p_xen_smtp2" to manually cause the monitoring
> script to be run

The 1.1.x series will detect this if you specify a recurring monitor
with role=Stopped, but it's not the default behaviour because, well,
"don't do that".

>
> Now the above is all about unmanaged resources, but this VM is one I
> could rebuild easily enough so now I'm going to get tricky:
>
> . VM is running on node bitvs6 as a managed resource
> . I type "crm resource stop p_xen_smtp2"
> . VM shuts down as expected
> . I manually start the VM outside of crm
> . crm still _never_ notices that the resource is started unless I do
> something like "crm resource cleanup p_xen_smtp2" to manually cause the
> monitoring script to be run

As above.

>
> This really is unexpected behaviour... starting the resource in crm
> causes the right things to happen (notices that the resource is running)
> but I still expected that a stopped resource would be monitored...

No, not by default.
There should be only one point of control; you're creating an internal
split-brain by telling the cluster to control the resource AND doing so
yourself in parallel.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
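
For reference, the recurring monitor with role=Stopped mentioned in the reply could look roughly like the following in crm shell syntax. This is only a sketch: it assumes a Pacemaker 1.1.x cluster (the quoted config reports 1.0.11), and the 20s interval is an illustrative value that does not come from the thread. It was chosen only because two monitor operations on the same resource need distinct intervals.

```
# Sketch only: the primitive from the quoted config plus one extra op.
# Assumes Pacemaker 1.1.x; interval="20s" is an illustrative value.
primitive p_xen_smtp2 ocf:heartbeat:Xen \
        params name="smtp2" xmfile="/configs/xen/smtp2" \
        op start interval="0" timeout="60s" \
        op stop interval="0" timeout="300s" \
        op migrate_from interval="0" timeout="300s" \
        op migrate_to interval="0" timeout="300s" \
        op monitor interval="10s" timeout="30s" \
        op monitor interval="20s" role="Stopped" timeout="30s" \
        meta allow-migrate="true"
```

The second monitor is the one intended to notice a VM that was started outside the cluster while the resource is expected to be stopped. As the reply notes, this is not the default behaviour, and the cluster is still meant to be the single point of control.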