On 2013-02-05T11:36:30, Ulrich Windl <[email protected]> wrote:
This looks like a support incident to me. Hard to diagnose without full
logs.
> Let me add: I'm not completely sure, but a side-effect of these messages seems
> to be that resources (being cleaned up) that are running (e.g. Xen VMs) are
> considered "stopped". If the CRM tried to start the VM elsewhere, data
> corruption or other bad effects are likely...
>
> So I wonder: I thought that cleaning up a resource just resets the
> failed-count for the nodes where the resource couldn't start before. Does it
> (should it?) really clean the "running" status?
This part is normal. Cleanup removes the resource's state from the
cluster/LRM completely (this includes the failure counts), after which
the resource is reprobed.
This does not cause concurrency violations, even though the resource
does briefly show up as "not running" in crm_mon/hawk.
Perhaps a new state "not probed" would be useful, since the
probe_complete attribute is available in the CIB? Cc'ing Tim for his
opinion.
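
To make the distinction concrete, here is a toy model in Python. It is
only a sketch and not Pacemaker code: ToyCluster, NodeStatus, display()
and may_start() are made-up names, and the rule that nothing is started
before every node has a probe result is my simplified reading of why
cleanup does not lead to concurrency violations.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class NodeStatus:
    """Toy stand-in for what the cluster has recorded about one resource on one node."""
    running: Optional[bool] = None  # None means: no probe result recorded
    fail_count: int = 0


@dataclass
class ToyCluster:
    # Hypothetical ground truth: is the resource really active on each node?
    actual: Dict[str, bool]
    # Recorded state; only probes fill in the "running" field.
    status: Dict[str, NodeStatus] = field(default_factory=dict)

    def cleanup(self) -> None:
        """Forget all recorded state for the resource, fail counts included."""
        self.status = {node: NodeStatus() for node in self.actual}

    def probe(self) -> None:
        """Re-discover the real state on every node."""
        for node, is_running in self.actual.items():
            self.status.setdefault(node, NodeStatus()).running = is_running

    def display(self, node: str, distinguish_unprobed: bool = False) -> str:
        """What a status tool would print; optionally with a separate 'not probed' state."""
        recorded = self.status.get(node, NodeStatus()).running
        if recorded is None:
            return "not probed" if distinguish_unprobed else "not running"
        return "running" if recorded else "not running"

    def may_start(self) -> bool:
        """Assumed rule: no start is scheduled until every node has been probed."""
        return all(
            self.status.get(node, NodeStatus()).running is not None
            for node in self.actual
        )
```

Between cleanup() and the next probe(), display() reports "not running"
for a node where the resource is in fact active, which is the confusing
part. With distinguish_unprobed=True the same window shows up as
"not probed" instead, and may_start() stays false until the probe
results are back.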
Regards,
Lars
--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB
21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde