On Wed, Feb 6, 2013 at 1:50 PM, Tim Serong <[email protected]> wrote:
> On 02/05/2013 10:31 PM, Lars Marowsky-Bree wrote:
>> On 2013-02-05T11:36:30, Ulrich Windl <[email protected]> wrote:
>>
>> This looks like a support incident to me. Hard to diagnose without full
>> logs.
>>
>>> Let me add: I'm not completely sure, but a side-effect of these messages
>>> seems to be that resources (being cleaned up) that are running (e.g. Xen 
>>> VMs) are considered "stopped". If the CRM tried to start the VM elsewhere, 
>>> data corruption or other bad effects are likely...
>>>
>>> So I wonder: I thought that cleaning up a resource just resets the 
>>> failed-count for the nodes where the resource couldn't start before. Does 
>>> it (should it?) really clean the "running" status?
>>
>> This part is normal. Cleanup removes the resource's state from the
>> cluster/LRM completely (this includes the failure counts), which is then
>> reprobed.
>>
>> This does not cause concurrency violations, even though the resource
>> does show up as "not running" briefly in crm_mon/hawk.
>>
>> Perhaps a new state "not probed" would be useful, since the
>> probe_complete attribute is available in the CIB? Cc'ing Tim for his
>> opinion.
>
> Good point.  Even if it's generally only a brief window where resources
> are shown as stopped after cleanup (even though they're never actually
> stopped), that could be confusing.  In Hawk's case, the status display
> is implemented such that resources with no LRM state are reported as
> Stopped, where strictly they should probably show as Unknown (or, as you
> say, "Not Probed").  I'll make a note to do something about that.
>
> I'm not sure why crm_mon seems to show non-probed resources as Stopped
> (it's been some time since I went digging through the pengine/unpack code).

Probably just the default.
If you wanted to add a "verified" flag so that crm_mon/hawk can
display "Assumed" or "Unverified" or similar... :)
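
To illustrate the idea, here is a minimal sketch in Python. It is not
actual crm_mon or Hawk code. It shows a status mapping that treats a
missing LRM state as unverified instead of stopped. The names
DisplayState, display_state and lrm_running are hypothetical and exist
only for this example.

```python
from enum import Enum
from typing import Optional


class DisplayState(Enum):
    STARTED = "Started"
    STOPPED = "Stopped"
    UNVERIFIED = "Unverified"  # no LRM state yet, e.g. after cleanup


def display_state(lrm_running: Optional[bool]) -> DisplayState:
    """Map a resource's LRM state to what a status display would show.

    lrm_running is None when the cluster holds no LRM state for the
    resource (assumption: this models the window between cleanup and
    the completed reprobe).
    """
    if lrm_running is None:
        # Defaulting to STOPPED here is the behaviour discussed above;
        # the sketch reports the lack of knowledge explicitly instead.
        return DisplayState.UNVERIFIED
    return DisplayState.STARTED if lrm_running else DisplayState.STOPPED
```

With this mapping, a resource that has just been cleaned up would show
as "Unverified" until the probe result arrives, and only a probe that
actually finds it inactive would produce "Stopped".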

>
> Regards,
>
> Tim
> --
> Tim Serong
> Senior Clustering Engineer
> SUSE
> [email protected]
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems