On Wed, Feb 6, 2013 at 1:50 PM, Tim Serong <[email protected]> wrote:
> On 02/05/2013 10:31 PM, Lars Marowsky-Bree wrote:
>> On 2013-02-05T11:36:30, Ulrich Windl <[email protected]>
>> wrote:
>>
>> This looks like a support incident to me. Hard to diagnose without full
>> logs.
>>
>>> Let me add: I'm not completely sure, but a side-effect of this messages
>>> seems to be that resources (being cleaned up) that are running (e.g. Xen
>>> VMs) are considered "stopped". If the CRM tried to start the VM elsewhere,
>>> data corruption or other bad effects are likely...
>>>
>>> So I wonder: I thought that cleaning up a resource just resets the
>>> failed-count for the nodes where the resource couldn't start before. Does
>>> it (should it?) really clean the "running" status?
>>
>> This part is normal. Cleanup removes the resources state from the
>> cluster/LRM completely (this includes the failure counts), which is then
>> reprobed.
>>
>> This does not cause concurrency violations. Even though it is true that
>> the resource shows up as "not running" briefly in crm_mon/hawk.
>>
>> Perhaps a new state "not probed" would be useful, since the
>> probe_complete attribute is available in the CIB? Cc'ing Tim for his
>> opinion.
>
> Good point. Even if it's generally only a brief window where resources
> are shown as stopped after cleanup (even though they're never actually
> stopped), that could be confusing. In Hawk's case, the status display
> is implemented such that resources with no LRM state are reported as
> Stopped, where strictly they should probably show as Unknown (or, as you
> say, "Not Probed"). I'll make a note to do something about that.
>
> I'm not sure why crm_mon seems to show non-probed resources as Stopped
> (it's been some time since I went digging through the pengine/unpack code).

Probably just the default. If you wanted to add a "verified" flag so that
crm_mon/hawk can display "Assumed" or "Unverified" or similar... :)

>
> Regards,
>
> Tim
> --
> Tim Serong
> Senior Clustering Engineer
> SUSE
> [email protected]
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
