On 2012-11-09T11:46:59, David Vossel wrote:
> What if we made something similar to the concept of an "un-managed" resource,
> in that it is only ever monitored, but treated it like a normal resource.
> Meaning start/stop could still execute, but start is really just the first
> "monitor" operation and stop just means the recurring "monitor" cancels.
>
> Having "start" redirect to "monitor" in pacemaker would take care of that
> timeout problem you all were talking about with the first failure. Set the
> start operation to some larger timeout. Basically start would just verify
> that monitor passed once, then you could move on to the normal monitor
> timeouts/intervals. Stop would always return success and cancel whatever
> recurring monitors are running.
That's exactly the kind of abstraction a resource agent class can
provide for the nagios agents, though - there is no need to put that
special knowledge into the PE. The LRM can hide this, which is partly
its purpose.
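
To make that concrete, here is a rough sketch of what such an agent-class
wrapper could look like. It is Python purely for illustration; the class
name, the `check` callable and the polling loop are hypothetical and not
actual LRM or Pacemaker code. Retrying the check until the start timeout
expires is my assumption about how "start verifies that monitor passed
once" would behave. Only the numeric OCF return codes are standard.

```python
import time
from typing import Callable

# Standard OCF return codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


class MonitorOnlyResource:
    """Hypothetical wrapper: looks like a normal resource, only ever runs a check."""

    def __init__(self, check: Callable[[], bool]) -> None:
        self.check = check              # e.g. a function that runs one nagios plugin
        self.recurring_active = False   # stands in for the scheduled recurring monitor

    def monitor(self) -> int:
        """Run the check once and translate the result to an OCF code."""
        return OCF_SUCCESS if self.check() else OCF_NOT_RUNNING

    def start(self, timeout: float, poll_interval: float = 1.0) -> int:
        """'start' is just the first monitor that passes within the start timeout."""
        deadline = time.monotonic() + timeout
        while True:
            if self.monitor() == OCF_SUCCESS:
                self.recurring_active = True   # hand over to the normal monitor interval
                return OCF_SUCCESS
            if time.monotonic() >= deadline:
                return OCF_ERR_GENERIC         # never came up within the start timeout
            time.sleep(poll_interval)

    def stop(self) -> int:
        """'stop' only cancels the recurring monitor and always reports success."""
        self.recurring_active = False
        return OCF_SUCCESS
```

The PE would only ever see ordinary start/stop/monitor results; the
redirection stays inside the agent class.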
> Now that I think about it, I'm not even sure we need the new container Andrew
> and I talked about at all if we introduce "monitor-only" resources.
We'd still need it, though.
> At this point we could just have a group where the first member launches the
> vm, and all the members after that are the monitor-only resources that
> start/stop similar to normal resources for the PE. If any of the group
> members fail, I guess we'd need the whole group to be recovered in the right
> order.
That's the point - the "right order" for a container is not quite the
same as for a regular group. Basically, group semantics would recover
from the failed resource onward and never touch the VM resource (the
container).
If you look at my proposal, I actually made "container=" a group
attribute - because we need to map monitor failures to the container, as
well as ignore any stop failures (the service counts as cleanly down as
long as the container is eventually stopped).
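
A small sketch of the difference between the two semantics, again in
Python and only as an illustration. The function names and the resource
names in the usage note are made up, and it assumes the first member of
the list is the container (the VM), as in the group layout quoted above.
It is not how the PE is implemented.

```python
from typing import List


def group_recovery(members: List[str], failed: str) -> List[str]:
    """Regular group: restart the failed member and everything after it."""
    return members[members.index(failed):]


def container_recovery(members: List[str], failed: str) -> List[str]:
    """Container: a monitor failure of any member is mapped to the container
    (members[0]), so the container and everything inside it is recovered."""
    if failed not in members:
        raise ValueError(f"{failed!r} is not part of this container")
    return list(members)


def stop_counts_as_failure(members: List[str], name: str, rc: int,
                           container_semantics: bool) -> bool:
    """Decide whether a stop result should be treated as a failure."""
    if rc == 0:
        return False
    # Inside a container, a failed stop of a contained service is ignored:
    # the service is cleanly down once the container itself has stopped.
    if container_semantics and name != members[0]:
        return False
    return True
```

With members `["vm", "check_http", "check_ssh"]` and a failure of
`check_http`, the group variant yields `["check_http", "check_ssh"]`,
while the container variant yields all three, starting with `vm`.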
I think the shell might render this differently, even if we express it
as a group + meta-attribute(s) in the XML (which seems to be the way to
go). "container ..." is easier on the eyes ;-)
Regards,
Lars
--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB
21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde