On Sat, Nov 10, 2012 at 4:54 AM, Lars Marowsky-Bree <l...@suse.com> wrote:
> On 2012-11-09T11:46:59, David Vossel <dvos...@redhat.com> wrote:
>
>> What if we made something similar to the concept of an "un-managed"
>> resource, in that it is only ever monitored, but treated it like a normal
>> resource. Meaning start/stop could still execute, but start is really just
>> the first "monitor" operation and stop just means the recurring "monitor"
>> cancels.
>>
>> Having "start" redirect to "monitor" in pacemaker would take care of that
>> timeout problem you all were talking about with the first failure. Set the
>> start operation to some larger timeout. Basically start would just verify
>> that monitor passed once, then you could move on to the normal monitor
>> timeouts/intervals. Stop would always return success and cancel whatever
>> recurring monitors are running.
>
> That's exactly the kind of abstraction a resource agent class can
> provide though for the nagios agents - no need to have that special
> knowledge in the PE. The LRM can hide this, which is partly its
> purpose.
>
>> Now that I think about it, I'm not even sure we need the new container
>> Andrew and I talked about at all if we introduce "monitor-only" resources.
>
> Yes. We'd still need it.
>
>> At this point we could just have a group where the first member launches the
>> vm, and all the members after that are the monitor-only resources that
>> start/stop similar to normal resources for the PE. If any of the group
>> members fail, I guess we'd need the whole group to be recovered in the right
>> order.
>
> That's the point - "right order" for a container is not quite the right
> order as for a regular group. Basically, the group semantics would
> recover from the failed resource onward, never the VM resource
> (container).
>
> If you look at my proposal, I actually made the "container=" a group
> attribute

I think I'd rather it be a whole different tag than piggyback off the group tag.

> - because we need to map monitor failures to the container, as
> well as ignore any stop failures (service is down clean as long as the
> container is eventually stopped).
>
> I think the shell might render this differently, even if we express it
> as a group + meta-attribute(s) in the XML (which seems to be the way to
> go). "container ..." is easier on the eyes ;-)
>
>
> Regards,
>     Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
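
For reference, the "monitor-only" semantics David describes in the quoted text (start is just the first monitor, stop always succeeds and cancels the recurring monitor) can be sketched roughly as follows. This is only an illustration in Python, not Pacemaker or LRM code: the class name MonitorOnlyResource, the probe callable and the recurring flag are all made up for the example. The numeric return codes follow the OCF convention (0 = success, 7 = not running).

```python
from typing import Callable

# OCF-style return codes (0 = success, 7 = not running).
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7


class MonitorOnlyResource:
    """Hypothetical wrapper that maps start/stop onto a single probe."""

    def __init__(self, probe: Callable[[], int]) -> None:
        # probe is an assumed callable that runs one check (for example a
        # nagios plugin) and returns an OCF-style return code.
        self.probe = probe
        # Stands in for "a recurring monitor is currently scheduled".
        self.recurring = False

    def start(self) -> int:
        # "start" is really just the first monitor; it would be given a
        # larger timeout than the recurring monitor.
        rc = self.probe()
        if rc == OCF_SUCCESS:
            self.recurring = True
        return rc

    def monitor(self) -> int:
        # Recurring monitor: just run the probe again.
        return self.probe()

    def stop(self) -> int:
        # "stop" cancels the recurring monitor and always reports success.
        self.recurring = False
        return OCF_SUCCESS
```

Whether this mapping lives in a resource agent class behind the LRM (Lars's suggestion) or in the PE is exactly the open question in this thread; the sketch takes no position on that.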