Re: [Pacemaker] Enable remote monitoring

Andrew Beekhof Fri, 09 Nov 2012 13:33:15 -0800

On Sat, Nov 10, 2012 at 4:54 AM, Lars Marowsky-Bree <l...@suse.com> wrote:
> On 2012-11-09T11:46:59, David Vossel <dvos...@redhat.com> wrote:
>
>> What if we made something similar to the concept of an "un-managed" 
>> resource, in that it is only ever monitored, but treated it like a normal 
>> resource.  Meaning start/stop could still execute, but start is really just 
>> the first "monitor" operation and stop just means the recurring "monitor" 
>> cancels.
>>
>> Having "start" redirect to "monitor" in pacemaker would take care of that 
>> timeout problem you all were talking about with the first failure.  Set the 
>> start operation to some larger timeout.  Basically start would just verify 
>> that monitor passed once, then you could move on to the normal monitor 
>> timeouts/intervals.  Stop would always return success and cancel whatever 
>> recurring monitors are running.
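
For illustration, a rough sketch of the "monitor-only" semantics described above: start is just the first monitor under a larger timeout, and stop always succeeds and cancels the recurring monitor. The class name, the `probe` callable, and the retry-until-deadline behaviour are all assumptions made for this sketch, not anything that exists in Pacemaker or the LRM; only the numeric OCF return codes are standard.

```python
import time
from typing import Callable

# Standard OCF return codes
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1


class MonitorOnlyResource:
    """Hypothetical wrapper: 'start' and 'stop' are expressed through 'monitor'."""

    def __init__(self, probe: Callable[[], bool]):
        self.probe = probe        # returns True when the remote service looks healthy
        self.recurring = False    # True while a recurring monitor would be scheduled

    def monitor(self) -> int:
        return OCF_SUCCESS if self.probe() else OCF_ERR_GENERIC

    def start(self, timeout: float, retry_delay: float = 0.1) -> int:
        # Assumption: retry the probe until it passes once or the
        # (larger) start timeout expires.
        deadline = time.monotonic() + timeout
        while True:
            if self.monitor() == OCF_SUCCESS:
                self.recurring = True   # hand over to normal monitor intervals
                return OCF_SUCCESS
            if time.monotonic() >= deadline:
                return OCF_ERR_GENERIC
            time.sleep(retry_delay)

    def stop(self) -> int:
        # Stop never fails: it only cancels the recurring monitor.
        self.recurring = False
        return OCF_SUCCESS
```

With something like this, a service that needs a few seconds to come up inside the VM would not count as a failure on the first check, as long as the start timeout covers it.
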
>
> That's exactly the kind of abstraction a resource agent class can
> provide though for the nagios agents - no need to have that special
> knowledge in the PE. The LRM can hide this, which is partly its
> purpose.
>
>> Now that I think about it, I'm not even sure we need the new container 
>> Andrew and I talked about at all if we introduce "monitor-only" resources.
>
> Yes. We'd still need it.
>
>> At this point we could just have a group where the first member launches the 
>> vm, and all the members after that are the monitor-only resources that 
>> start/stop similar to normal resources for the PE.  If any of the group 
>> members fail, I guess we'd need the whole group to be recovered in the right 
>> order.
>
> That's the point - "right order" for a container is not quite the right
> order as for a regular group. Basically, the group semantics would
> recover from the failed resource onward, never the VM resource
> (container).
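
To make the difference in recovery order concrete, here is a small sketch. Both function names and the resource names are made up for illustration, and the assumption that the services are re-checked after the container restarts is mine; the thread only says that monitor failures need to be mapped to the container.

```python
from typing import List


def group_recovery(members: List[str], failed: str) -> List[str]:
    """Regular group: recover the failed member and everything after it."""
    idx = members.index(failed)  # raises ValueError for an unknown member
    return members[idx:]


def container_recovery(container: str, services: List[str], failed: str) -> List[str]:
    """Container: a failure of any inner service is mapped to the container,
    so the container is recovered first, then (assumed) every service."""
    if failed != container and failed not in services:
        raise ValueError(f"unknown resource: {failed}")
    return [container] + list(services)


members = ["vm", "check_http", "check_ssh"]

# Group semantics: the VM is left alone when a later member fails.
print(group_recovery(members, "check_ssh"))

# Container semantics: the same failure restarts the VM as well.
print(container_recovery("vm", ["check_http", "check_ssh"], "check_ssh"))
```
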
>
> If you look at my proposal, I actually made the "container=" a group
> attribute


I think I'd rather it be a whole different tag than piggyback off the group tag.

> - because we need to map monitor failures to the container, as
> well as ignore any stop failures (service is down clean as long as the
> container is eventually stopped).
>
> I think the shell might render this differently, even if we express it
> as a group + meta-attribute(s) in the XML (which seems to be the way to
> go). "container ..." is easier on the eyes ;-)
>
>
> Regards,
>     Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, 
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
