On Tue, Nov 6, 2012 at 11:59 PM, Lars Marowsky-Bree <l...@suse.com> wrote:
> On 2012-11-06T19:30:20, "Gao,Yan" <y...@suse.com> wrote:
>
> Hi Yan,
>
> thanks for proposing this.
>
> Let me try to add -
>
> The proposal has essentially three parts.
>
> First, like Yan said, a new resource agent class so that we can wrap
> around the Icinga/nagios plugins, provide meta-data, etc. This is quite
> separate from the other components, and fairly straightforward. It also
> means that someone could configure these as a (unmanaged?) primitive in
> case they just want to gather monitor data and make stuff depend on it.
>
> This is hopefully not very controversial; after all, it's why we have
> agent classes. ;-)
>
>
> Second, the ability to specify a different class/(provider/)type for a
> monitor op. This neatly allows us to pull in those probes for the
> "monitor a VM use case", with hopefully minimal impact (on the PE or the
> schema, where only optional attributes would be added), and also be
> straightforward to configure for admins. (Clearly, the shell/hawk would
> need to be taught about this so that it is easy too.) It may have
> applications beyond this though.
>
> And no, I'm not proposing that we allow overriding the
> class/provider/type tuple for start/stop ;-)
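
To make the second part concrete, here is a minimal Python sketch of how a per-op agent override might resolve. Everything in it is an assumption for illustration: the `Agent`, `Op`, `Resource` and `resolve_agent` names are hypothetical, the `nagios` class name is only a placeholder for the proposed new agent class, and none of this reflects the actual PE code or the CIB schema. It only shows the rule as described: a monitor op may name its own class/(provider/)type, every other op uses the resource's own agent, and an override on start/stop is rejected.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Agent:
    """A class/(provider/)type tuple; the provider is optional."""
    agent_class: str
    agent_type: str
    provider: Optional[str] = None


@dataclass
class Op:
    """One operation; `agent` is the hypothetical optional override."""
    name: str
    interval: str = "0"
    agent: Optional[Agent] = None


@dataclass
class Resource:
    rsc_id: str
    agent: Agent
    ops: List[Op] = field(default_factory=list)


def resolve_agent(rsc: Resource, op: Op) -> Agent:
    """Return the agent that should execute `op` for `rsc`."""
    if op.agent is None:
        # No override: fall back to the resource's own agent.
        return rsc.agent
    if op.name != "monitor":
        # Overrides are only meaningful for monitor ops in this sketch.
        raise ValueError(
            f"agent override not allowed for '{op.name}' on {rsc.rsc_id}"
        )
    return op.agent


# Example: a VM whose second monitor is delegated to a plugin wrapper.
vm = Resource(
    rsc_id="vm1",
    agent=Agent("ocf", "VirtualDomain", provider="heartbeat"),
    ops=[
        Op("start"),
        Op("monitor", interval="10s"),
        Op("monitor", interval="30s", agent=Agent("nagios", "check_tcp")),
    ],
)

for op in vm.ops:
    a = resolve_agent(vm, op)
    print(op.name, op.interval, a.agent_class, a.provider, a.agent_type)
```

The two monitor ops use different intervals on purpose, since monitor intervals on one resource clash in Pacemaker, as noted further down in the quoted mail.
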

Did you consider having the VirtualDomain do the nagios redirect for
monitor operations? If so, what was the drawback?

>
>
> Third, since the "start" of the base container may return before the
> guest is fully booted (to stick with the VM resource), we may need an
> additional timeout here. We *could* abuse start-delay (which might
> finally give it some legitimate use), but looping until we got the first
> success also appears attractive.

My concern there is that there needs to be a finite termination point
for the "it's still bad" looping. No better ideas yet though.

>
> The one downside here is that, unless we modify the PE or make the
> update to the CIB special somehow, the tools can't show that those
> ops/services aren't yet reporting healthy. But I think this trade-off is
> acceptable. And might be useful in other scenarios too.

Do you or don't you want to show them as unhealthy? I'm not parsing
this well.

>
> Fourth, since this means we'll have multiple monitors doing different
> things, this draws attention to the deficiency in Pacemaker where
> monitor intervals clash, and something that really should be fixed
> eventually - it affects the slave/master resources too, and is a bit
> arcane knowledge to need on the admin's side.

True. I wouldn't mind getting that fixed. Doing it in a
backwards-compatible manner might be tricky though.

>
> I think all of these four pillars have merit on their own, and combined
> would provide the use case that we wish to cover quite neatly. We'd be
> contributing the first three, and would really appreciate if "someone"
> could look at number 4 ;-)
>
>
> Regards,
>     Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes."
> -- Oscar Wilde
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org