Hi,

On Wed, Aug 25, 2010 at 08:56:08PM +0200, Cnut Jansen wrote:
> On 25.08.2010 16:00, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Tue, Aug 24, 2010 at 05:19:23PM +0200, Cnut Jansen wrote:
> >> Hi,
> >>
> >> just (for now) a short question to make sure I didn't miss anything:
> >> What's the designated reaction of Pacemaker when a resource agent,
> >> called to monitor a resource that is supposed to be running (so the
> >> expected return is 0, OCF_SUCCESS), returns 7 (OCF_NOT_RUNNING)?
> >> Should Pacemaker's very next call be to stop the resource, or should
> >> it be yet another monitor call (or even several)?
> >
> > It should be stop, followed by start, either on the same node or
> > on another depending on the migration-threshold setting and
> > failcount.
>
> Ok, that's what I expected.
> So there are no circumstances, so far unknown to me, in which Pacemaker
> by design - after having gotten rc=7 from the RA (and, since it adds
> "FAILED" behind the resource in crm_mon, it obviously understood it
> correctly) - calls the RA several more times for monitoring (while
> letting the rest of the cluster hang) before finally calling the desired
> stop, instead of immediately calling the RA to stop and continuing with
> the pending transactions and migrations?

Yes, that sounds quite unusual.

> I'll first try to reproduce that on my cluster at home too, reduce the
> configuration to the minimum needed to reproduce it, and then I might
> give a more detailed description of this issue.
>
> >> Or are there various designated reactions to this case, depending on
> >> various conditions or something?
> >
> > This is the default. You can change it by setting the "on-fail"
> > attribute for the monitor (or any other) operation.
>
> Allowed values are [ignore, block, restart, stop, fence], default is
> restart, and there's no value, option or whatever like
> on-fail="repeat-op[-N-times]" or something, right?

Right.

> (btw., jfyi: migration-thresholds are currently completely banned from

Why? Anything wrong with them?

> my configurations, so this is another issue; I probably also might have
> yet another issue / possible bug regarding zombie (monitor) operations,
> with symptoms like those of an off-by-one error)

Please file a bugzilla if you find a bug.

Thanks,

Dejan

> > Thanks,
> >
> > Dejan
> >
> >> Cnut Jansen
> >>
> >> _______________________________________________
> >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
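
For reference, the behaviour discussed in this thread can be summarised as a
small lookup. This is only an illustrative sketch in Python, not Pacemaker
code: the function name recovery_actions and the action labels ("stop",
"start", "block", "fence-node") are made up for the example. The return codes,
the list of on-fail values and the default of "restart" are taken from the
messages above; the one-line meaning attached to each value is my reading of
the option names and should be checked against the Pacemaker documentation.

```python
# Illustrative model only; none of these names come from Pacemaker itself.

OCF_SUCCESS = 0      # monitor: resource is running
OCF_NOT_RUNNING = 7  # monitor: resource is cleanly stopped

# on-fail values as listed in the thread; "restart" is the stated default.
ON_FAIL_VALUES = ("ignore", "block", "restart", "stop", "fence")


def recovery_actions(monitor_rc, on_fail="restart", expected_rc=OCF_SUCCESS):
    """Return the (hypothetical) list of follow-up actions for a monitor result."""
    if on_fail not in ON_FAIL_VALUES:
        # e.g. on-fail="repeat-op" does not exist, as confirmed in the thread
        raise ValueError("unknown on-fail value: %r" % (on_fail,))
    if monitor_rc == expected_rc:
        return []  # monitor reported the expected state, nothing to recover
    actions = {
        "ignore": [],                  # assumed: treat the failure as if it had not happened
        "block": ["block"],            # assumed: no further operations on the resource
        "restart": ["stop", "start"],  # from the thread: stop, followed by start
        "stop": ["stop"],              # assumed: stop and leave stopped
        "fence": ["fence-node"],       # assumed: fence the node the resource ran on
    }
    return actions[on_fail]


if __name__ == "__main__":
    print(recovery_actions(OCF_NOT_RUNNING))          # ['stop', 'start']
    print(recovery_actions(OCF_NOT_RUNNING, "stop"))  # ['stop']
    print(recovery_actions(OCF_SUCCESS))              # []
```

Note that nothing in this table maps a failed monitor to further monitor
calls, which is why the repeated monitoring described above looks unusual.
Whether the start happens on the same node or on another one depends on
migration-threshold and the failcount, which this sketch leaves out.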