Hi,

On Wed, Mar 14, 2012 at 10:35:01PM +0100, Arnold Krille wrote:
> On Wednesday 14 March 2012 17:52:21 Dejan Muhamedagic wrote:
> > On Wed, Mar 14, 2012 at 02:48:11PM +0100, Benjamin Kiessling wrote:
> > > Hi,
> > >
> > > On 2012.03.14 14:24:10 +0100, Dejan Muhamedagic wrote:
> > > > > dnsCache_start_0 (node=router1, call=56, rc=-2, status=Timed Out):
> > > > > unknown exec error
> > > > > dnsCache_monitor_1000 (node=router2, call=24, rc=1,
> > > > > status=complete): unknown error
> >
> > This one exited with a generic error. Didn't notice that. The RA
> > should've logged the reason.
> >
> > > > > dnsCache_start_0 (node=router2, call=81, rc=-2, status=Timed Out):
> > > > > unknown exec error
> > > >
> > > > These operations timed out, i.e. didn't finish in the given time
> > > > frame which is by default 20 seconds.
> > >
> > > It says the return code is -2 which isn't a return code specified in the
> > > OCF standard. unbound usually starts fast and I can't see anything in
> > > the logs indicating an error during initialization.
> >
> > Negative exit codes are special and cannot be produced by a
> > script.
>
> Negative exit-codes are "special" in that they commonly denote an error while
> positive exit-codes might be regular results of the app/script running.
> And there is no difference between a script and a "real" program when it
> comes to returning exit-codes.
>
> You might mean that either the RA-script or the cluster-software itself can't
> return negative exit-codes...

I meant that an RA (should've said that instead of "script")
cannot return a negative exit code.

> > Hmm, I've always thought that "Timed Out" in that
> > message above is unequivocal.
>
> "Timed out" is one of the errors. And when you have some positive exit-codes
> for "the script went well but the state of the resource is <bla>", it's
> perfectly okay to use negative exit-codes to signal things like "the RA
> script didn't execute" or "the RA script took too long to execute"...

Eh? A positive exit code always comes from the RA. A negative one
comes from the lrmd, and it means that for whatever reason the RA
instance couldn't run or didn't finish.

Thanks,

Dejan

> Arnold

> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
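
To illustrate the distinction between the two kinds of codes, here is a
rough Python sketch of a supervisor that runs an agent with a time limit.
This is not the lrmd's code and makes no claim about how it is implemented.
The function name run_agent and the NOT_RUN value of -1 are made up for the
example; only the -2 for a timeout and the 20 second default are taken from
the thread above. The point is that a non-negative result is whatever the
agent itself exited with, while a negative result is produced by the
supervisor when the agent could not be started or did not finish in time.

```python
import subprocess
import sys

# Codes generated by the supervisor, never by the agent itself.
TIMED_OUT = -2   # matches rc=-2 / status=Timed Out in the log quoted above
NOT_RUN = -1     # hypothetical value for "agent could not be executed"


def run_agent(argv, timeout=20.0):
    """Run an agent command and return its exit code or a negative sentinel.

    The 20 second default mirrors the default operation timeout mentioned
    in the thread; everything else here is an assumption for illustration.
    """
    try:
        done = subprocess.run(argv, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return TIMED_OUT
    except OSError:
        # E.g. the agent file does not exist or is not executable.
        return NOT_RUN

    rc = done.returncode
    if rc < 0:
        # Python reports "killed by signal N" as -N. Translate it to the
        # shell's 128+N form so it cannot be mistaken for a sentinel.
        return 128 - rc
    return rc


if __name__ == "__main__":
    # An agent that exits with a generic error: the 1 comes from the agent.
    print(run_agent([sys.executable, "-c", "import sys; sys.exit(1)"]))
    # An agent that takes too long: the -2 comes from the supervisor.
    print(run_agent([sys.executable, "-c", "import time; time.sleep(5)"],
                    timeout=0.5))
```

With this split, a caller that sees a negative value knows to look at the
supervisor side (timeouts, missing agent) rather than at the agent's own
logs.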