On Thu, Oct 25, 2012 at 1:37 AM, Cal Heldenbrand <c...@fbsdata.com> wrote:
> Thanks Andrew! My first few attempts at playing around with the failure
> states are working as expected.
>
> A few follow-ups below:
>
>> --op-fail isn't the command you want though.
>> From the man page:
>>
>>   -i, --op-inject=value
>>       $rsc_$task_$interval@$node=$rc - Inject the specified
>>       task before running the simulation
>>
>>   -F, --op-fail=value
>>       $rsc_$task_$interval@$node=$rc - Fail the specified task
>>       while running the simulation
>>
>> Note the difference between the two descriptions: before vs. while.
>> --op-inject is the one you want. It is mostly useful for pretending a
>> recurring monitor failed and seeing what the cluster would do about
>> it.
>>
>> --op-fail on the other hand, is used for pretending that part of the
>> recovery process failed.
>
> Your follow up description here is great, and makes more sense. I was
> reading "Fail the specified task" as literally, "here's my task, fail it and
> show me the results" I'd suggest to add a little paragraph in the man page
> to elaborate these points too.

Ok, I'll add that today.

> Also, can you tell me what all of the return
> codes are? Do I have to use integers, or do strings like "error" work?

Just integers, I'm afraid. The full list for OCF agents is here:

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/s-ocf-return-codes.html

LSB return codes are slightly different.

> While we're on the subject of documentation / usability, I would also
> suggest to split out these two features into more parameters. (What would
> happen if I named my resource with an underscore?) Maybe something like:
>
> --op-pre-resource=[primitive name]
> --op-pre-task=[monitor|start|stop]
> --op-pre-interval=[integer]
> --op-pre-node=[hostname]
> --op-pre-rc=[error|timeout|other stuff]
>
> Then have similar --op-post-* parameters. Or whatever verbs make the most
> sense in the spirit of Pacemaker vocabulary. (pre/post, before/after,
> inject/fail, input/output, etc)

The reason for not doing that is that we wanted to be able to inject
multiple pre/post failures at a time and see the result.

> And, examples are always awesome in man
> pages too.
>
> Of course, this is all great future version stuff, but that doesn't help all
> of the RedHat 6 people that will be using pacemaker 1.1 packages for the
> next ~10 years until RedHat 7 comes out.

Don't worry, the man page updates we just talked about will be in the
6.4 packages :)

> So I suppose documenting the old
> code in the online docs is a Good Thing. :-)
>
> Thanks again!
>
> --Cal

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
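
For anyone scripting around the `$rsc_$task_$interval@$node=$rc` value discussed in this thread, here is a minimal Python sketch of how such a string could be built and taken apart again, including the case of a resource name that contains an underscore. It is an illustration only and does not show how Pacemaker itself parses the value. The names `OpSpec`, `format_op_spec`, `parse_op_spec`, the resource `my_db`, the node `node1`, and the interval `10000` are all made up for the example. The integer return codes are the standard OCF ones from the page linked above.

```python
from typing import NamedTuple

# Standard OCF resource agent return codes (integers only, as noted in the thread).
OCF_RC = {
    "OCF_SUCCESS": 0,
    "OCF_ERR_GENERIC": 1,
    "OCF_ERR_ARGS": 2,
    "OCF_ERR_UNIMPLEMENTED": 3,
    "OCF_ERR_PERM": 4,
    "OCF_ERR_INSTALLED": 5,
    "OCF_ERR_CONFIGURED": 6,
    "OCF_NOT_RUNNING": 7,
    "OCF_RUNNING_MASTER": 8,
    "OCF_FAILED_MASTER": 9,
}


class OpSpec(NamedTuple):
    """Hypothetical container for the five fields of an op spec."""
    rsc: str
    task: str
    interval: int
    node: str
    rc: int


def format_op_spec(spec: OpSpec) -> str:
    """Build a string in the $rsc_$task_$interval@$node=$rc layout."""
    return f"{spec.rsc}_{spec.task}_{spec.interval}@{spec.node}={spec.rc}"


def parse_op_spec(text: str) -> OpSpec:
    """Split an op spec back into its fields.

    Assumption of this sketch: the task and interval are the LAST two
    underscore-separated fields before '@', so underscores in the resource
    name survive. A task name that itself contains an underscore would be
    split incorrectly by this approach.
    """
    key, at, rest = text.partition("@")
    if not at:
        raise ValueError(f"missing '@' in {text!r}")
    node, eq, rc = rest.rpartition("=")
    if not eq:
        raise ValueError(f"missing '=' in {text!r}")
    # Raises ValueError if there are fewer than three fields.
    rsc, task, interval = key.rsplit("_", 2)
    return OpSpec(rsc, task, int(interval), node, int(rc))


# Example: pretend the recurring monitor of "my_db" reported "not running".
spec = OpSpec("my_db", "monitor", 10000, "node1", OCF_RC["OCF_NOT_RUNNING"])
op_inject_value = format_op_spec(spec)
```

The resulting `op_inject_value` is the kind of string that would be passed as `--op-inject=<value>`. Splitting from the right is only one possible way to cope with underscores in resource names, and it is a design choice of this sketch rather than a statement about the tool.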