-----Original Message-----
From: Dejan Muhamedagic <deja...@fastmail.fm>
Sent: Fri 14.01.2011 12:31
To: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
Subject: Re: [Pacemaker] Howto write a STONITH agent

> Hi,
>
> On Thu, Jan 13, 2011 at 09:09:38PM +0100, Christoph Herrmann wrote:
> > Hi,
> >
> > I have some brand new HP Blades with iLO boards (iLO 2 Standard Blade Edition 1.81 ...)
> > But I'm not able to connect to them via the external/riloe agent.
> > When I try:
> >
> > stonith -t external/riloe -p "hostlist=node1 ilo_hostname=ilo1 ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 ilo_powerdown_method=power" -S
>
> Try this:
>
> stonith -t external/riloe hostlist=node1 ilo_hostname=ilo1 ilo_user=ilouser ilo_password=ilopass ilo_can_reset=1 ilo_protocol=2.0 ilo_powerdown_method=power -S

That's much better (looks like PEBKAC ;-), thanks! But it is not reliable.
I've tested it about 10 times and it hung 5 times. That's not what I want.
In the end I will use my own ssh-ilo agent. It's very simple (KISS) and
reliable. The external/riloe agent did not look that simple.

So my questions still remain:

- Is there a HOWTO for writing stonith agents?
- Is it useful to write (to run) a stonith agent as a cloned resource?
- What should the status check do with a cloned stonith resource? Is it
  useful in any way, given that I have 4 different nodes with 4 different
  iLO boards?

Cheers,
  Christoph &:-)

> Thanks,
>
> Dejan
>
> > I get the following answer:
> >
> > external/riloe[14317]: ERROR: unknown power method %s, setting to "power"
> > external/riloe[14317]: ERROR: [Errno -2] Name or service not known, while talking to ilo_hostname=ilo1
> >
> > ** (process:14315): CRITICAL **: external_run_cmd: Calling '/usr/lib64/stonith/plugins/external/riloe status' returned 1
> >
> > ** (process:14315): CRITICAL **: external_status: 'riloe status' failed with rc 1
> > stonith: external/riloe device not accessible.
> >
> > But I can access ilo1 with http, https and ssh.
> > The easiest way to reset a node is to run:
> >
> > ssh -i ilo-sshkey ilouser@ilo1 reset system1
> >
> > I thought it would be easier to write a new ssh-ilo agent (I'm almost
> > done :-) than to debug the existing one. But I'm looking for a short
> > howto. I've read some STONITH agents, but they are not completely
> > self-explanatory and I have some questions. Is there a short "how to
> > write a stonith agent" manual which Google and I were not able to find?
> > Or should I post all questions to the list?
> > Here we go:
> >
> > 1. (and most important): What does the status check do if you have an
> >    agent which runs as a cloned resource (my ssh-ilo agent should run
> >    as a cloned resource)? Does it check all nodes? Is it possible to
> >    check the status of a single node?
> > 2. What are the expected return codes?
> >
> > more to follow ;-)
> >
> > regards
> >
> > Christoph &:-)
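
For illustration, here is a minimal sketch of how the ssh reset command quoted above might be wrapped as an external STONITH plugin. It is not the ssh-ilo agent mentioned in this thread. It assumes the usual external-plugin convention (action name as first argument, parameters as environment variables, exit code 0 for success and non-zero for failure), which should be checked against the README.external shipped with cluster-glue. The parameter name ilo_sshkey, the override ILO_SSH_CMD and the use of "show system1" as a status probe are assumptions made for this sketch, and only a subset of the actions is shown.

```shell
#!/bin/sh
# Sketch of an external STONITH plugin built around
#   ssh -i ilo-sshkey ilouser@ilo1 reset system1
# Assumed environment variables: hostlist, ilo_hostname, ilo_user,
# ilo_sshkey (hypothetical name for the key file parameter).

ilo_ssh() {
    # ILO_SSH_CMD is a hypothetical override so the sketch can be
    # dry-run without a real iLO board; it defaults to ssh.
    ${ILO_SSH_CMD:-ssh} -i "$ilo_sshkey" "$ilo_user@$ilo_hostname" "$@"
}

ssh_ilo_agent() {
    case "$1" in
        gethosts)
            # Print the hosts this device can fence, one per line.
            for h in $hostlist; do printf '%s\n' "$h"; done
            ;;
        reset)
            # $2 is the node to fence; with one iLO board per plugin
            # instance the board itself identifies the node.
            ilo_ssh reset system1
            ;;
        status)
            # Assumption: any harmless command that succeeds proves the
            # board is reachable; "show system1" is used as the probe.
            ilo_ssh show system1 >/dev/null
            ;;
        getconfignames)
            # Names of the parameters this plugin expects.
            for n in hostlist ilo_hostname ilo_user ilo_sshkey; do
                printf '%s\n' "$n"
            done
            ;;
        *)
            # Remaining actions (on, off, getinfo-*) are omitted here.
            return 1
            ;;
    esac
}

# Dispatch only when an action was given, so the file can also be
# sourced for a dry run.
if [ $# -gt 0 ]; then
    ssh_ilo_agent "$@"
    exit $?
fi
```

With such a layout each cloned or per-node instance only ever talks to the one iLO board named in its own parameters, so "status" checks that single board rather than all nodes.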