Hi Aleksey

I haven't been burnt by this (yet), so you may want to take this with a
grain of salt :)

I'd go for a two-pronged approach here. As you say, I'd count root's
instances of cf-agent and use abortclasses to give up if the count is
higher than 1. To be effective, this check should happen as early as
possible in the run; it gives a long-running cf-agent a chance to complete.
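
To make the counting logic concrete, here is a rough sketch in Python (not
CFEngine policy; in a real setup the result would have to define a class
listed in abortclasses). The function names are hypothetical, and I'm
assuming a ps that accepts "-o user= -o comm=".

```python
import subprocess


def count_root_agents(ps_output: str) -> int:
    """Count root-owned cf-agent lines in 'user comm' formatted ps output."""
    count = 0
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "root" and fields[1] == "cf-agent":
            count += 1
    return count


def should_abort(ps_output: str) -> bool:
    """Return True when more than one root-owned cf-agent is running.

    The agent doing the check is itself in the process table, so a count
    of 1 is normal and anything higher means an earlier run is still alive.
    """
    return count_root_agents(ps_output) > 1


def current_ps_output() -> str:
    """Fetch 'user comm' lines for all processes (assumes a POSIX-style ps)."""
    result = subprocess.run(
        ["ps", "-e", "-o", "user=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling should_abort(current_ps_output()) gives the yes/no answer; the
parsing is kept separate from the ps call so it can be tried on canned
input first.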

However, nobody can guarantee that a cf-agent run won't hang forever
(for whatever reason). In that case, the approach above would completely
block your cfengine instance. So, at some regular interval, e.g. once
every 4 hours, I'd do the same check and actively kill spurious
instances. Of course, this check must happen _before_ the one above, and
should be the very first thing the agent does.
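
Again only as a sketch of the idea in Python rather than actual policy: the
"safety net" boils down to finding every root-owned cf-agent other than the
current one and signalling it. The function names are hypothetical, the
input is assumed to come from something like "ps -e -o user= -o pid= -o
comm=", and the 4-hour scheduling is left to whatever invokes this.

```python
import os
import signal


def other_root_agents(ps_output: str, own_pid: int) -> list:
    """Return PIDs of root-owned cf-agent processes, excluding own_pid.

    Expects lines formatted as 'user pid comm'.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        user, pid, comm = fields[0], int(fields[1]), fields[2]
        if user == "root" and comm == "cf-agent" and pid != own_pid:
            pids.append(pid)
    return pids


def terminate(pids, sig=signal.SIGTERM) -> list:
    """Send sig to each PID and return the PIDs that were actually signalled."""
    signalled = []
    for pid in pids:
        try:
            os.kill(pid, sig)
            signalled.append(pid)
        except ProcessLookupError:
            # The process exited between the ps snapshot and the kill.
            pass
    return signalled
```

Because this runs before the abort check, once the stale instances are gone
the count drops back to 1 and the current run is free to proceed.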



On 07/09/12 05:49, Aleksey Tsalolikhin wrote:
> 1. Abort the CFEngine agent run if there is an earlier instance of it
> already running.  (Do this via abortclasses)

Like this, but with the "safety net" as above.


> 2. Make a promise that cf-agent will kill earlier instances of
> cf-agent.  I noticed cf-agent won't signal itself, haven't played
> around if it will signal another instance of cf-agent.  even if it's
> averse to shooting another cf-agent, we can kill it using an external
> shell command.

I don't like this: if a cf-agent run legitimately takes more than 5 minutes
to complete, it would never get a chance to finish.


> 3. Set a timeout on every commands type promise (doesn't really
> address the scenario where a complete native cf-agent run takes 5
> minutes and 01 seconds, so runs overlap, thus loading the host server)

I've never used timeout, so no clue.

Ciao
-- bronto
