Hey Ron,

I have the following in a promise. You may want to adjust the values passed to ago() there. I run cf-agent once an hour. If I see a cf-agent process more than 2 hours old, the current execution kills that process and raises a class that I report on. You could use the same approach to kill cf-execd.
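
For cf-execd, a variant along these lines might work. This is an untested sketch modeled on the cfagent_cruft body included later in this message; the names verify_cf_execd_doesnt_pile_up, execd_cruft and execd_haywire are made up for illustration. One caveat: cf-execd is normally a long-running daemon, so an age-based selection like this would also match the legitimate daemon once it is more than 2 hours old. The sketch assumes something else (for example a separate processes promise using restart_class) brings cf-execd back afterwards.

```cf3
processes:

  linux|sunos_5_10::

    "cf-execd"
      handle         => "verify_cf_execd_doesnt_pile_up",  # hypothetical handle
      process_select => execd_cruft,
      signals        => { "kill" },
      classes        => if_repaired("execd_haywire");      # hypothetical class

body process_select execd_cruft
{
  # Match on the command name, same pattern style as cfagent_cruft.
  command        => ".*cf-execd$";

  # Select processes started between the epoch and 2 hours ago.
  # Assumption: 2 hours is long enough that a healthy scheduled run is
  # never caught; note this also matches the main daemon once it is that old.
  stime_range    => irange(0,ago(0,0,0,2,0,0));

  # A process is selected only if both the command and start-time tests match.
  process_result => "command.stime";
}
```
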
Hope this helps. Really, you should look at the verbose output of cf-agent -v to determine what the agent is hanging on. I've usually found it to be something like a stale NFS mount point.

Cheers
Mike

processes:

  linux|sunos_5_10::

    "cf-agent"
      handle         => "verify_cf_agent_doesnt_pile_up",
      process_select => cfagent_cruft,
      signals        => { "kill" },
      classes        => if_repaired("cfagent_haywire");

####################################################################################

body process_select cfagent_cruft
{
  command => ".*cf-agent$";

  # Arguments for the ago function:
  # arg1 : Years, in the range 0,1000
  # arg2 : Months, in the range 0,1000
  # arg3 : Days, in the range 0,1000
  # arg4 : Hours, in the range 0,1000
  # arg5 : Minutes, in the range 0,1000
  # arg6 : Seconds, in the range 0,40000

  # Kill any cf-agent process that's been lingering around, but stop at
  # 2 hours ago so we don't kill our current execution.
  stime_range    => irange(0,ago(0,0,0,2,0,0));
  process_result => "command.stime";
}

On 6/21/12 11:01 AM, "Ron Parker" <rdpar...@gmail.com> wrote:

>In testing that the promises for a given machine were sufficient to
>reconfigure it from scratch, I did a fresh OS install to a VM,
>bootstrapped CFE and manually started cf-agent. After about 30
>minutes and the fourth email from the system, it had converged. I
>noted a few things that were missing, e.g. ssh configuration. So I
>made the policy changes over the course of the day.
>
>Last night before leaving I rolled the VM back to the baseline OS
>install and bootstrapped CFE to test my changes. After about 10
>minutes I got an email reporting what happened during initial run but
>no reports thereafter. This morning I get in and find the machine has
>182 (and climbing) copies of cf-execd and cf-agent. Other than my
>seemingly minor tweaks the only difference I am aware of is that the
>second time I did not run cf-agent manually at all, I let cf-execd
>start the processes.
>
>I suspect it is somehow related to initial package installation and
>there is a possibly related discussion on the list from last year
>https://cfengine.com/forum/read.php?3,19505,19598, but I saw no clear
>resolution.
>
>My questions are, how can I see what the active copy of cf-agent is
>doing that it did not complete? It does not have any child processes
>showing up in pstree.
>
>The second question is how may I prevent this in the future but still
>have the system converge in a reasonable amount of time?
>
>--
>Ron Parker
>Don't type things you find on the Internet into your computer!
>:(){ :|:&};:
>_______________________________________________
>Help-cfengine mailing list
>Help-cfengine@cfengine.org
>https://cfengine.org/mailman/listinfo/help-cfengine