Let me preface this with the fact that I may not be providing all the
information needed to figure this one out.  I'm pretty new to
cfengine, so just let me know if there is something I can run or
output I can share that will help.

Ok, the problem: cfengine segfaults when it hits the "links" section,
but only sometimes. It appears to have something to do with locks. If
I run cfagent again immediately after a segfault, it finishes without
segfaulting every time (probably because it is skipping something
due to its anti-spamming logic).

I haven't done any definitive tests, but it seems to happen more often
after link statements where I use the "pling" ( ->! ) operator to
force cfengine to make the link.  Cfengine runs every 15 minutes, so I
added "expireafter=5" to the link in an effort to deal with this, but
it seems to have had no effect on the crashing.

------------------------------------------
Here is the line from the links section:

/usr/local/etc/freetds.conf  ->! /usr/local/nagios/misc/freetds.conf type=absolute expireafter=5

------------------------------------------
Here is the last portion of the output when running "cfagent -v --no-splay":

Checking copy from fulcrum:/cfeng_config/cfengine/repo/nagios/etc/nrpe.cfg to /usr/local/nagios/etc/nrpe.cfg
Saving the setuid log in /var/cfengine/cfagent.nag4.log

*********************************************************************
Main Tree Sched: links pass 1 @ Thu May  4 11:56:56 2006
*********************************************************************

cfeng: Link (/usr/local/nagios/libexec/event->./eventhandlers) exists.
cfeng: Link (/usr/local/nagios/etc/nag_specific.cfg->./nag_specific.nag4) exists.
cfeng: Link (/usr/local/nagios/etc/resource.cfg->./resource.nag4) exists.
Couldn't obtain lock for lock.cfagent_conf.nag4.link._etc_init_d_nagios__usr_local_nagios_misc_nagios_3516 (already running!)
Couldn't obtain lock for lock.cfagent_conf.nag4.link._etc_init_d_nsca__usr_local_nagios_misc_nsca_555 (already running!)
cfeng: Lock lock.cfagent_conf.nag4.link._usr_local_etc_freetds_conf__usr_local_nagios_misc_freetds_conf_3760 expired (after 5/5 minutes)
Trying to kill expired process, pid 4086
cfeng: Link (/usr/local/etc/freetds.conf->/usr/local/nagios/misc/freetds.conf) exists.
Segmentation fault
------------------------------------------

I've saved the output of "ps auxw" and looked for the pids that
cfengine refers to, but they are never there, so I'm not sure
what to make of that.
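
For anyone who wants to repeat that pid check, here is a rough Python
sketch. It is not part of cfengine; the function names are made up for
illustration, and it assumes the verbose output always reports the pid
in the "Trying to kill expired process, pid N" form shown above. It
pulls the pids out of saved cfagent output and asks the kernel whether
each one still exists.

```python
import os
import re

# Assumed message format, taken from the verbose output quoted above.
PID_PATTERN = re.compile(r"expired process, pid (\d+)")


def pids_in_log(text):
    """Return the pids that the verbose output says it tried to kill."""
    return [int(pid) for pid in PID_PATTERN.findall(text)]


def pid_is_alive(pid):
    """Report whether a process with this pid currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


if __name__ == "__main__":
    sample = "Trying to kill expired process, pid 4086\n"
    for pid in pids_in_log(sample):
        state = "alive" if pid_is_alive(pid) else "not running"
        print(pid, state)
```

This only tells you whether the pid exists at the moment the script
runs, so it has the same timing caveat as grepping a saved "ps auxw".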

The servers are all Red Hat Fedora Core 4 (FC4), running cfengine 2.1.20.


Any clues?

~trask
