Hi all,

I'm running cfengine 3.0.4 under a number of Solaris 10 zones across a
number of Logical Domains.

 

Occasionally (it seems to happen just after installation) we get a
build-up of cf-twin processes. They don't seem to be hung per se: they're
consuming CPU cycles and each has a log file open:

 

root 27065     1   0   Oct 01 ?          11:38 /var/opt/CCcfengine3/bin/cf-execd
root 25831 25830   1   Oct 01 ?        1456:15 /var/opt/CCcfengine3/bin/cf-twin -f failsafe.cf
root 25830 27065   0   Oct 01 ?           0:00 sh -c "/var/opt/CCcfengine3/bin/cf-twin" -f failsafe.cf && "/var/opt/CCcfengine

 

More and more copies of the last two (the cf-twin process and its
sh -c parent) then pile up, until we get this:

 

File descriptor 254 of child higher than MAX_FD, check for defunct children

 

Or similar, repeated in the syslog.
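
To spot the build-up before the descriptor limit is reached, the cf-twin processes could be counted from ps output. This is only a minimal Python sketch. It assumes `ps -ef` text in the column order UID, PID, PPID, C, STIME, TTY, TIME, CMD, with each process on a single line and TIME printed as minutes:seconds. The function name `cf_twin_pids` and the sample text (adapted from the listing above, with the truncated tail left off) are illustrative and not part of cfengine.

```python
import re

# Cumulative CPU time as ps prints it here, e.g. 1456:15 (assumed mm:ss).
TIME_RE = re.compile(r"\d+:\d\d")


def cf_twin_pids(ps_output):
    """Return the PIDs of processes whose executable is named cf-twin."""
    pids = []
    for line in ps_output.splitlines():
        tokens = line.split()
        if len(tokens) < 6 or not tokens[1].isdigit():
            continue  # header, blank line, or wrapped continuation line
        # STIME may be one token ("10:15:02") or two ("Oct 01"), so the
        # TIME column is located by its shape rather than by position.
        for i in range(4, len(tokens) - 1):
            if TIME_RE.fullmatch(tokens[i]):
                executable = tokens[i + 1].rsplit("/", 1)[-1]
                if executable == "cf-twin":
                    pids.append(int(tokens[1]))
                break
    return pids


sample = """\
root 27065     1   0   Oct 01 ?          11:38 /var/opt/CCcfengine3/bin/cf-execd
root 25831 25830   1   Oct 01 ?        1456:15 /var/opt/CCcfengine3/bin/cf-twin -f failsafe.cf
root 25830 27065   0   Oct 01 ?           0:00 sh -c "/var/opt/CCcfengine3/bin/cf-twin" -f failsafe.cf
"""

print(cf_twin_pids(sample))  # -> [25831]
```

The `sh -c` wrapper is deliberately not counted, since its executable is `sh`; only the cf-twin process itself is reported.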

 

I've used gcore to dump core from the stuck processes and pulled a
backtrace from each dump to try to figure out what they're doing. It
doesn't tell me much, but it might mean something to someone here:

 

# mdb core.198
Loading modules: [ libc.so.1 libavl.so.1 ld.so.1 ]
> $C
ffbe95a8 CompareVariable+0x14(ffbeba70, 0, ffbeba84, 315d00, ff00, ff0000)
ffbe9610 GetVariable+0x48c(ee174, ffbed368, ffbec2e8, ffbec2ec, ffbed37a, 3ec)
ffbec288 NewScalar+0x94(ee174, ffbed368, ffbed768, 0, ff00, 12)
ffbec2f8 Unix_GetInterfaceInfo+0xdc8(1, 31370000, 19aa48, 313700, ff00, ff0000)
ffbfebb8 CfGetInterfaceInfo+0xc(1, decd8, dece0, d4f98, ffbfec7c, 80808080)
ffbfec18 GenericInitialize+0x104(3, ffbffd54, d9d48, 4, feaa2a00, fec3647c)
ffbffc88 main+0x38(3, ffbffd54, ffbffd64, d4c00, feaa0140, 0)
ffbffcf0 _start+0x108(0, 0, 0, 0, 0, 0)

 

# mdb core.3469
Loading modules: [ libc.so.1 libavl.so.1 ld.so.1 ]
> $C
ffbe96f0 GetVariable+0x494(ee174, ffbed448, ffbec3c8, ffbec3cc, ffbed45a, 3ec)
ffbec368 NewScalar+0x94(ee174, ffbed448, ffbed848, 0, ff00, 12)
ffbec3d8 Unix_GetInterfaceInfo+0xdc8(1, 31370000, 19aa48, 313700, ff00, ff0000)
ffbfec98 CfGetInterfaceInfo+0xc(1, decd8, dece0, d4f98, ffbfed5c, 80808080)
ffbfecf8 GenericInitialize+0x104(3, ffbffe34, d9d48, 4, feaa2a00, fec3647c)
ffbffd68 main+0x38(3, ffbffe34, ffbffe44, d4c00, feaa0140, 0)
ffbffdd0 _start+0x108(0, 0, 0, 0, 0, 0)
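
With more than a couple of dumps, comparing the backtraces by eye gets tedious. The following is a rough Python sketch that pulls the function names out of mdb `$C` output and reports the call path shared by every dump, counted from `_start` inward. It assumes each frame begins with an 8-digit hex address followed by `name+0xoffset(`, as in the output above; the helper names `frame_names` and `common_outer_frames` are made up for illustration.

```python
import re

# One stack frame: frame address, function name, offset, then the arguments.
FRAME_RE = re.compile(r"^[0-9a-f]{8} (\w+)\+0x[0-9a-f]+\(")


def frame_names(mdb_output):
    """Return function names from `$C` output, innermost frame first."""
    names = []
    for line in mdb_output.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:  # prompts, module lists and wrapped argument lines are skipped
            names.append(match.group(1))
    return names


def common_outer_frames(traces):
    """Return the frames shared by all traces, walking inward from _start."""
    shared = []
    for frames in zip(*(reversed(trace) for trace in traces)):
        if len(set(frames)) != 1:
            break
        shared.append(frames[0])
    return list(reversed(shared))


core_198 = """\
> $C
ffbe95a8 CompareVariable+0x14(ffbeba70, 0, ffbeba84, 315d00, ff00, ff0000)
ffbe9610 GetVariable+0x48c(ee174, ffbed368, ffbec2e8, ffbec2ec, ffbed37a, 3ec)
ffbec288 NewScalar+0x94(ee174, ffbed368, ffbed768, 0, ff00, 12)
ffbec2f8 Unix_GetInterfaceInfo+0xdc8(1, 31370000, 19aa48, 313700, ff00, ff0000)
ffbfebb8 CfGetInterfaceInfo+0xc(1, decd8, dece0, d4f98, ffbfec7c, 80808080)
ffbfec18 GenericInitialize+0x104(3, ffbffd54, d9d48, 4, feaa2a00, fec3647c)
ffbffc88 main+0x38(3, ffbffd54, ffbffd64, d4c00, feaa0140, 0)
ffbffcf0 _start+0x108(0, 0, 0, 0, 0, 0)
"""

core_3469 = """\
> $C
ffbe96f0 GetVariable+0x494(ee174, ffbed448, ffbec3c8, ffbec3cc, ffbed45a, 3ec)
ffbec368 NewScalar+0x94(ee174, ffbed448, ffbed848, 0, ff00, 12)
ffbec3d8 Unix_GetInterfaceInfo+0xdc8(1, 31370000, 19aa48, 313700, ff00, ff0000)
ffbfec98 CfGetInterfaceInfo+0xc(1, decd8, dece0, d4f98, ffbfed5c, 80808080)
ffbfecf8 GenericInitialize+0x104(3, ffbffe34, d9d48, 4, feaa2a00, fec3647c)
ffbffd68 main+0x38(3, ffbffe34, ffbffe44, d4c00, feaa0140, 0)
ffbffdd0 _start+0x108(0, 0, 0, 0, 0, 0)
"""

shared = common_outer_frames([frame_names(core_198), frame_names(core_3469)])
print(" <- ".join(shared))
```

For the two dumps above, the shared path runs from `_start` through `GenericInitialize` and `Unix_GetInterfaceInfo` into `GetVariable`, so both processes were caught at the same point in start-up. That narrows down where to look but does not by itself say why they stay there.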

 

 

Does anyone have any idea what's happening and how it can be avoided?
(It's a bit of a long shot, I know.)

 

 

Simon

 

--

Simon Oxwell

ControlCircle

Senior Server Engineer

The Datacentre People

0044 (0)20 7517 6594

Hertsmere House, 2 Hertsmere Road,

simon.oxw...@controlcircle.com

London, E14 4AB

 

 

_______________________________________________
Help-cfengine mailing list
Help-cfengine@cfengine.org
https://cfengine.org/mailman/listinfo/help-cfengine
