Hi all,

I'm running cfengine 3.0.4 in a number of Solaris 10 zones across a number of Logical Domains. Occasionally (it seems to be just after installation) we get a build-up of cf-twin processes. They don't seem to be hung per se: they're consuming some CPU cycles and they have a log file open:

    root 27065     1 0 Oct 01 ?   11:38 /var/opt/CCcfengine3/bin/cf-execd
    root 25831 25830 1 Oct 01 ? 1456:15 /var/opt/CCcfengine3/bin/cf-twin -f failsafe.cf
    root 25830 27065 0 Oct 01 ?    0:00 sh -c "/var/opt/CCcfengine3/bin/cf-twin" -f failsafe.cf && "/var/opt/CCcfengine

and then more and more of the last two, until we get this:

    File descriptor 254 of child higher than MAX_FD, check for defunct children

or similar, repeated in the syslog.

I've used gcore to dump the stuck processes and get backtraces, to try to figure out what they're doing. That doesn't help me much, but it might mean something to someone here:

    # mdb core.198
    Loading modules: [ libc.so.1 libavl.so.1 ld.so.1 ]
    > $C
    ffbe95a8 CompareVariable+0x14(ffbeba70, 0, ffbeba84, 315d00, ff00, ff0000)
    ffbe9610 GetVariable+0x48c(ee174, ffbed368, ffbec2e8, ffbec2ec, ffbed37a, 3ec)
    ffbec288 NewScalar+0x94(ee174, ffbed368, ffbed768, 0, ff00, 12)
    ffbec2f8 Unix_GetInterfaceInfo+0xdc8(1, 31370000, 19aa48, 313700, ff00, ff0000)
    ffbfebb8 CfGetInterfaceInfo+0xc(1, decd8, dece0, d4f98, ffbfec7c, 80808080)
    ffbfec18 GenericInitialize+0x104(3, ffbffd54, d9d48, 4, feaa2a00, fec3647c)
    ffbffc88 main+0x38(3, ffbffd54, ffbffd64, d4c00, feaa0140, 0)
    ffbffcf0 _start+0x108(0, 0, 0, 0, 0, 0)

    # mdb core.3469
    Loading modules: [ libc.so.1 libavl.so.1 ld.so.1 ]
    > $C
    ffbe96f0 GetVariable+0x494(ee174, ffbed448, ffbec3c8, ffbec3cc, ffbed45a, 3ec)
    ffbec368 NewScalar+0x94(ee174, ffbed448, ffbed848, 0, ff00, 12)
    ffbec3d8 Unix_GetInterfaceInfo+0xdc8(1, 31370000, 19aa48, 313700, ff00, ff0000)
    ffbfec98 CfGetInterfaceInfo+0xc(1, decd8, dece0, d4f98, ffbfed5c, 80808080)
    ffbfecf8 GenericInitialize+0x104(3, ffbffe34, d9d48, 4, feaa2a00, fec3647c)
    ffbffd68 main+0x38(3, ffbffe34, ffbffe44, d4c00, feaa0140, 0)
    ffbffdd0 _start+0x108(0, 0, 0, 0, 0, 0)

Both cores are inside GetVariable, called from Unix_GetInterfaceInfo during GenericInitialize.

Does anyone have any idea what's happening and how it can be avoided? (Pretty much a long shot, I know.)

Simon

-- 
Simon Oxwell                      ControlCircle
Senior Server Engineer            The Datacentre People
0044 (0)20 7517 6594              Hertsmere House, 2 Hertsmere Road,
simon.oxw...@controlcircle.com    London, E14 4AB