Every so often I get spampd complaining about a time-out while SA is trying to interact with one of my eval functions. I've watched the logs, and what basically happens is that the plug-in *sometimes* goes to sleep when one the (current) first eval rule in a batch is activated. It seems to hit a couple of them mroe than others, which is what strikes me as the most suspicious.
Verbatim log data below is absolutely typical: Mar 21 05:10:59 goose spampd[28292]: debug: running raw-body-text per-line regexp tests; score so far=-99.95 Mar 21 05:10:59 goose spampd[28292]: debug: running full-text regexp tests; score so far=-99.95 Mar 21 05:10:59 goose spampd[28292]: debug: ClamAV: No virus detected Mar 21 05:10:59 goose spampd[28292]: debug: DCCifd is not available: no r/w dccifd socket found. Mar 21 05:10:59 goose spampd[28292]: debug: Running tests for priority: 500 Mar 21 05:10:59 goose spampd[28169]: Failed to run LDAP_MSG_FROM_LIGHT SpamAssassin test, skipping: (Timed out! ) Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO: from=apache.org helo=apache.org by=ehsco.com Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO: from=smtp-vbr11.xs4all.nl helo=smtp-vbr11.xs4all.nl by=apache.org Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO: from=webmail7.xs4all.nl helo=webmail.xs4all.nl by=smtp-vbr11.xs4all.nl Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO: mismatch on HELO: 'webmail.xs4all.nl' != 'webmail7.xs4all.nl' Mar 21 05:10:59 goose spampd[28169]: debug: forged-HELO: from=adsl.xs4all.nl helo= by=webmail.xs4all.nl Everything is hunky-dorey and then poop no-habla-API... You can tell that the plug-in itself wasn't even activated because there's no debug output from it. Also, once spampd recovers it goes right into the next set of tests, and on the next message the plug-in will be working fine again... LDAP_MSG_FROM_LIGHT is by far the most common rule to be cited, and yes it works fine when it doesn't trigger a suicide pact. It is the 23rd eval rule in the cf, if that means anything. I've done some back-end debugging, and there aren't any protocol problems like dropped connections or anything that would suggest network trouble (I've even switched to LDAPI sessions via UNIX domain socket and it still happens). So my first guess is that something in the plug-in has gone into blocking mode. This doesn't seem to happen with any other plug-ins, so I'm guessing this has something to do with one of the modules I'm using, or there's something about my plug-in that's keeping , but does this ring any bells for anybody? Could there be too many open eval calls (there are a couple of dozen in my LDAP cf), excess garbage that needs to be collected more frequently, or anything like that? The only other thing I can think of would be that the LDAP server itself is blocked, but like I said the protocol traces don't show any problems, and it's a pretty common cluster of crashes, mostly failing on the same rules (but succeeding on them the majority of the time). Quite curious. -- Eric A. Hall http://www.ehsco.com/ Internet Core Protocols http://www.oreilly.com/catalog/coreprot/