Hi folks,
I'm pretty new to SpamAssassin. I have two MXes running sendmail 8.13.6,
smtp-vilter 1.2.4, SA 3.1.1 and clamav 0.88. Sendmail uses smtp-vilter
as a milter, and smtp-vilter calls clamd and spamd to scan incoming
messages at the point of delivery.
For most messages, it works fine - clamd does the virus scanning in a
flash, and SA takes between 0.1 seconds and 3 seconds to do its job.
However, for some emails, the SA scan goes over the 20 second timeout
that we currently have defined (in /etc/smtp-vilter/spamd.conf). When
this happens, sendmail tells the sender "451 4.3.2 - Please try again
later".
I turned on SA debugging for a while, and when the "451 4.3.2" error
happens, I see this in the logs:
Jun 7 15:36:56 server sendmail[573]: k575alO4000573:
from=<[EMAIL PROTECTED]>, size=10049, class=0, nrcpts=1,
msgid=<[EMAIL PROTECTED]>, proto=SMTP, daemon=MTA,
relay=[213.85.143.136]
Jun 7 15:36:56 server sendmail[573]: k575alO4000573: Milter add:
header: X-SMTP-Vilter-Version: 1.2.4
Jun 7 05:36:56 server smtp-vilter[527]: message contains no virus
Jun 7 15:36:56 server sendmail[573]: k575alO4000573: Milter add:
header: X-SMTP-Vilter-Virus-Backend: clamd
Jun 7 15:36:56 server sendmail[573]: k575alO4000573: Milter add:
header: X-SMTP-Vilter-Status: clean
Jun 7 15:36:56 server sendmail[573]: k575alO4000573: Milter add:
header: X-SMTP-Vilter-clamd-Virus-Status: clean
Jun 7 15:36:56 server spamd[484]: spamd: connection from
localhost.localdomain [127.0.0.1] at port 53693
Jun 7 15:36:56 server spamd[484]: spamd: checking message
<[EMAIL PROTECTED]> for (unknown):508
Jun 7 15:36:56 server spamd[484]: bayes: tie-ing to DB file R/O
/var/vscan/.spamassassin/bayes_toks
Jun 7 15:36:56 server spamd[484]: bayes: tie-ing to DB file R/O
/var/vscan/.spamassassin/bayes_seen
Jun 7 15:36:56 server spamd[484]: bayes: found bayes db version 3
Jun 7 15:36:56 server spamd[484]: bayes: DB journal sync: last sync:
1149658532
Jun 7 15:36:57 server spamd[484]: bayes: DB journal sync: last sync:
1149658532
Jun 7 15:36:57 server spamd[484]: bayes: corpus size: nspam = 88503,
nham = 231793
Jun 7 15:36:57 server spamd[484]: bayes: header tokens for *F =
"U*silviooleg D*lpnet.com.br D*com.br D*br"
Jun 7 15:36:57 server spamd[484]: bayes: header tokens for *R =
"U*silviooleg D*lpnet.com.br D*com.br D*br"
Jun 7 15:36:57 server spamd[484]: bayes: header tokens for *M = "
333694818070 166676502833 lpnet com br "
(then 20-30 more bayes token lines, then...)
Jun 7 15:36:57 server spamd[484]: bayes: score = 0.99938356891068
Jun 7 15:36:57 server spamd[484]: bayes: DB journal sync: last sync:
1149658532
Jun 7 15:36:57 server spamd[484]: bayes: untie-ing
Jun 7 15:36:57 server spamd[484]: bayes: untie-ing db_toks
Jun 7 15:36:57 server spamd[484]: bayes: untie-ing db_seen
(then this:)
Jun 7 05:37:16 server smtp-vilter[527]: spamd: lost header - no
response to CHECK cmd
Jun 7 05:37:16 server smtp-vilter[527]: error during spam scan of file
tmp/vilter.XXXXzJ7UnO
Jun 7 05:37:16 server smtp-vilter[527]: temporarily failing message
Jun 7 15:37:16 server sendmail[573]: k575alO4000573: Milter: data,
reject=451 4.3.2 Please try again later
Jun 7 15:37:16 server sendmail[573]: k575alO4000573:
to=<[EMAIL PROTECTED]>, delay=00:00:21, pri=40049, stat=Please
try again later
(which happens when the 20 second timeout is reached. Then this:)
Jun 7 15:37:21 server spamd[484]: learn: auto-learn: currently using
scoreset 3, recomputing score based on scoreset 1
Jun 7 15:37:21 server spamd[484]: learn: auto-learn: message score:
35.038, computed score for autolearn: 23.6
Jun 7 15:37:21 server spamd[484]: learn: auto-learn? ham=0.1, spam=12,
body-points=22.51, head-points=7.541, learned-points=3.5
Jun 7 15:37:21 server spamd[484]: learn: auto-learn? yes, spam (23.6 > 12)
Jun 7 15:37:21 server spamd[484]: learn: initializing learner
Jun 7 15:37:21 server spamd[484]: learn: learning spam
Jun 7 15:37:22 server spamd[484]: bayes: tie-ing to DB file R/W
/var/vscan/.spamassassin/bayes_toks
Jun 7 15:37:22 server spamd[484]: bayes: tie-ing to DB file R/W
/var/vscan/.spamassassin/bayes_seen
Jun 7 15:37:22 server spamd[484]: bayes: found bayes db version 3
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for *F =
"U*silviooleg D*lpnet.com.br D*com.br D*br"
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for *R =
"U*silviooleg D*lpnet.com.br D*com.br D*br"
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for *M = "
333694818070 166676502833 lpnet com br "
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for To =
"U*auniac10 D*chec.scu.edu.au D*scu.edu.au D*edu.au D*au"
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for MIME-Version
= " "
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for *c = "
Multipart/related; type="multipart/alternative"; ------------ HHHHHHHH
. HHHHHHHH"
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for *RT = " "
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for *RU = " [
ip=213.85.143.136 rdns= helo= by=server ident= envfrom= intl=0 id= auth=
] [ ip=196.93.229.61 rdns=linx helo=linx by=bbs.lpnet.com.br ident=
envfrom= intl=0 id= auth= ]"
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for *r = "
linx (196.93.229 ip*196.93.229.61 ) by bbs.lpnet.com.br
<[EMAIL PROTECTED]>; "
Jun 7 15:37:22 server spamd[484]: bayes: header tokens for *r = "
linx (196.93.229 ip*196.93.229.61 ) by bbs.lpnet.com.br
<[EMAIL PROTECTED]>; bbs.lpnet.com.br by 213.85.143
ip*213.85.143.136 (8.9.3/8.9.3) <[EMAIL PROTECTED]>; "
Jun 7 15:37:22 server spamd[484]: bayes: learned
'[EMAIL PROTECTED]', atime: 1149658639
Jun 7 15:37:22 server spamd[484]: bayes: untie-ing
Jun 7 15:37:22 server spamd[484]: bayes: untie-ing db_toks
Jun 7 15:37:22 server spamd[484]: bayes: untie-ing db_seen
Jun 7 15:37:22 server spamd[484]: bayes: files locked, now unlocking lock
Jun 7 15:37:22 server spamd[484]: learn: initializing learner
Jun 7 15:37:22 server spamd[484]: spamd: identified spam (35.0/8.0) for
(unknown):508 in 25.1 seconds, 10287 bytes.
Jun 7 15:37:22 server spamd[484]: spamd: result: Y 35 -
BAYES_99,EXTRA_MPART_TYPE,HTML_40_50,HTML_IMAGE_ONLY_04,HTML_MESSAGE,HTML_SHORT_LINK_IMG_1,MIME_HTML_MOSTLY,RCVD_IN_DSBL,RCVD_IN_WHOIS_BOGONS,RCVD_IN_XBL,UNPARSEABLE_RELAY,URIBL_JP_SURBL,URIBL_OB_SURBL,URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL
scantime=25.1,size=10287,user=(unknown),uid=508,required_score=8.0,rhost=localhost.localdomain,raddr=127.0.0.1,rport=53693,mid=<[EMAIL PROTECTED]>,bayes=0.99938356891068,autolearn=spam
So, I have a few questions:
* Anyone know why there would be such a gap between SA scanning for spam
(up to 15:36:57 in the example above) and learning spam/ham (from 15:37:21)?
* The Bayes database is in Berkeley DB format... would having it in an
RDBMS help, perhaps?
* I only turned on SA debugging for bayes and learn to get the above log
entries. Are bayes, learn and dns the only debugging flags available?
Maybe next time I should turn on dns debugging as well?
* One thing I may want to do is configure the email server to only scan
for spam for messages coming from the Internet. smtp-vilter does not
seem to have this functionality, but it appears a few other milter
packages do (amavisd-new, MIMEDefang?). Any thoughts on changing from
smtp-vilter to one of these other milter packages? Difficulties, traps
etc.? Seems like it should be pretty straightforward...
* Any other comments or thoughts?
Thanks in advance,
Guy.