I noticed the same thing. It seems like this can be a bit of a problem for
those of us who would like to collect spam (and misclassified ham) that is
later fed to sa-learn, or that is used to calibrate local scores via
mass-check. It is my understanding that the Bayes scoring and various rules
process the headers and look at the data in the Received lines. They can't do
this if the Received headers aren't there. Worse, "SpamAssassin" will appear
to be a word that often shows up in spam, because of the string
"with SpamAssassin (2.55 1.174.2.19-2003-05-19-exp)" that appears in the
rewritten Received headers. This seems like a potentially rather serious bug.
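
For anyone who already has a corpus collected this way, here is a minimal
Python sketch of how rewritten messages might be spotted before they are fed
to sa-learn. The function name received_rewritten is made up for
illustration, and the check assumes the rewritten header always looks like
the one quoted below (a single Received line containing "with SpamAssassin").

```python
import email


def received_rewritten(raw: bytes) -> bool:
    """Return True if the only Received header left is the one SA wrote.

    Assumption: a rewritten message carries exactly one Received header,
    and that header contains the text "with SpamAssassin".
    """
    msg = email.message_from_bytes(raw)
    received = msg.get_all("Received") or []
    return len(received) == 1 and "with SpamAssassin" in str(received[0])
```

Messages for which this returns True could be set aside rather than learned
from, since their original delivery path is gone.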

One way out might be to first run 'spamassassin -e' just to test whether the
message is spam and, if it is, pass the message unaltered into a spam mbox
that will be used for later calls to sa-learn and for scoring. Then run SA
again to get the helpful spam report, with attachment. Something like this in
procmail:

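SA=/usr/bin/spamassassin

# Use spamassassin.lock to serialize calls to SA, and to serialize access
# to both spam.mbox and spam_with_report.mbox. As far as I know, procmail
# ignores a local lockfile on a plain nesting block, so the global LOCKFILE
# is set here instead.
LOCKFILE=spamassassin.lock

# 'spamassassin -e' exits non-zero when the message is spam.
:0
* ! ? $SA -e
{
  # It is spam; deposit an unaltered copy into spam.mbox.
  :0c
  spam.mbox

  # Now run SA the regular way to get a report, and deposit
  # the result into spam_with_report.mbox.
  :0w
  | $SA >> spam_with_report.mbox
}

# Release the lock.
LOCKFILE=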

There's a subtle possible race condition here. It is possible (though not
likely) that the network checks could return a different result, or that
auto-learn might adjust Bayes scores, or that auto-whitelist (if enabled)
could cause the second call to SA to return a different score than the first
call (the one executed with the -e switch). It could turn out that a spam
deposited in spam.mbox in its raw, as-delivered form is not classified as
spam in the subsequent call that delivers it to spam_with_report.mbox.
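
One way to sidestep the two-call problem would be to run SA only once and
keep both the raw input and SA's output. The following is a rough Python
sketch of that idea, not something I have run against 2.55. It assumes SA
adds an "X-Spam-Flag: YES" header to messages it marks as spam; the function
names (run_sa, is_spam, append, file_message) are made up for illustration.

```python
import email
import mailbox
import subprocess

SA = "/usr/bin/spamassassin"  # same path as in the procmail recipe


def run_sa(raw: bytes) -> bytes:
    """Run SpamAssassin once as a filter and return its rewritten output."""
    done = subprocess.run([SA], input=raw, stdout=subprocess.PIPE, check=True)
    return done.stdout


def is_spam(rewritten: bytes) -> bool:
    """Assumption: SA marks spam with an 'X-Spam-Flag: YES' header."""
    msg = email.message_from_bytes(rewritten)
    return (msg.get("X-Spam-Flag") or "").strip().upper().startswith("YES")


def append(path: str, raw: bytes) -> None:
    """Append one message to an mbox, holding the mbox lock while writing."""
    box = mailbox.mbox(path)
    box.lock()
    try:
        box.add(raw)
        box.flush()
    finally:
        box.unlock()
        box.close()


def file_message(raw: bytes, classify=run_sa,
                 spam_box="spam.mbox",
                 report_box="spam_with_report.mbox") -> bool:
    """Classify once; if spam, store both the raw and the rewritten copy."""
    rewritten = classify(raw)
    if not is_spam(rewritten):
        return False
    append(spam_box, raw)            # unaltered copy, for sa-learn and scoring
    append(report_box, rewritten)    # SA's copy, with the spam report
    return True
```

A small script that calls file_message(sys.stdin.buffer.read()) could be
piped to from procmail in place of the two SA calls. Since the verdict and
the report come from the same run, the two mboxes cannot disagree about
whether a given message was spam.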

Better, I think, would be if SA didn't mangle the Received headers; my
understanding is that this may be a side effect of using certain
mail-handling library routines.




> -----Original Message-----
> From: Mark Miller
> Sent: Tuesday, August 05, 2003 7:31 AM
> To: [EMAIL PROTECTED]
> Subject: [SAtalk] Header rewrite question
>
>
> Question... Since I upgraded to 2.55, messages that have been marked as
> spam and tossed into the spam folder are missing a lot of the path headers.
> For example, below is the header of a busted message.  Most of the path has
> been truncated and only the 'localhost' remains.  What would cause this?  I
> run Sendmail 8.12.9 and procmail.
>
>
> -----------------------------------------------------
> >From [EMAIL PROTECTED]  Tue Aug  5 03:57:19 2003
> Received: from localhost [127.0.0.1] by quantum.paraphysics.com
>     with SpamAssassin (2.55 1.174.2.19-2003-05-19-exp);
>     Tue, 05 Aug 2003 03:57:25 -0500
>



