On Wed, 2008-12-03 at 02:00 -0800, Björn K wrote:
> Hello,
> 
> I am relatively new to SpamAssassin and have some problems with email which
> seems to get completely different scores when I check them manually than
> when the automatic check upon reception by the Exim mail server is
> performed.
> 
> Before we use an own spam filter the mail was put into an imap folder for an
> external mail service to be read (GMX), filtered and forwarded back to
> another mail box. That system is still working for parts. When a mail is

You mention IMAP folders, but I assume this actually involves forwarding
to another SMTP server, or polling by GMX? If so, SA likely cannot work
out the relay chain properly, and thus tests some of these "internal"
forwarding relays against blacklists instead of the external relay that
actually handed over the mail. As a result, quite a lot of DNSBLs will
not trigger, and your SA performs less effectively than it could.

You can fix this by tweaking trusted_networks and internal_networks. But
that wasn't your question. :)
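
For reference, a minimal sketch of what that could look like in local.cf.
The option names are SA's own; every address below is a placeholder from
the documentation ranges, and whether the forwarding relays belong in
both lists depends on your setup, so treat this as an assumption to
check against the SA docs rather than a drop-in config.

```
# Placeholder: your own MX / Exim host
trusted_networks  192.0.2.0/24
internal_networks 192.0.2.0/24

# Placeholder: the relays that forward the mail back to you
trusted_networks  198.51.100.0/24
internal_networks 198.51.100.0/24
```

Running "spamassassin --lint" after the change should catch syntax
slips.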

> transferred like this I can see the spam score being evaluated twice. For
> example there was a mail containing only a link to dagwizhua -dot- com,
> which is a bad address. It received 6.8 on first run, 3.6 on the second run
> only for a few additional headers added by the external mail service.

This difference might actually be due to the trust path outlined above.
If GMX does polling, they could have correctly tested the external relay
that handed over the mail against blacklists.

Your local run doesn't show any such hits.
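
To illustrate the trust path point, here is a rough Python sketch of the
idea, not SA's actual implementation: walk the Received hops from newest
to oldest, skip everything inside the trusted networks, and query the
DNSBL for the first relay outside them. The helper names and the first
two hop addresses are made up; only 203.145.146.3 and bl.spamcop.net
come from your report.

```python
import ipaddress


def first_external_relay(received_ips, trusted_networks):
    """Return the first relay (newest hop first) outside trusted_networks."""
    nets = [ipaddress.ip_network(n) for n in trusted_networks]
    for ip in received_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets plus the zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


# Hypothetical hops, newest first: own MX, forwarder, real sender.
hops = ["192.0.2.10", "198.51.100.7", "203.145.146.3"]

# Forwarder not trusted: the forwarder itself gets tested.
print(first_external_relay(hops, ["192.0.2.0/24"]))

# Forwarder trusted: the relay that really handed over the mail is tested.
relay = first_external_relay(hops, ["192.0.2.0/24", "198.51.100.0/24"])
print(dnsbl_query_name(relay, "bl.spamcop.net"))
```

In the first case the blacklist is asked about a relay that will almost
never be listed, which is why the DNSBL rules stay silent.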

> However, when I copied the mail into a text file and used spamc to send it
> the /same/ spamd process I got this result:
> [EMAIL PROTECTED]:~$ LANG=C spamc -lR < spam-mail.txt | recode latin1..utf8
> 12.9/5.0

For a better evaluation, we'd need the full X-Spam headers, both as
inserted by your local SA on the first run *and* from the manual second
run. We don't have those, so here's a guess.

> pts  rule name              description
> ---- ---------------------- --------------------------------------------------
>  0.6 NO_REAL_NAME           No full name in sender address
>  1.8 INVALID_DATE           Date header not compliant with RFC 2822
>  0.0 UNPARSEABLE_RELAY      Informational: message has unparseable relay lines
>  1.3 RCVD_IN_BL_SPAMCOP_NET RBL: Relayed via a host listed by
>                             www.spamcop.net
>                [Blocked - see <http://www.spamcop.net/bl.shtml?203.145.146.3>]

Together with the -0.2 AWL adjustment further down, these add up to 3.5.
That is about the 3.6 your automatic scan reported (assuming some
rounding).

>  3.3 URIBL_AB_SURBL         Contains URL in AB list (www.surbl.org)
>                             [URIs: dagwizhua -dot- com]
>  2.6 URIBL_OB_SURBL         Contains URL in OB list (www.surbl.org)
>                             [URIs: dagwizhua -dot- com]
>  3.6 URIBL_SC_SURBL         Contains URL in SC list  (www.surbl.org)
>                             [URIs: dagwizhua -dot- com]

These are moving targets. It is entirely possible that the URI
blacklists hadn't caught up yet when the mail was initially scanned, and
thus they didn't hit on the first run, only later.
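
If you want to see how such a URI lookup is put together, here is a
small Python sketch. It only builds the query name and decodes an
answer; it does no DNS itself. The function names are mine, example.com
is a placeholder domain, and the bit assignments are what I recall from
the SA 3.2-era rules for multi.surbl.org, so treat them as an assumption
and check the SURBL documentation before relying on them.

```python
SURBL_ZONE = "multi.surbl.org"

# Assumed bit values in the last octet of the 127.0.0.x answer.
SURBL_BITS = {2: "SC", 4: "WS", 8: "PH", 16: "OB", 32: "AB", 64: "JP"}


def surbl_query_name(domain, zone=SURBL_ZONE):
    """Build the DNS name to look up: the domain followed by the zone."""
    return domain.strip(".").lower() + "." + zone


def decode_surbl_answer(answer):
    """Map the last octet of an A record answer to the lists that hit."""
    last_octet = int(answer.split(".")[-1])
    return [name for bit, name in SURBL_BITS.items() if last_octet & bit]


print(surbl_query_name("example.com"))    # example.com.multi.surbl.org
print(decode_surbl_answer("127.0.0.50"))  # ['SC', 'OB', 'AB']
```

A domain that is not listed yet simply returns no A record, so the same
mail scanned a few hours apart can go from zero URIBL hits to three.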

> -0.2 AWL                    AWL: From: address is in the auto white-list

Computed from the score history of this sender address and IP block.
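
To make the arithmetic explicit, here is a quick Python check using only
the per-rule scores from your report. The grouping into "header" and
"URI" rules and the variable names are mine. Decimal is used so the
tenths add up exactly.

```python
from decimal import Decimal

# Per-rule scores copied from the quoted spamc report.
header_rules = {
    "NO_REAL_NAME": Decimal("0.6"),
    "INVALID_DATE": Decimal("1.8"),
    "UNPARSEABLE_RELAY": Decimal("0.0"),
    "RCVD_IN_BL_SPAMCOP_NET": Decimal("1.3"),
}
uri_rules = {
    "URIBL_AB_SURBL": Decimal("3.3"),
    "URIBL_OB_SURBL": Decimal("2.6"),
    "URIBL_SC_SURBL": Decimal("3.6"),
}
awl = Decimal("-0.2")

without_uri_hits = sum(header_rules.values(), Decimal("0")) + awl
uri_total = sum(uri_rules.values(), Decimal("0"))
full_total = without_uri_hits + uri_total

print(without_uri_hits)  # 3.5
print(uri_total)         # 9.5
print(full_total)        # 13.0
```

The report rounds each score to one decimal, which would explain the
small gaps to the 3.6 and 12.9 you saw. The three SURBL hits alone
account for 9.5 points.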

> How can the results be so very different on the same spam process? Why would
> a few additional headers make a difference if the Bayes does not seem to add
> anything to the mail and there is no particular rule for those headers? And
> why does a manual scan produce a completely different result if the service
> that creates the actual results is the same process?

See above. It's likely not about the headers, but timing -- that URI
simply wasn't on the blacklists yet at the time of the first scan.

The difference from the GMX score is probably due to the trust path,
plus the SA version used and thus the per-rule scores. I don't remember
off-hand which SA version GMX uses, but you are running an old version,
aren't you? The scores (and rules, mind you) don't match a recent SA
3.2.x.

  guenther


-- 
char *t="[EMAIL PROTECTED]";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
