Re: [SAtalk] Losing the war against spammers

Odhiambo Washington Thu, 08 Jan 2004 09:34:08 -0800

* Genchev, Sergei <[EMAIL PROTECTED]> [20040108 19:13]: wrote:
> >I have some mail that was received by this particular user. I have put
> >the tarball here: http://ns2.wananchi.com/~wash/SPAM/ and it is in
> >Maildir/ format.
> 
> >Looking forward to your observations/suggestions/recommendations.
> 
> I ran your files through my SpamAssassin setup and got 6 e-mails that scored
> lower than my threshold (5.4 sitewide). A few of the e-mails hit my custom rules
> and the vast majority of them were BAYES_99 or BAYES_90.
> Correct me if I am wrong, but by looking at your scores it seems to me that
> your spamassassin setup is configured to use both network checks and bayes.


Yes, but the bayes portion isn't working. Why? Because when SA runs, it
runs with the privileges of the user whose mail is being scanned. I
believe bayes would require that SA (spamc) is run under a specific
user, no?
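
One way to test the permissions theory is to run a small check as the
user whose mail is being scanned. The following is only a sketch: the
function name check_bayes_access is made up for illustration, the file
names are the ones SpamAssassin's DBM-backed Bayes store uses, and the
directory to check is an assumption that has to be supplied by hand.

```python
import os

# File names used by SpamAssassin's DBM-backed Bayes store.
BAYES_FILES = ("bayes_toks", "bayes_seen")


def check_bayes_access(directory):
    """Report whether the current user can read and write each Bayes file."""
    report = {}
    for name in BAYES_FILES:
        path = os.path.join(directory, name)
        report[name] = {
            "exists": os.path.exists(path),
            "readable": os.access(path, os.R_OK),
            "writable": os.access(path, os.W_OK),
        }
    return report


# Assumed location: the per-user default directory. Substitute the
# directory that actually holds the shared database.
print(check_bayes_access(os.path.expanduser("~/.spamassassin")))
```

If the files show up as unreadable for the scanned user, the database
cannot be consulted for that user's mail regardless of how well it was
trained.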


> None of your mails have any BAYES scores. Did you train your bayes database
> properly? 

Well, when I run 'spamassassin -D --lint', part of the output is:

<cut>

debug: bayes: 95778 tie-ing to DB file R/O /usr/local/spamd/.spamassassin/bayes_toks
debug: bayes: 95778 tie-ing to DB file R/O /usr/local/spamd/.spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: Score set 3 chosen.
debug: Initialising learner
debug: running header regexp tests; score so far=0
debug: running body-text per-line regexp tests; score so far=2.077
debug: bayes corpus size: nspam = 6978, nham = 1554
debug: uri tests: Done uriRE
debug: tokenize: header tokens for *F = "U*ignore D*compiling.spamassassin.taint.org D*spamassassin.taint.org D*taint.org D*org"
debug: tokenize: header tokens for *m = " 1073579062 lint_rules "
debug: bayes token 'somewhat' => 0.00664197530864198
debug: bayes token 'H*F:D*org' => 0.988731707317073
debug: bayes token 'N:H*m:NNNNNNNNNN' => 0.020705859016258
debug: bayes: score = 0.417326798181323
debug: bayes: 95778 untie-ing
debug: bayes: 95778 untie-ing db_toks
debug: bayes: 95778 untie-ing db_seen
</cut>

So, yes, I did train bayes. It's just that it's not in use. I have to
figure out how to get it to be used in my situation.
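
For reference, the two numbers that matter in the debug output above
(the corpus size and the final Bayes score) can be pulled out with a few
lines of Python. This is only a sketch: the function name
summarize_bayes_debug is made up for illustration, and the 200-message
minimum is assumed from SpamAssassin's documented defaults for
bayes_min_ham_num and bayes_min_spam_num, which a given install may
override.

```python
import re

# Assumed default minimum corpus sizes before Bayes is used
# (bayes_min_ham_num / bayes_min_spam_num); not read from any config.
MIN_MESSAGES = 200


def summarize_bayes_debug(text):
    """Extract corpus sizes and the final Bayes score from -D output."""
    corpus = re.search(r"bayes corpus size: nspam = (\d+), nham = (\d+)", text)
    score = re.search(r"bayes: score = ([0-9.]+)", text)
    nspam = int(corpus.group(1)) if corpus else None
    nham = int(corpus.group(2)) if corpus else None
    return {
        "nspam": nspam,
        "nham": nham,
        "score": float(score.group(1)) if score else None,
        # True only when both corpora reach the assumed minimum.
        "trained": corpus is not None
        and nspam >= MIN_MESSAGES
        and nham >= MIN_MESSAGES,
    }


sample = """\
debug: bayes corpus size: nspam = 6978, nham = 1554
debug: bayes: score = 0.417326798181323
"""
print(summarize_bayes_debug(sample))
```

Running the same function over the debug output of a real user's
message, rather than the --lint test message, would show whether the
database is reached at all under that user's privileges.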


> Do you actually use network checks and/or bayes? If not, tell
> spamassassin that you don't, this way most standard scores will be higher. 

Hey, thanks for that advice. I will try telling SA NOT to use bayes and
see the difference. That would also mean I should change the site
threshold to, say, 5.0, yes?
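
The "Score set 3 chosen" line in the debug output fits this advice:
each SpamAssassin rule can carry up to four scores, and which one is
used depends on whether Bayes and network tests are enabled. A minimal
sketch of that selection follows. The function name score_set is made up
for illustration, and the mapping follows the documented meaning of the
four score columns rather than SpamAssassin's own code.

```python
def score_set(use_bayes, use_network):
    """Return the score-set index (0-3) for the enabled features.

    0: no Bayes, no network tests
    1: no Bayes, network tests
    2: Bayes, no network tests
    3: Bayes and network tests
    """
    return (2 if use_bayes else 0) + (1 if use_network else 0)


for bayes in (False, True):
    for net in (False, True):
        print("bayes=%-5s network=%-5s -> score set %d"
              % (bayes, net, score_set(bayes, net)))
```

With Bayes switched off and network checks left on, set 1 would be
used, which is where the higher standard scores mentioned above come
from.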


> I HIGHLY recommend using bayes though. Network checks are great but they could
> be slow so I personally cannot afford to run them.

I would really like to use bayes, but it looks like my configuration for
spam filtering doesn't allow it. I run Exim and I only do spam filtering
via procmail (for some political reasons).




        cheers
       - wash 
+----------------------------------+-----------------------------------------+
Odhiambo Washington                     . WANANCHI ONLINE LTD (Nairobi, KE)  |
<wash at wananchi dot com>              . 1ere Etage, Loita Hse, Loita St.,  |
GSM: (+254) 722 743 223                 . # 10286, 00100 NAIROBI             |
GSM: (+254) 733 744 121                 . (+254) 020 313 985 - 9             |
+---------------------------------+------------------------------------------+
"Oh My God! They killed init! You Bastards!"  
                                                 --from a /. post
