Re: Really hard-to-filter spam

Thomas Cameron via users Wed, 02 Aug 2023 14:18:29 -0700

On 8/2/23 15:52, David B Funk wrote:

Regardless, if a message has never been seen before and has little correlation to earlier messages its Bayes should hit someplace in the 40% to 60% range.
The fact that it hit 00% indicates a strong correlation to lots of ham (or something is screwy with your Bayes).


OK, here's what I got just now:

[thomas.cameron@mail-east ~]$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      41449          0  non-token data: nspam
0.000          0      49720          0  non-token data: nham
0.000          0     162741          0  non-token data: ntokens
0.000          0 1689089541          0  non-token data: oldest atime
0.000          0 1691009577          0  non-token data: newest atime
0.000          0 1691007146          0  non-token data: last journal sync atime
0.000          0 1690991018          0  non-token data: last expiry atime
0.000          0    1382400          0  non-token data: last expire atime delta
0.000          0      13879          0  non-token data: last expire reduction count
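
For anyone reading the dump: the atime values are Unix epoch seconds, and the expire atime delta is a duration in seconds. A minimal sketch for making them readable, assuming GNU date (the `-d @SECONDS` form is a GNU extension; `epoch_to_utc` is just an illustrative helper name, not part of SpamAssassin):

```shell
# Convert a Unix epoch timestamp, as printed by `sa-learn --dump magic`,
# to a readable UTC date. Assumes GNU date ("-d @SECONDS" is GNU-specific).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

epoch_to_utc 1689089541   # oldest atime from the dump above
epoch_to_utc 1691009577   # newest atime from the dump above

# The expire atime delta is a span of seconds; there are 86400 seconds per day.
echo "$(( 1382400 / 86400 )) days"
```

Read this way, the oldest token in the database was last touched on 2023-07-11, and the last expiry run used a 16-day window.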

I can absolutely re-train Bayes. I am kind of an email pack-rat, so I have over a gig of saved known good emails in various folders. I have SA set up so that emails are scanned individually on a per user basis via procmail rule:


[thomas.cameron@mail-east ~]$ head .procmailrc
MAILDIR=$HOME/mail
LOGFILE=$MAILDIR/procmail.log

:0fw: spamassassin.lock
* < 512000
| spamassassin

I have the users move spam to an imap folder, and then run (via the user's cron job):


sa-learn --mbox --spam /home/[username]/mail/spam

If something is flagged as spam and it's not supposed to be, I have them copy it to the ham folder and I run (also via cron job):


sa-learn --mbox --ham /home/[username]/mail/spam

For my email account, I've used my inbox and various other folders to train Bayes in the past (although it's definitely been a while since I did Bayes maintenance), but I have zero issue nuking my personal Bayes data and starting over.
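
Starting over could look roughly like the sketch below. It is only an outline under assumptions: the two mbox paths passed in are placeholders, the `run` and `retrain_bayes` helpers are hypothetical names, and it should be run as the user who owns the Bayes database. With `DRY_RUN=1` the commands are printed rather than executed, so the sequence can be checked first.

```shell
# Sketch of a from-scratch Bayes retrain for a single user.
# With DRY_RUN=1 each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: retrain_bayes HAM_MBOX SPAM_MBOX (both paths are placeholders).
retrain_bayes() {
    ham_mbox=$1
    spam_mbox=$2
    run sa-learn --clear                     # wipe the existing Bayes database
    run sa-learn --ham --mbox "$ham_mbox"    # learn known-good mail
    run sa-learn --spam --mbox "$spam_mbox"  # learn known-bad mail
    run sa-learn --dump magic                # check nspam/nham afterwards
}

# Dry run with assumed folder locations; nothing is modified.
DRY_RUN=1 retrain_bayes "$HOME/mail/ham" "$HOME/mail/spam"
```

Dropping `DRY_RUN=1` runs the commands for real. The `--clear` step is destructive, so it is worth backing up the existing database first.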


Thoughts?

--
Thomas
