On Mon, 14 Oct 2013, Stan Hoeppner wrote:
On 10/14/2013 2:47 PM, Adam Katz wrote:
On 10/12/2013 09:26 AM, Stan Hoeppner wrote:
These two rules are adding 4.0 pts [...]
Content analysis details: (4.8 points, 4.2 required)
pts rule name description
---- ---------------------------------------------------------------------
2.8 FSL_HELO_BARE_IP_2 FSL_HELO_BARE_IP_2
1.2 RCVD_NUMERIC_HELO Received: contains an IP address used for HELO
0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60%
[score: 0.5314]
The others have addressed the "two rules" you mentioned, so I'll leave
that alone in this email.
There's more here than that: If you're using Bayes, you have to train
it. Right now, it's hurting you: Those 0.8 points should be some
negative value, perhaps -1.9 or -0.5 (the default scores for BAYES_00
and BAYES_05), which would then have made that message score 2.1 or 3.5,
both of which are below your 4.2 threshold (which is already too low!).
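To make the arithmetic above explicit, here is a minimal sketch that recomputes the total from the quoted report with a different Bayes score swapped in. The rule scores and the 4.2 threshold come from the report; the helper name rescored is only illustrative, and -1.9 and -0.5 are the default BAYES_00 and BAYES_05 scores mentioned above.

```python
# Rule scores copied from the report quoted above.
HITS = {
    "FSL_HELO_BARE_IP_2": 2.8,
    "RCVD_NUMERIC_HELO": 1.2,
    "BAYES_50": 0.8,
}
REQUIRED = 4.2  # threshold from the same report


def rescored(hits, bayes_score):
    """Total score if the Bayes rule had contributed bayes_score instead.

    Hypothetical helper for illustration; not part of SpamAssassin.
    """
    non_bayes = sum(
        score for name, score in hits.items() if not name.startswith("BAYES_")
    )
    return round(non_bayes + bayes_score, 1)


if __name__ == "__main__":
    for label, score in (("BAYES_00", -1.9), ("BAYES_05", -0.5)):
        total = rescored(HITS, score)
        verdict = "below" if total < REQUIRED else "at or above"
        print(f"{label}: total {total} is {verdict} the {REQUIRED} threshold")
```

The two non-Bayes rules contribute 4.0 points on their own, so the Bayes score alone decides which side of the threshold this message lands on.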
There's no doubt my Bayes isn't working. I ran a few hundred each of
ham and spam through sa-learn just after installing SA some year+ ago.
I haven't regularly fed it since, though I have run through maybe a few
dozen spam that weren't scored high enough. And I think I may have
inadvertently run through one or two msgs that had anti-Bayesian text
blocks in them: the Bible verses, Wikipedia content, etc.
I just ran 120 hams through, about half were msgs tagged previously with
BAYES_60 through BAYES_95.
~$ sa-learn --ham --mbox --progress /home/stan/mail/ham
Learned tokens from 0 message(s) (0 message(s) examined)
Obviously there's a problem with no tokens learned. A few questions:
1. Is the database the problem? If so...
When it says "(0 message(s) examined)", that shows it was unable to
parse -any- messages out of that input file. This tends to imply that
the "/home/stan/mail/ham" file is either not in "mbox" format or is an
empty mailbox.
First thing to fix: get your input recognised as messages. Then see how
they're being learned.
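One rough way to check whether a file parses as mbox at all is to count the messages with Python's standard mailbox module. This is only a sketch: it uses Python's mbox parser, not the one sa-learn uses, so the two may disagree on edge cases. The function name count_mbox_messages is made up for illustration.

```python
import mailbox


def count_mbox_messages(path):
    """Return how many messages Python's mbox parser finds in the file.

    Hypothetical helper. A result of 0 suggests the file is empty or has
    no "From " separator lines, i.e. it is probably not mbox format.
    """
    # create=False: raise an error for a missing file instead of creating it.
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        print(f"{path}: {count_mbox_messages(path)} message(s)")
```

Run it with the mailbox path as an argument, for example /home/stan/mail/ham. If it also reports 0 messages, the file itself is the likely problem (wrong format, such as a maildir or a single raw message, or simply empty) rather than the Bayes database.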
--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{