On 12/16/2009 9:42 AM, Rajkumar S wrote:
On Wed, Dec 16, 2009 at 1:07 PM, Yet Another Ninja <sa-l...@alexb.ch> wrote:
I don't do any "manual" training, ever. SA's butler, "autolearn", does it
for me.
bayes_auto_learn 1
In this case, if a new spam arrives and does not hit any other
rules, wouldn't it be classified as ham? Also, I need Bayes to help
me with borderline cases, such as those scoring, say, 3 - 5 when my
required_score is 6.5. Most of the new spam that gets past scores in the
range of 3 - 5 on my system. Autolearn does not help here either.
I am also testing autolearn; I'm just wondering how others are handling
these issues.
The primary defense against zero-day spam is, I think, to greylist.
Hopefully, by the time the sender comes around again to retry, the honeypot
projects will have added the IP address or URL to various
blacklists. (Or it will be listed in Pyzor, Razor, DCC...)
In general, I don't rely on auto-learn for the marginal stuff; there is
too big a chance that it will learn incorrectly. So I don't auto-train on
a message whose score falls inside the -2 to +10 range. Whatever does fall
inside that range gets manually sorted into "train as spam/ham" folders.
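
For what it's worth, a cutoff like this can probably be expressed with the
autolearn thresholds in local.cf. This is only a minimal sketch, assuming the
stock AutoLearnThreshold plugin is loaded; the -2 and 10 values simply mirror
the range above and are not a recommendation:

```
# local.cf (sketch): auto-learn only outside the -2 .. +10 band
bayes_auto_learn 1
# feed to Bayes as ham only when the autolearn score is below -2
bayes_auto_learn_threshold_nonspam -2.0
# feed to Bayes as spam only when the autolearn score is above +10
bayes_auto_learn_threshold_spam 10.0
```

As far as I know, the score autolearn compares against these thresholds is
computed without the Bayes rules themselves, so it can differ from the final
score shown in the headers. Anything between the two thresholds is left
untouched and still has to be trained by hand.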