I can see the need to correct mis-learns in the bayes database quickly.
However, I still do not think the autolearn feature is the best way to
train spamassassin effectively. With autolearn one has to balance
aggressive thresholds, which learn from more emails, against conservative
thresholds, which make fewer mistakes. Either way, the bayes database
will not be trained on the complete set of mail arriving at a
particular account, and will therefore not be at its most effective.

I guess this is what it says on the man page for sa-learn, and the answer is
that I need to use supervised training if I wish to get the best spam
detection from spamassassin.

Can anyone recommend a minimum-hassle way to do supervised training? I can
use both IMAP and POP boxes for my account.

Thanks

Adam


"Matt Kettler" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> At 10:33 AM 11/18/03 +0000, Adam Griffiths wrote:
> >It seems to me that the autolearn feature is too restrictive as to which
> >mails it chooses to autolearn from, whereas my plan would autolearn from
> >_all_ the mail I receive, and only require that I keep an eye out for any
> >mistakes, and correct them.
>
> If you're simply concerned that the autolearn thresholds are too
> restrictive, you can always just change the thresholds to more aggressive
> numbers... The only big restriction is that to learn as spam it always
> requires 3 header and 3 body points... But you can lower the
> bayes_auto_learn_threshold_spam from 12 to 6 and increase the number of
> emails SA autolearns quite a bit. You can also raise
> bayes_auto_learn_threshold_nonspam to a number higher than the default
> 0.1.
>
>
> This would result in fewer mis-learns that would need to be hand
> corrected compared to learning everything. You'd still have some, but
> it would be fewer to fix.
>
> The biggest drawback of the "learn everything and hand fix the mistakes"
> is that unless you hand fix in more-or-less realtime, your mis-learning
> is going to cause related emails to be misclassified until you re-learn
> them. This could be a continuous uphill battle on any account with a
> decent mixture of spam and nonspam.
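
For what it's worth, the threshold change Matt describes would look
something like this, assuming the settings go in local.cf. Treat it as a
sketch: the spam value of 6 is his suggestion, but the nonspam value of
1.0 is only an illustrative guess on my part, since he just says "higher
than the default 0.1".

```
# Default is 12; lowering it lets SA autolearn more mail as spam.
bayes_auto_learn_threshold_spam     6.0

# Default is 0.1; 1.0 is an assumed example value, not a recommendation.
bayes_auto_learn_threshold_nonspam  1.0
```

As Matt notes, learning as spam still requires 3 header and 3 body
points whatever these are set to.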





_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
