Re: [SAtalk] autolearn/autowhitelist misguided

Simon Byrnand Sun, 22 Jun 2003 17:04:35 -0700

At 15:20 22/06/03 -0700, Justin Mason wrote:

Matt Kettler said:

> As for disabling the network checks for auto-learning, that makes sense to
> me as well, since the bayes code learns from text tokens, not IPs.

Actually, that's not quite right: if you're scanning with network tests,
it'll do the auto-learn score test with network tests as well.

Phew... I was hoping Matt was wrong :)

But regarding the use of Bayes in auto-learn determination causing
feedback, that's the big danger.

Indeed.

BTW, one possible way to further keep FPs/FNs out of the auto-learn data
is to modify the learn() sub, adding to the existing verification
steps:

  - recomputed hits must be < bayes_auto_learn_threshold_nonspam or
    > bayes_auto_learn_threshold_spam

Yes

  - for spam, must have 3 head hits and 3 body hits

Why? This seems a bit arbitrary to me. Either we trust the scoring or we don't :) What is magic about 3 in particular?

add this one:

  - previous hits must be < bayes_auto_learn_threshold_nonspam or
    > bayes_auto_learn_threshold_spam

Hmm, well for starters that would prevent whitelisted spam (such as spam on this list when people have the list whitelisted) from being autolearnt, and depending on your point of view that could be a good thing or a bad thing....
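
As a rough illustration of how the checks discussed above might combine, here is a minimal Python sketch. It is not SpamAssassin's actual learn() code (which is Perl). The function names, the threshold values passed in the usage lines, the treatment of "3 head hits and 3 body hits" as a numeric minimum per section, and the reading that the previous hits must fall on the same side as the recomputed hits are all assumptions made for the example.

```python
from typing import Optional


def classify(hits: float, nonspam_threshold: float, spam_threshold: float) -> Optional[str]:
    """Return "ham" or "spam" if hits lies outside the threshold band, else None."""
    if hits < nonspam_threshold:
        return "ham"
    if hits > spam_threshold:
        return "spam"
    return None


def auto_learn_decision(
    recomputed_hits: float,   # score recomputed for auto-learning
    previous_hits: float,     # score from the original scan (hypothetical input)
    head_hits: float,         # header contribution (units are an assumption)
    body_hits: float,         # body contribution (units are an assumption)
    nonspam_threshold: float, # stands in for bayes_auto_learn_threshold_nonspam
    spam_threshold: float,    # stands in for bayes_auto_learn_threshold_spam
    min_section_hits: float = 3.0,
) -> Optional[str]:
    """Return "spam", "ham", or None (do not auto-learn)."""
    # Check 1: recomputed hits must be below the nonspam threshold
    # or above the spam threshold.
    verdict = classify(recomputed_hits, nonspam_threshold, spam_threshold)
    if verdict is None:
        return None

    # Check 2: for spam, require a minimum from both header and body.
    if verdict == "spam" and (head_hits < min_section_hits or body_hits < min_section_hits):
        return None

    # Check 3 (the proposed addition): the previous hits must also be
    # outside the band. Assumed here: they must agree with the verdict.
    if classify(previous_hits, nonspam_threshold, spam_threshold) != verdict:
        return None

    return verdict


# Illustrative thresholds only; not claimed to be the shipped defaults.
print(auto_learn_decision(15.0, 14.0, 4.0, 5.0, 0.1, 12.0))   # ordinary spam
print(auto_learn_decision(15.0, -85.0, 4.0, 5.0, 0.1, 12.0))  # whitelisted spam
```

Under this reading, the second call models whitelisted spam: the previous score is pushed far below the nonspam threshold by the whitelist while the recomputed score is high, so the two disagree and the message is not learnt at all, which is the trade-off described above.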

Regards,
Simon

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
