Hi!

Just an idle thought: Why don't we start over and do everything differently?
;-)

No, really. I'm wondering whether the rules couldn't be dynamically re-scored
the way Bayes auto-learning works. The benefits I see:
- Rules the spammers have caught on to drop out automatically, instead of
having to wait for the next SA release;
- the message score becomes a probability instead of a value that is hard to
scale;
- it should be easy to just throw in a new rule and see how it fares: if it
isn't good enough, its value will stay at "indeterminate". I've seen another
thread about collecting custom rules; this could help to get them in without
much deliberation about their merits.
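To make the idea concrete, here is a minimal sketch (in Python, purely
hypothetical; SpamAssassin's real rule engine is Perl and works differently):
each rule keeps counts of how often it fired on spam vs. ham, updated by
whatever auto-learning decides a message really was, and its score is just
the observed probability. All names here are made up for illustration.

```python
# Hypothetical sketch of dynamic rule re-scoring (not SpamAssassin's API).
# A rule's weight is simply P(spam | rule fired), estimated from counts
# that auto-learning keeps updating.

class DynamicRule:
    def __init__(self, name):
        self.name = name
        self.spam_hits = 0
        self.ham_hits = 0

    def record(self, was_spam):
        """Update counts after auto-learning classifies a message this rule hit."""
        if was_spam:
            self.spam_hits += 1
        else:
            self.ham_hits += 1

    def spamicity(self):
        """Estimated P(spam | rule fired); 0.5 ('indeterminate') with no data."""
        total = self.spam_hits + self.ham_hits
        if total == 0:
            return 0.5
        return self.spam_hits / total

rule = DynamicRule("HYPOTHETICAL_RULE")
for _ in range(9):
    rule.record(was_spam=True)
rule.record(was_spam=False)
print(rule.spamicity())  # 0.9
```

A rule the spammers have defeated would keep collecting ham hits and drift
back toward 0.5 on its own, with no new SA release needed.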

Perhaps there should be a confidence value in addition to the spamicity, for
Bayes as well as for the rules (as far as I can tell, the Bayes algorithm
doesn't have one and only cuts in once the overall token count is high
enough): not only count how many spam and non-spam messages a token was seen
in, but also how often it came up at all. That way, if a new rule matched
once in a thousand mails and that one was spam, it doesn't necessarily mean
the next mail it matches is also spam. If a rule stays at a low weight, it
could be deactivated until further notice to save time.
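One way to sketch this (again hypothetical, in the spirit of the smoothing
Bayesian spam filters already use): shrink the raw spam ratio toward a
neutral prior of 0.5, with the pull of the prior fading as observations
accumulate. The `strength` parameter below is an assumed tuning knob, not
anything from SA.

```python
# Hypothetical confidence-weighted spamicity: few observations keep the
# estimate near the neutral prior, so one lucky spam hit doesn't make a
# rule look like a sure sign of spam.

def weighted_spamicity(spam_hits, ham_hits, strength=3.0, prior=0.5):
    """Shrink the raw spam ratio toward `prior`; `strength` is how many
    virtual observations the prior counts for."""
    total = spam_hits + ham_hits
    if total == 0:
        return prior
    raw = spam_hits / total
    return (strength * prior + total * raw) / (strength + total)

def confidence(spam_hits, ham_hits, strength=3.0):
    """0..1: how much the real data outweighs the prior."""
    total = spam_hits + ham_hits
    return total / (strength + total)

print(weighted_spamicity(1, 0))    # 0.625 -- one spam hit, far from certain
print(weighted_spamicity(100, 0))  # ~0.985 -- many hits, high spamicity
print(confidence(1, 0))            # 0.25 -- low; rule could stay deactivated
```

A rule whose confidence stays low after many processed mails rarely fires at
all, which is exactly the case where skipping it would save time.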

Of course, all this may be complete nonsense - the last few nights I was
mainly coding instead of sleeping, and I fear the caffeine might affect my
judgement...

Regards, Ulrich

-- 
Heinz Ulrich Stille / Tel.: +49-541-9400463 / Fax: +49-541-9400450
design_d gmbh / Lortzingstr. 2 / 49074 Osnabrück / www.design-d.de



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk