Re: [SAtalk] Bayes.

Chris Petersen Wed, 14 Jan 2004 14:46:01 -0800

> If you think some tokens should be "stronger" than others, please do a
> 10-fold cross-validation testing run which should *prove* that to be the
> case.  We don't adopt Bayes tokenizer or combiner changes without
> such testing.


That's a tall order, considering I have no idea how to do this, or even
where to begin looking in the code.
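
For anyone else unsure what a 10-fold cross-validation run involves,
here is a minimal sketch in Python. It is not SpamAssassin's own test
harness, and the names k_fold_splits, cross_validate, train_fn and
score_fn are hypothetical, made up for illustration. The idea is only
that the corpus is cut into 10 parts, and each part takes one turn as
the test set while the other 9 are used for training.

```python
import random


def k_fold_splits(items, k=10, seed=0):
    """Yield (train, test) pairs; each item lands in exactly one test fold."""
    shuffled = list(items)
    # Fixed seed so two runs (e.g. old vs. new tokenizer) see the same folds.
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


def cross_validate(items, train_fn, score_fn, k=10):
    """Average score_fn(model, test) over the k folds.

    train_fn and score_fn are placeholders for whatever classifier and
    accuracy measure are being compared.
    """
    scores = []
    for train, test in k_fold_splits(items, k):
        model = train_fn(train)
        scores.append(score_fn(model, test))
    return sum(scores) / len(scores)
```

Running cross_validate twice over the same corpus, once with each
tokenizer variant, gives two averages that can be compared directly.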

But as a friend explained it to me, the idea is that spammers' URIs
change a lot less often than the textual content of their messages.
It would allow things like the bigevil list to flow more naturally
into Bayes filters, since domains that spam would quickly work their
names into the "naughty" token list (although this doesn't do much
good if the headers aren't added to Bayes; I assumed that they were,
given the prominence of image-only spams these days, and how some of
them are getting caught by Bayes).
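
To make the URI idea concrete, here is a rough sketch in Python of a
tokenizer that emits the host of each URI as its own prefixed token,
separate from the body words. This is only an illustration and assumes
nothing about how SpamAssassin's Bayes tokenizer actually works; the
function name tokenize and the "URIHOST:" prefix are hypothetical.

```python
import re
from urllib.parse import urlparse

# Deliberately simple patterns; a real tokenizer would be more careful.
URI_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
WORD_RE = re.compile(r"[A-Za-z0-9]{3,}")


def tokenize(body):
    """Return URI-host tokens followed by lowercase body-word tokens."""
    tokens = []
    for uri in URI_RE.findall(body):
        host = urlparse(uri).hostname
        if host:
            # The prefix keeps host tokens from colliding with body words,
            # so a classifier can learn them separately.
            tokens.append("URIHOST:" + host.lower())
    # Strip the URIs so their pieces are not counted again as body words.
    text = URI_RE.sub(" ", body)
    tokens.extend(word.lower() for word in WORD_RE.findall(text))
    return tokens
```

With tokens kept apart like this, whether host tokens really are
"stronger" becomes something a cross-validation run can measure.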

> Also -- if you were so keen for answers, I think your best option would
> have been to Use The Source! ;)  We don't always have time to answer,
> and the definitive answer is right there.

Sometimes it's easier to ask a question than dig through someone else's
code, especially when it's a LOT of code and you don't really know what
you're looking for.


-- 
Chris Petersen
Programmer / Web Designer 
Silicon Mechanics:  http://www.siliconmechanics.com/
Blade Servers:      http://www.siliconmechanics.com/c292/blade-server.php
1U Servers:         http://www.siliconmechanics.com/c272/1u-server.php



