> If you think some tokens should be "stronger" than others, please do a
> 10-fold cross-validation testing run which should *prove* that to be the
> case. We don't adopt Bayes tokenizer or combiner changes without
> such testing.

I have no idea how to do this, or where to even begin looking in the code. But as it was explained to me by a friend, the idea is that spammers' URIs would change a lot less often than the textual content of their messages. It would allow things like the bigevil list to flow more naturally into the Bayes filters, since domains that spam would quickly work their names into the "naughty" token list. (This doesn't do much good if the headers aren't added to Bayes, though. I assumed that they were, given the prominence of image-only spams these days, and how some of them are getting caught by Bayes.)

> Also -- if you were so keen for answers, I think your best option would
> have been to Use The Source! ;) We don't always have time to answer,
> and the definitive answer is right there.

Sometimes it's easier to ask a question than to dig through someone else's code, especially when it's a LOT of code and you don't really know what you're looking for.

-- 
Chris Petersen
Programmer / Web Designer
Silicon Mechanics: http://www.siliconmechanics.com/
Blade Servers: http://www.siliconmechanics.com/c292/blade-server.php
1U Servers: http://www.siliconmechanics.com/c272/1u-server.php