On 08 Oct 2003 17:34:44 -0700, Daniel Quinlan <[EMAIL PROTECTED]> writes:
> Scott A Crosby <[EMAIL PROTECTED]> writes:
>
> > Sure. The goal of that is to add in new tokens that are unique and
> > have never been seen before. Those can bias an email toward neutral.
>
> Bayes could also just track never-seen-before tokens as an artificial
> token.

The thing is that a gibberish token (one that neither fits the
statistics of $LANG nor appears in a dictionary) should, as a new
token, be given a different Bayes category than a new token that is in
a dictionary, etc.

> My initial testing indicates that new tokens (in the body) have
> a spam probability of about 0.83, at least for me.

Can you do testing to see if new non-English or new non-dictionary
tokens have a higher spam probability?

Scott
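
To make the idea above concrete, here is a rough Python sketch of
mapping never-seen-before tokens to a few artificial tokens, one per
kind, so that Bayes can learn a separate probability for each. This is
not SpamAssassin code: the placeholder names, the bayes_key() helper,
and the vowel-ratio heuristic standing in for "the statistics of $LANG"
are all assumptions made for illustration only.

```python
import re

# Hypothetical artificial-token names; not SpamAssassin identifiers.
NEW_DICTIONARY = "*new-dictionary*"
NEW_WORDLIKE = "*new-wordlike*"
NEW_GIBBERISH = "*new-gibberish*"

VOWELS = set("aeiou")


def looks_wordlike(token):
    """Crude stand-in for 'fits the statistics of $LANG' (English here):
    ASCII letters only, a plausible vowel ratio, no long consonant run."""
    t = token.lower()
    if not t.isalpha() or not t.isascii():
        return False
    vowel_ratio = sum(c in VOWELS for c in t) / len(t)
    if not 0.2 <= vowel_ratio <= 0.7:
        return False
    return re.search(r"[^aeiou]{5,}", t) is None


def bayes_key(token, seen, dictionary):
    """Return the key under which Bayes would count this token.

    Tokens already in the database keep their own identity. New tokens
    are folded into one of three artificial tokens depending on kind."""
    if token in seen:
        return token
    if token.lower() in dictionary:
        return NEW_DICTIONARY
    if looks_wordlike(token):
        return NEW_WORDLIKE
    return NEW_GIBBERISH


# Toy data, assumed for the example.
seen = {"viagra"}
dictionary = {"mortgage", "hello"}

for tok in ["viagra", "Mortgage", "florpine", "xqzjkvw"]:
    print(tok, "->", bayes_key(tok, seen, dictionary))
```

With counts kept per artificial token, the test asked for above amounts
to comparing the learned spam probability of each placeholder against
the single figure for all new tokens.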