Has anyone suggested adding a rule along the lines of "percentage of misspelled words" in the text section? In particular, a high score on such a rule would take care of most of the problems seen below. If a site could configure SA to use aspell with a set of dictionaries chosen for that site, that might be a nice rule to add to the list.
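To make the idea concrete, here is a rough sketch of what such a rule might compute. It is in Python rather than SA's own Perl, and it is only an illustration: the tiny `SITE_WORDS` set is a hypothetical stand-in for the site dictionaries that aspell would supply, and the 0.3 threshold is an arbitrary assumption, not a tuned value.

```python
# Hypothetical stand-in for a site dictionary; a real setup would get its
# word lists from aspell (or a similar checker) instead of a literal set.
SITE_WORDS = {"get", "a", "low", "rate", "mortgage", "today", "from", "us"}


def misspelled_ratio(text, dictionary=SITE_WORDS):
    """Return the fraction of whitespace-separated tokens not in dictionary."""
    # Strip punctuation from the ends of each token only, so that marks
    # placed inside a word (as obfuscators do) still make it "unknown".
    tokens = [t.strip(".,;:!?\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in dictionary)
    return unknown / len(tokens)


def looks_obfuscated(text, threshold=0.3):
    """True if the misspelled fraction exceeds the (assumed) threshold."""
    return misspelled_ratio(text) > threshold
```

With this toy dictionary, "Get a low rate mortgage today" has no unknown words, while "Get a l0w r@te m0rtgage today" has three unknown words out of six. A real rule would also need to cope with names, jargon, and other legitimate words that no dictionary contains.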
tom

--
tom satter - just plain old tom
(303) 543-7623 (home)

Matt Kettler said:
> At 12:25 PM 10/9/2003, Eric Vollmer wrote:
>
>> My question is, what is the threshold for subject/body text like
>> V(A)G1NAS or C()CKS to actually invoke a score to be added to the
>> overall score?
>
> There is no static rule in the current ruleset that will ever add score
> for those particular phrases. However, the bayes tokenizer does a very
> good job if it's been trained on this stuff. But Bayes aside, I can send
> you an email that contains 100's of instances of the string "c()cks" and
> never get a point for it.
>
> The problem is that these are forms of text obfuscation. And there's
> thousands of possible kinds of text obfuscation. To try and code rules
> for each and every possible way to do it isn't practical. I've done a few
> custom rules for some of the more common ones, especially the spaced-out
> ones where they use .'s or _'s between every letter, but even my setup
> doesn't catch every possibility.
>
> It may be possible to eventually add an eval test that searches for a lot
> of different kinds of obfuscated text, but right now it's not possible
> with a simple rule. It's almost like you want a "deobfuscated_body"
> ruletype where the message is scanned several times for a string with
> various kinds of de-mangling done in advance.
>
> -------------------------------------------------------
> This SF.net email is sponsored by: SF.net Giveback Program.
> SourceForge.net hosts over 70,000 Open Source Projects.
> See the people who have HELPED US provide better services:
> Click here: http://sourceforge.net/supporters.php
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
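
The "deobfuscated_body" idea in the quoted message (scan the message several times, each time after a different kind of de-mangling) might be sketched roughly as follows. This is Python for illustration only, not SA code or an existing SA rule type; the `LOOKALIKES` table, the separator set, and all function names are assumptions made up for the example, and a usable table would be far larger.

```python
import re

# Hypothetical look-alike substitutions (assumed, far from complete).
LOOKALIKES = str.maketrans({"0": "o", "1": "i", "3": "e", "@": "a", "$": "s"})


def strip_separators(text):
    """Remove '.', '_', '-' or '*' sitting between two letters or digits."""
    return re.sub(r"(?<=[a-z0-9])[._\-*](?=[a-z0-9])", "", text)


def demangled_variants(body):
    """Yield the lower-cased body plus several de-mangled versions of it."""
    lowered = body.lower()
    swapped = lowered.translate(LOOKALIKES)
    yield lowered                    # as sent
    yield strip_separators(lowered)  # "m.o.r.t.g.a.g.e" -> "mortgage"
    yield swapped                    # "m0rtg@ge" -> "mortgage"
    yield strip_separators(swapped)  # both kinds of mangling combined


def deobfuscated_match(body, phrase):
    """True if phrase appears in any de-mangled variant of body."""
    phrase = phrase.lower()
    return any(phrase in variant for variant in demangled_variants(body))
```

For example, `deobfuscated_match("Low rate M.O.R.T.G.A.G.E today", "mortgage")` and `deobfuscated_match("Low rate m0rtg@ge today", "mortgage")` both match, while a plain body without the phrase does not. The separator pass also glues together things like "example.com", so the variants are only good for scanning, never for display, and each new kind of mangling still needs its own pass, which is the practical limit Matt describes.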