On Fri, 30 Jan 2004 10:55:07 -0500, Matt Kettler <[EMAIL PROTECTED]> writes:
> Today I got an interesting form of obfuscation, apparently to avoid
> antidrug.cf.
>
> I'm not sure whether to bother with adding rules for this, or be
> satisfied that the obfuscations are so severe that the messages are
> now barely legible.

It's definitely barely legible, which is a VERY good thing.

> Orxder your Vjiagmra and Skupter Vimagera saifely and securfely onlijne.
>
> Esntper Hekre

My guess is to let Bayes deal with it, but I'll speculate that Bayes
would handle this better if the spam probability of an unseen token were
boosted in inverse proportion to its edit distance from certain
frequently obfuscated words. Something like:

http://search.cpan.org/~jhi/String-Approx-3.23/Approx.pm

or

http://www.merriampark.com/ldperl.htm

Plus some eval rules so that if a word is not in the Bayes database, but
its edit distance from 'FOOBAR' is 2, it is given a spam probability of
.90, or if its edit distance from 'FOOBAR' is 1, it is given a spam
probability of .95.

Well, it's just an idea.

Scott

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
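
To make the edit-distance idea above a bit more concrete, here is a rough sketch in Python (not Perl, and not tied to any real SpamAssassin API). The names edit_distance, token_probability, BOOST_BY_DISTANCE and the plain dict standing in for the Bayes database are all made up for illustration. The .95 and .90 figures are the ones from the message. The neutral 0.5 fallback is my own assumption.

```python
# Hypothetical sketch: none of these names come from SpamAssassin.

def edit_distance(a, b):
    """Levenshtein distance between a and b (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

# Edit distance -> boosted spam probability, using the figures from the
# message (distance 1 -> .95, distance 2 -> .90).
BOOST_BY_DISTANCE = {1: 0.95, 2: 0.90}

def token_probability(token, bayes_db, targets, default=0.5):
    """Return the stored probability for a token already in bayes_db.

    For an unseen token, return a boosted probability if it is within
    edit distance 1 or 2 of any target word; otherwise return `default`
    (an assumed neutral value). bayes_db is assumed to be a dict keyed
    by lowercase tokens.
    """
    token = token.lower()
    if token in bayes_db:
        return bayes_db[token]
    best = min((edit_distance(token, t.lower()) for t in targets),
               default=None)
    return BOOST_BY_DISTANCE.get(best, default)

if __name__ == "__main__":
    # An empty dict stands in for a Bayes database with no learned tokens.
    for word in ("fo0bar", "f0obarr", "hello"):
        print(word, token_probability(word, {}, ["FOOBAR"]))
```

A real version would have to hook into the Bayes tokenizer and would probably want the boost to depend on word length too, since short words sit within distance 2 of a lot of innocent tokens.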