Soundex might be a practical solution. A manageable approach might be to first run a spelling check that uses a regular dictionary augmented with a set of known spammer misspellings, then feed the output of that step into Soundex. Soundex then acts as a heuristic for catching the creative alternative spellings that didn't make it into the custom dictionary. It also means many variants never need to be added to that dictionary, because Soundex already picks them up.
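To make the two-step idea concrete, here is a rough sketch in Python rather than Perl. It is only an illustration under assumptions: the SPAMMER_SPELLINGS table and the normalize/phonetic_key helpers are made up for this example, the regular-dictionary spelling check is left out, and the soundex function is a hand-written American Soundex that may differ from Text::Soundex in edge cases.

```python
# Hypothetical custom dictionary: known spammer spellings -> canonical word.
SPAMMER_SPELLINGS = {
    r"\/iagra": "viagra",
}

# Standard Soundex consonant groups, numbered 1 through 6.
_CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(word):
    """American Soundex: first letter plus three digits, zero padded."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel clears prev, so a repeated code after it counts
    return (result + "000")[:4]


def normalize(token):
    """Step 1: replace a known spammer spelling with its canonical word."""
    key = token.lower()
    return SPAMMER_SPELLINGS.get(key, key)


def phonetic_key(token):
    """Step 2: feed the normalized token into Soundex."""
    return soundex(normalize(token))


for token in ["viagra", "viaggra", "vyagra", r"\/iagra"]:
    print(token, phonetic_key(token))
```

In this sketch "viaggra" and "vyagra" have no dictionary entry but still land on the same key as "viagra", while "\/iagra" only does so because of the dictionary step; Soundex alone would drop the punctuation and key it on the "i".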

> -----Original Message-----
> From: David B Funk
> Sent: Wednesday, December 10, 2003 4:01 PM
>
> What might be easier to implement would be an enhanced version of
> the "soundex" transformation (see Text::Soundex module).
>
> The El337 version of soundex would know about the various
> graphical character to sounds mappings and return results that
> would be appropriate.
>
> The only difficulty I can see would be dealing with the ambiguity
> factor. (EG is '14all' -> "one-for-all" or "Laall" ).

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk