Soundex might be a practical solution. A manageable approach could be
to first run a spelling check that uses a regular dictionary augmented
with a set of known spammer misspellings, and then feed the output of
that step into Soundex. Soundex then acts as a heuristic for catching
the creative alternative spellings that either didn't make it into the
custom dictionary or never need to be added to it, because Soundex
already picks them up.
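
A rough sketch of that two-step idea, in Python rather than Perl so that
it stands alone. The soundex() here is a hand-written version of the
standard American Soundex rules, not the Text::Soundex module, and the
misspelling table, the keyword list and the name matches_keyword are all
made up for illustration. The regular-dictionary spell check is stood in
for by a plain lookup table.

```python
# Standard American Soundex consonant codes.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character Soundex code, or "" if there are no letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code counts again
    return (result + "000")[:4]


# Hypothetical data: step 1 is the custom misspelling dictionary.
KNOWN_MISSPELLINGS = {"vi@gra": "viagra", "m0rtgage": "mortgage"}
KEYWORDS = ["viagra", "mortgage"]


def matches_keyword(token, keywords=KEYWORDS, misspellings=KNOWN_MISSPELLINGS):
    """Step 1: dictionary correction. Step 2: compare Soundex codes."""
    word = misspellings.get(token.lower(), token.lower())
    return soundex(word) in {soundex(k) for k in keywords}
```

With this, "vi@gra" is caught by the dictionary step, while something
like "viaagra" is not in the dictionary but still lands on the same
Soundex code as "viagra".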

> -----Original Message-----
> From: David B Funk
> Sent: Wednesday, December 10, 2003 4:01 PM
> 
> What might be easier to implement would be an enhanced version of
> the "soundex" transformation (see Text::Soundex module).
> 
> The El337 version of soundex would know about the various
> graphical character-to-sound mappings and return results that
> would be appropriate.
> 
> The only difficulty I can see would be dealing with the ambiguity
> factor (e.g., is '14all' -> "one-for-all" or "Laall"?).
> 
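
One way to live with the ambiguity might be to not pick a single reading
at all, but to expand a token into every candidate reading and run each
one through the later matching step. A minimal sketch in Python; the
LEET table and the name candidate_readings are assumptions made up for
this example, not part of any existing module.

```python
from itertools import product

# Hypothetical mapping from "graphical" characters to possible readings.
# A character may map to a single letter or to a whole spoken word.
LEET = {
    "0": ["o"],
    "1": ["l", "i", "one"],
    "3": ["e"],
    "4": ["a", "for"],
    "7": ["t"],
    "@": ["a"],
    "$": ["s"],
}


def candidate_readings(token):
    """Return the set of all readings obtained by substituting each
    mapped character with each of its alternatives."""
    options = [LEET.get(ch, [ch]) for ch in token.lower()]
    return {"".join(choice) for choice in product(*options)}
```

For '14all' this yields six candidates, including both "oneforall" and
"laall". The number of candidates grows multiplicatively with each
ambiguous character, so a real filter would probably need a cap on token
length or on the number of expansions.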


