> -----Original Message-----
> From: Chris Santerre [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, December 30, 2003 3:42 PM
> To: Dallas L. Engelken; [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: RE: [SAtalk] Spell Checking the Subject Header (RESULTS)
>
>
> WOW!!! Nice work!!
>
thank you

> How did it handle things not found in the dictionary? Like
> LFHDJFHFJ$*? I didn't look at the code close enough :)
>

it basically takes the subject and splits it on word boundaries...

    Subject: This is COOOL

becomes

    @words = ('This','is','COOOL');

like i said... it's a quick hack just to get some decent information out of it, hopefully.

then a foreach is run on @words and pspell checks each $word against the dict. i only used en_US for the test... but you could easily take the language detection out of SA and plug in a variable for what language it was. of course you'd need the appropriate dicts (http://ftp.gnu.org/gnu/aspell/dict/) for all the languages that can be detected.

so to answer your question, if the subject was

    Subject: Random characters LFHDJFHFJ$*? in subject

it would have a $notfound_perc = 20.0000% (1 out of 5 words misspelled/unknown) and match the rule

    header   SUBJ_SPELLING_20 eval:spell_check_subject('20','30')
    describe SUBJ_SPELLING_20 20-29% mis-spelled words in subject

maybe there are ways to improve this... i dunno. i just blew a half-day on it cuz i had nothing better to do :)

d
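
for anyone who wants to poke at the arithmetic without installing aspell, here is a rough sketch of the split / check / percentage steps described above. it is in Python rather than the Perl of the actual hack, and several things in it are assumptions, not the real code: the tiny KNOWN_WORDS set stands in for the en_US dictionary, the regex is only a guess at how the word-boundary split behaves, and the half-open range [low, high) is inferred from the "20-29%" rule description.

```python
import re

# Assumption: a tiny stand-in word list. The original hack checked each word
# against an aspell en_US dictionary via pspell instead.
KNOWN_WORDS = {"this", "is", "random", "characters", "in", "subject"}


def notfound_perc(subject, known=KNOWN_WORDS):
    """Return the percentage of words in the subject missing from the word list."""
    # Assumption: a "word" is a run of letters/apostrophes; punctuation is dropped.
    words = re.findall(r"[A-Za-z']+", subject)
    if not words:
        return 0.0
    unknown = [w for w in words if w.lower() not in known]
    return 100.0 * len(unknown) / len(words)


def spell_check_subject(subject, low, high, known=KNOWN_WORDS):
    """Return True when the unknown-word percentage falls in [low, high)."""
    return low <= notfound_perc(subject, known) < high


if __name__ == "__main__":
    subject = "Random characters LFHDJFHFJ$*? in subject"
    print(notfound_perc(subject))                # 1 unknown word out of 5
    print(spell_check_subject(subject, 20, 30))  # would hit the 20-29% bucket
```

with the stand-in word list, the example subject from this mail splits into 5 words with 1 unknown, so it lands in the 20-29% bucket, same as the SUBJ_SPELLING_20 rule above.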