Thursday, December 4, 2003, 10:23:38 PM, I responded to Jon Gerdes' earlier email:
JG>> I'm still ploughing through some of your rules. I've amended your
JG>> CONSFROM9 rule somewhat:

JG>> # A From without vowels is probably invalid
JG>> header WHL_CONSFROM From =~ /[EMAIL PROTECTED],20}\b/i
JG>> describe WHL_CONSFROM From contains word consisting of consecutive consonants
JG>> score WHL_CONSFROM 1.5

JG>> I just inverted the logic to NOT vowel and added in the @ symbol.
JG>> This address got picked up:

JG>> [EMAIL PROTECTED]

JG>> OK could just up the limits in {} but I think that would miss the
JG>> point of the rule. What do you think?

I ran your rule, with and without \ before the @, against my corpus.
Results for both:

  6988s/218h of 62626 corpus

That compares to my RM_fl_ConsWord9 rule's

  50s/0h of 63143 corpus

Per my admittedly arbitrary algorithm documented at
http://www.exit0.us/index.php/RM_RuleScoring
I give my ConsWord9 rule a 1.50 score, and your ConsFrom rule a 1.319
score. Those ham hits really bring down a score. (That's why my
consonant list, bcghjklmnpqrstvwxz, excludes d and f -- too many hits
in ham.)

I don't know what the impact would be on FPs ... how many other rules
would hit a ham that matched your ConsFrom rule? If few, then your rule
with its >6k spam hits would be better. If many, then mine with no ham
hits is better. I'm not prepared at this time to try to analyze that
hit pattern.

Bob Menschel

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
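
To put the two hit counts quoted above side by side, here is a minimal Python sketch that works out what fraction of each rule's hits landed on ham. It is not the RM_RuleScoring algorithm referenced in the message, which is not reproduced here; the helper name ham_share is made up for illustration, and only the spam/ham hit counts come from the message itself.

```python
def ham_share(spam_hits: int, ham_hits: int) -> float:
    """Return the fraction of a rule's total hits that landed on ham.

    Hypothetical helper for illustration only; this is NOT the
    RM_RuleScoring algorithm mentioned in the message.
    """
    total = spam_hits + ham_hits
    return ham_hits / total if total else 0.0


# (spam hits, ham hits) exactly as quoted in the message.
rules = {
    "WHL_CONSFROM": (6988, 218),
    "RM_fl_ConsWord9": (50, 0),
}

for name, (spam, ham) in rules.items():
    share = ham_share(spam, ham)
    print(f"{name}: {spam}s/{ham}h, ham share of hits {share:.2%}")

# Prints:
# WHL_CONSFROM: 6988s/218h, ham share of hits 3.03%
# RM_fl_ConsWord9: 50s/0h, ham share of hits 0.00%
```

Roughly 3% of the ConsFrom hits are ham, against none for ConsWord9. Whether that 3% matters in practice depends on how many other rules also fire on those ham messages, which is the open FP question raised above.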