Michael Moncur wrote:

MM> First, three definite problems:
MM>
MM> >score PORN_8 -4.248
MM> I think this rule has become nearly useless ("mp3z" and "videoz" and "warez"
MM> are probably almost in common usage now) but it certainly isn't a non-spam
MM> indicator of this magnitude. This one was -0.9 before. It should probably be
MM> thrown out.

[craig@belphegore spamassassin]$ fgrep PORN_8 masses/freqs
48 5 43 PORN_8

So it's 5 spam and 43 nonspam occurrences for PORN_8. However, checking the
lines in nonspam.log which triggered PORN_8, there's some question in my
mind as to whether some of those messages were in fact spam... I'll ask the
submitter of that file.

MM> >score TRACKER_ID -4.215
MM> I can't understand the regex here, but I think it's broken. If it's really
MM> being this much of a non-spam indicator it must be detecting something other
MM> than tracking IDs. (This one was already -3.3 in 2.1. Wasn't it intended as
MM> a spam indicator in the first place?)

I think this was an attempt to look for unique IDs, but it's way too broad,
and is picking up all kinds of things that are not unique IDs. We've done a
lot of work on unique IDs in the subject, and probably some of that work
should be applied to this body-based rule.

The reason it's getting such a low score is actually pretty interesting, if
you look at the attached freqs and analysis files (freqs is pretty
straightforward, analysis is too, but read post-ga-analysis.pl in masses to
understand it). Basically, TRACKER_ID is in 2 messages which are generating
false positives, and only one false negative; it is in 16 true negatives,
and 10 true positives. So the GA is really strongly incentivized to push its
score down hard.

MM> >score BUGZILLA_BUG 1.123
MM> I know this one was intended to be negative. The regex is probably detecting
MM> lots of things that aren't Bugzilla reports and likely needs fixing.

Actually, it's only being exposed to 318 messages which trigger BUGZILLA_BUG,
and all 318 of those are nonspam. However, none of those messages are
generating false positives, and few or none trigger any other rules, so the
score is being set more or less arbitrarily. 1.1 is still pretty low, though
I agree it might better be set to -1.
I'll set it to something negative manually; next time the GA runs it'll be
likely to just leave the score where it is unless it has good reason not to.

MM> Second, a few scores that seem awfully high. I'm keeping these because my
MM> threshold is 7.0, but I'd be nervous about them with a 5.0 threshold.
MM>
MM> score DOMAIN_BODY 4.782
MM> score EARN_PER_WEEK 4.667
MM> score FRONTPAGE 4.775
MM> score MANY_FROMS 4.409
MM> score ONE_HUNDRED_PC_GUAR 4.399
MM> score WE_HONOR_ALL 4.536
MM>
MM> (My opinion: if the default threshold is 5.0 no score should be above about
MM> 3.3. With the current scores, as with 2.1, a threshold of 7.0 works quite
MM> well.)

In this running of the GA, I added to the constraints to make it a whole lot
harder for the GA to push a score up high. The scores are limited to
+/- (3+gaussian-noise-of-mean(0.5)), so 4.5 being 3 sds out should be
really, really unusual unless there's something heavily encouraging the
survival of that gene.

DOMAIN_BODY is triggered in 11 false positives (out of 239) in the corpus.
DOMAIN_BODY is triggered in 43 false negatives (out of 9961).
EARN_PER_WEEK is in neither false positives nor false negatives, despite
appearing in 1885 spams and 1 nonspam in the corpus.
FRONTPAGE has no FPs, and actually 15 FNs despite its score, out of 7613
spams and 2 nonspams.
MANY_FROMS is in 0 FPs, 1 FN, out of 12 spam and 1 nonspam.
ONE_HUNDRED_PC_GUAR is in 0 FPs, 1 FN, out of 2610 spam and 9 nonspam.
WE_HONOR_ALL is in 0 FPs and 1 FN, out of 706 spams and 0 nonspam.

I think WE_HONOR_ALL is the only worrisome one there.

MM> Third, a list of scores that should be positive but are low negatives,
MM> probably indicating that the rules are no longer useful or broken. I
MM> wouldn't really consider any of them good non-spam indicators. I'm setting
MM> them all to zero in my local.cf file.
MM>
MM> score ALL_CAPS_SUBJECT -0.274
MM> score BE_AMAZED -0.260
MM> score GAPPY_TEXT -1.237
MM> score HTML_WITH_BGCOLOR -0.546
MM> score JAVASCRIPT_URI -1.607
MM> score LINES_OF_YELLING_3 -1.518
MM> score NO_EXPERIENCE -1.063
MM> score NO_QS_ASKED -0.773
MM> score OPPORTUNITY -1.010
MM> score RATWARE -0.703
MM> score REAL_THING -0.148
MM> score RELAYING_FRAME -0.584
MM> score SLIGHTLY_UNSAFE_JAVASCRIPT -0.794
MM> score SUPERLONG_LINE -0.374
MM> score SUBJ_ENDS_IN_Q_MARK -0.050
MM> score SUSPICIOUS_RECIPS -0.016
MM> score TO_BE_REMOVED_REPLY -2.150
MM> score TO_UNSUB_REPLY -1.996
MM> score WEB_BUGS -0.823
MM> score X_MSMAIL_PRIORITY_HIGH -1.356

I'm too lazy to grep for each of these in freqs and analysis, but feel free
to do it yourselves.

C
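
On the "4.5 being 3 sds out" remark about the score limits: a quick way to
put a number on "really, really unusual". This is only a sketch under stated
assumptions, not what the GA code does. It assumes the noise added to the
limit of 3 is zero-mean Gaussian and that the 0.5 is its standard deviation
(which is what "3 sds out" implies for 4.5). The names upper_tail and sds_out
are made up for the illustration.

```python
import math

BASE_LIMIT = 3.0  # fixed part of the score limit, as stated in the message
NOISE_SD = 0.5    # ASSUMPTION: 0.5 is the standard deviation of the noise


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def sds_out(score: float) -> float:
    """How many standard deviations |score| lies beyond the base limit."""
    return (abs(score) - BASE_LIMIT) / NOISE_SD


if __name__ == "__main__":
    # Scores quoted in the message above.
    for name, score in [("DOMAIN_BODY", 4.782),
                        ("WE_HONOR_ALL", 4.536),
                        ("MANY_FROMS", 4.409)]:
        z = sds_out(score)
        print(f"{name:14s} {score:6.3f} {z:5.2f} sd  tail={upper_tail(z):.5f}")
```

Under those assumptions a limit as loose as 4.5 has a one-sided tail
probability of roughly 0.13%, so a score that high only survives if the
fitness function is pushing hard for it.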

OVERALL SPAM NONSPAM NAME
321185 100121 221064 (all messages)
85555 58915 26640 NO_REAL_NAME
56572 54419 2153 CLICK_BELOW
44450 44255 195 CTYPE_JUST_HTML
41495 40991 504 BIG_FONT
40421 36249 4172 FROM_ENDS_IN_NUMS
38064 30926 7138 PLING
30544 30117 427 CLICK_HERE_LINK
29037 27716 1321 SUBJ_HAS_SPACES
26156 26090 66 SUBJ_HAS_UNIQ_ID
26115 25847 268 NORMAL_HTTP_TO_IP
22665 22447 218 EXCUSE_3
17662 16645 1017 MAILTO_LINK
19411 15799 3612 LINES_OF_YELLING
14755 14722 33 REMOVE_PAGE
13355 13292 63 INVALID_DATE_TZ_ABSURD
14383 12854 1529 FORGED_YAHOO_RCVD
13735 12597 1138 MSG_ID_ADDED_BY_MTA_2
12647 12582 65 REMOVE_SUBJ
12164 12122 42 MAILTO_TO_REMOVE
12714 12019 695 FROM_HAS_MIXED_NUMS
12075 11819 256 MAILTO_WITH_SUBJ
11243 10585 658 MAILTO_TO_SPAM_ADDR
10861 10527 334 SUBJ_REMOVE
12314 9498 2816 SUBJ_HAS_Q_MARK
10553 8949 1604 LINES_OF_YELLING_2
8649 8626 23 FAKED_UNDISC_RECIPS
15231 8611 6620 SUPERLONG_LINE
8109 8103 6 MAILTO_WITH_SUBJ_REMOVE
9201 7822 1379 DIFFERENT_REPLY_TO
7615 7613 2 FRONTPAGE
8366 7566 800 SUBJ_ALL_CAPS
7113 7081 32 NO_OBLIGATION
7090 6860 230 GUARANTEE
7740 6726 1014 FOR_FREE
16855 6660 10195 TO_MALFORMED
6674 6611 63 REMOVE_IN_QUOTES
6213 6074 139 HTML_WITH_BGCOLOR
6033 6004 29 TO_EMPTY
6655 5854 801 WEB_BUGS
5619 5594 25 EXCUSE_7
11011 5557 5454 MAY_BE_FORGED
5583 5514 69 OPT_IN
5896 5490 406 ONE_HUNDRED_PC_FREE
6735 5469 1266 LINES_OF_YELLING_3
5609 5436 173 BASE64_ENC_TEXT
5768 5429 339 CALL_FREE
5836 5252 584 DATE_IN_FUTURE
9713 5134 4579 PORN_10
5720 5051 669 UNSUB_PAGE
4591 4581 10 VERY_SUSP_CC_RECIPS
5308 4561 747 FORGED_HOTMAIL_RCVD
4472 4352 120 NO_COST
23245 4267 18978 SUBJ_ENDS_IN_Q_MARK
5469 4262 1207 EXCUSE_14
7511 4248 3263 DEAR_SOMEBODY
4214 4203 11 SUSPICIOUS_CC_RECIPS
4549 4188 361 WORK_AT_HOME
4636 4039 597 EXCUSE_16
3815 3785 30 VIAGRA
4449 3743 706 PLING_PLING
3874 3715 159 INVALID_DATE_NO_TZ
3793 3677 116 INVALID_MSGID
3678 3557 121 EXCUSE_1
3975 3507 468 ASCII_FORM_ENTRY
3670 3436 234 JAVASCRIPT
6065 3428 2637 FROM_AND_TO_SAME
4292 3396 896 KNOWN_BAD_DIALUPS
3369 3350 19 AS_SEEN_ON
3258 3258 0 MORTGAGE_RATES
3363 3230 133 FROM_NAME_EQ_FROM_ADDR
4462 3213 1249 COPYRIGHT_CLAIMED
3364 3048 316 MISSING_HEADERS
6340 3047 3293 EXCUSE_6
2996 2966 30 EXCUSE_10
3004 2945 59 HOME_EMPLOYMENT
3413 2906 507 PORN_11
3231 2837 394 WEIRD_PORT
4880 2780 2100 GAPPY_TEXT
2765 2683 82 CASINO
2619 2610 9 ONE_HUNDRED_PC_GUAR
2588 2537 51 VERY_SUSP_RECIPS
2611 2512 99 HTTP_USERNAME_USED
2473 2467 6 THIS_AINT_SPAM
2450 2445 5 REMOVAL_INSTRUCTIONS
2816 2402 414 CASHCASHCASH
2373 2354 19 EMAIL_MARKETING
3214 2280 934 SMTPD_IN_RCVD
2211 2211 0 CLICK_TO_REMOVE_2
2331 2146 185 PORN_4
2143 2137 6 REPLY_REMOVE_SUBJECT
2158 2081 77 SUSPICIOUS_RECIPS
2051 1993 58 SLIGHTLY_UNSAFE_JAVASCRIPT
1970 1970 0 FREE_CONSULTATION
2694 1937 757 FREE_MONEY
1931 1931 0 MSGID_SPAMSIGN_1
1913 1893 20 BULK_EMAIL
1900 1889 11 SECTION_301
1886 1885 1 EARN_PER_WEEK
1932 1884 48 US_DOLLARS_3
1827 1827 0 FAKED_IP_IN_RCVD
1819 1812 7 MONEY_BACK
1969 1772 197 OPPORTUNITY
1698 1681 17 FORGED_EUDORAMAIL_RCVD
1677 1677 0 FORGED_GW05_RCVD
1971 1636 335 PORN_12
1644 1634 10 RISK_FREE
1676 1622 54 FROM_STARTS_WITH_NUMS
1813 1613 200 PORN_3
1626 1609 17 ALL_NATURAL
1541 1536 5 EXCUSE_15
1519 1518 1 FORM_W_MAILTO_ACTION
1561 1516 45 ROUND_THE_WORLD
1593 1515 78 HTTP_ESCAPED_HOST
1501 1494 7 COPY_DVDS
1418 1410 8 REALLY_UNSAFE_JAVASCRIPT
1661 1345 316 X_PRIORITY_HIGH
1392 1335 57 DEAR_FRIEND
1309 1309 0 STRONG_BUY
1300 1300 0 WE_HATE_SPAM
1300 1284 16 UNSUB_SCRIPT
1266 1252 14 NUMERIC_HTTP_ADDR
1251 1249 2 EXCUSE_4
1238 1199 39 DOMAIN_BODY
1230 1195 35 CALL_NOW
1159 1159 0 RESISTANCE_IS_FUTILE
1154 1148 6 ADVERT_CODE
1145 1142 3 EXCUSE_12
1136 1133 3 PRINT_FORM_SIGNATURE
1115 1114 1 CBYI
1157 1084 73 INVALID_DATE_ODD_MONTH
1207 1067 140 UNDISC_RECIPS
1053 1049 4 YOUR_INCOME
1045 1045 0 BILL_1618
1054 1041 13 SOCIAL_SEC_NUMBER
1082 1037 45 FOR_JUST_SOME_AMT
1070 1031 39 GREAT_OFFER
1164 1006 158 PROFITS
1021 979 42 HTTP_WITH_EMAIL_IN_URL
973 970 3 CHECK_OR_MONEY_ORDER
5796 956 4840 TO_LOCALPART_EQ_REAL
956 948 8 SENT_IN_COMPLIANCE
926 926 0 TAKE_ACTION_NOW
1185 917 268 X_MSMAIL_PRIORITY_HIGH
907 907 0 NONEXISTENT_CHARSET
904 894 10 BUGGY_CGI
867 867 0 PORN_13
898 864 34 AMAZING
862 797 65 NO_EXPERIENCE
820 778 42 FOR_INSTANT_ACCESS
786 777 9 TO_BE_REMOVED_REPLY
768 759 9 FULL_REFUND
756 752 4 EXCUSE_13
745 726 19 MSGID_HAS_NO_AT
789 718 71 LARGE_HEX
706 706 0 WE_HONOR_ALL
712 704 8 MONEY_MAKING
688 682 6 ONE_TIME_MAILING
696 679 17 LIMITED_TIME_ONLY
682 672 10 SUBJ_FULL_OF_8BITS
661 649 12 WANTS_CREDIT_CARD
641 641 0 PARA_A_2_C_OF_1618
865 640 225 ORDER_STATUS
802 613 189 FORGED_RCVD_FOUND
622 599 23 DIRECT_EMAIL
576 575 1 YOU_HAVE_BEEN_SELECTED
614 557 57 TO_NO_USER
553 553 0 PRODUCED_AND_SENT_OUT
536 528 8 ASKS_BILLING_ADDRESS
496 496 0 MICRO_CAP_WARNING
586 483 103 AOL_USERS_LINK
456 452 4 PENIS_ENLARGE2
446 446 0 FROM_BTAMAIL
436 436 0 COPY_ACCURATELY
434 431 3 LOTS_OF_CC_LINES
558 392 166 TO_UNSUB_REPLY
391 391 0 NEW_DOMAIN_EXTENSIONS
405 373 32 MSG_ID_ADDED_BY_MTA
531 358 173 CHARSET_FARAWAY_HEADERS
356 356 0 GENTLE_FEROCITY
355 348 7 HTTP_NUMBER_WORD
391 344 47 BE_AMAZED
748 342 406 SUBJ_MISSING
325 321 4 PORN_9
327 313 14 INCREASE_TRAFFIC
349 309 40 THE_FOLLOWING_FORM
292 292 0 JODY
295 285 10 NO_QS_ASKED
305 281 24 REPLY_TO_EMPTY
260 260 0 STOCK_ALERT
260 258 2 DOMAIN_SUBJECT
287 256 31 SEE_FOR_YOURSELF
255 251 4 COMMUNIGATE
249 249 0 PREST_NON_ACCREDITED
247 247 0 ADDRESSES_ON_CD
324 245 79 RATWARE
249 243 6 FRIEND_AT_PUBLIC
240 240 0 AUTO_EMAIL_REMOVAL
869 232 637 URI_IS_POUND
228 226 2 PORN_7
220 220 0 SERIOUS_ONLY
225 218 7 MASS_EMAIL
214 214 0 NOT_INTENDED
213 212 1 TONER
211 211 0 MYCASINOBUILDER
220 209 11 BILLION_DOLLARS
190 190 0 VJESTIKA
202 190 12 EXCUSE_17
188 188 0 TRACE_BY_SSN
199 178 21 DONT_DELETE
453 163 290 X_EM_VER_PRESENT
161 161 0 PENNIES_A_DAY
269 160 109 FORGED_JUNO_RCVD
163 159 4 PORN_1
163 156 7 ONCE_IN_LIFETIME
155 155 0 KIFF
190 154 36 DATE_MISSING
164 152 12 INCREASE_SALES
147 146 1 CHARSET_FARAWAY
419 142 277 ALL_CAPS_SUBJECT
181 136 45 HTML_EMBEDS
157 129 28 PLEASE_READ
128 128 0 WWW_REMOVEYOU_COM
1116 127 989 MIME_NULL_BLOCK
122 122 0 S_1618
151 117 34 RELAYING_FRAME
141 116 25 FROM_MALFORMED
115 115 0 INVALID_DATE
133 110 23 FROM_NO_USER
99 99 0 READ_TO_END
96 96 0 EXCUSE_2
89 89 0 SPAM_FORM_RETURN
88 88 0 YELLOWSUN
86 86 0 SUBJ_2_CREDIT
84 84 0 IMPOTENCE
82 82 0 EJACULATION
80 80 0 X_PMFLAGS_PRESENT
75 75 0 INVESTOR_SPEC_SHEET
73 73 0 GREEN_EXCUSE_1
73 73 0 GREEN_EXCUSE_2
69 69 0 POST_IN_RCVD
70 66 4 PURE_PROFIT
64 64 0 HR_3113
63 61 2 X_MAILER_GIBBERISH
140 60 80 REAL_THING
60 60 0 SPAM_FORM
2656 54 2602 PGP_SIGNATURE
1465 53 1412 SIGNATURE_DELIM
52 52 0 FILTERED_BY_WORLDREMOVE
57 52 5 MONSTERHUT
94 51 43 US_DOLLARS_2
49 49 0 NO_CATCH
44 44 0 FREE_PRIORITY_MAIL
43 43 0 UCE_MAIL_ACT
42 42 0 IN_ACCORDANCE_WITH_LAWS
41 41 0 STOCK_PICK
36 36 0 WWW_DIRECTFORCEMARKETING_COM
41 36 5 US_DOLLARS
946 30 916 USER_IN_WHITELIST
55 25 30 FROM_MISSING
97191 24 97167 IN_REP_TO
30 23 7 NO_DISSAPOINTMENT
24 21 3 NIGERIAN_SCAM_2
20 20 0 PORN_6
21 20 1 SAFEGUARD_NOTICE
20 20 0 NIGERIAN_SCAM_3
19 19 0 POPLAUNCH
18 18 0 CORRUPT_MSGID
16 16 0 INTL_EXEC_GUILD
54 13 41 JAVASCRIPT_URI
17 13 4 MDAEMON_2_7_4
13 12 1 MANY_FROMS
12 11 1 URGENT_BIZ
29 11 18 TRACKER_ID
13 11 2 SHORT_RECEIVED_LINE
9 9 0 NIGERIAN_SCAM_4
13 9 4 GAPPY_SUBJECT
8 8 0 UNIVERSITY_DIPLOMAS
10 8 2 BAD_HELO_WARNING
6 6 0 NO_SELLING
6 6 0 UNNEEDED_HTML_ENCODING
5 5 0 WWW_NETSITESFORFREE_NET
48 5 43 PORN_8
5 5 0 BUGGY_CGI_PT
45 4 41 YAHOO_MSGID_ADDED
4 4 0 MURKOWSKI_CRUFT
3 3 0 WWW_TRAFFICWOW_NET
3 3 0 WWW_CLIK4YOU_COM
3 3 0 WWW_AUTOREMOVE_COM
5 3 2 EXCUSE_11
2 2 0 FROM_UGETMORE
2 2 0 EXCUSE_ES_01
2 2 0 CHARSET_FARAWAY_BODY
1 1 0 EU_EMAIL_OPTOUT
1 1 0 CYBER_FIRE_POWER
1 1 0 ONLINE_BIZ_OPS
7 1 6 ITS_EFFECTIVE
1 1 0 EXCUSE_8
1 1 0 MAIL_IN_ORDER_FORM
0 0 0 RCVD_IN_ORBS
0 0 0 HTTP_CTRL_CHARS_HOST
0 0 0 EXCUSE_18
0 0 0 CLICK_TO_REMOVE_MAILTO
0 0 0 EXCUSE_5
0 0 0 SHOES_GUY
0 0 0 BUGGY_CGI_DE_3
0 0 0 BUGGY_CGI_DE_2
0 0 0 EXCUSE_9
0 0 0 RCVD_IN_RSS
0 0 0 X_OSIRU_SPAMWARE_SITE
0 0 0 LASER_PRINTER
0 0 0 EMAIL_HARVEST
0 0 0 BUGGY_CGI_ES
0 0 0 WEB4PORNO_URL
0 0 0 FREEWEBHOSTINGCENTRAL
0 0 0 USER_IN_MORE_SPAM_TO
0 0 0 RCVD_IN_BL_SPAMCOP_NET
0 0 0 YR_MEMBERSHIP_EXCH
0 0 0 BRAND_NEW_PAGER
0 0 0 EU_200_32_CE
0 0 0 REMOVE_ES_01
0 0 0 REMOVE_ES_02
0 0 0 REMOVE_ES_03
0 0 0 REMOVE_ES_04
51 0 51 MAILMAN_CONFIRM
0 0 0 INTERNET_TERROR_RANT
318 0 318 BUGZILLA_BUG
0 0 0 SPAM_FORM_INPUT
0 0 0 CLICKSFORMONEY_NET
0 0 0 NIGERIAN_SCAM_5
0 0 0 E_WEBHOSTCENTRAL_URL
1 0 1 EVITE
4 0 4 BALANCE_FOR_LONG
0 0 0 SPAM_PHRASES_020
4 0 4 DIFF_C_PATCH
0 0 0 FREQ_SPAM_PHRASE
0 0 0 RCVD_IN_RBL
0 0 0 USER_IN_WHITELIST_TO
0 0 0 USER_IN_ALL_SPAM_TO
0 0 0 NO_MX_FOR_FROM
0 0 0 STAINLESS_STEEL
0 0 0 RCVD_IN_RELAYS_ORDB_ORG
0 0 0 25FREEMEGS_URL
0 0 0 TO_INVESTORS
0 0 0 X_OSIRU_SPAM_SRC
0 0 0 EGP_HTML_BANNER
0 0 0 RCVD_IN_VISI
0 0 0 USER_IN_BLACKLIST
0 0 0 JUST_MAILED_PAGE
0 0 0 RCVD_IN_DUL
0 0 0 SEXY_PICS
0 0 0 SPAM_PHRASES_100
0 0 0 BUGGY_CGI_ES_2
0 0 0 RAZOR_CHECK
0 0 0 A_HREF_TO_IP
1 0 1 LONG_NUMERIC_HTTP_ADDR
0 0 0 BUGGY_CGI_DE
0 0 0 NIGERIAN_SCAM
0 0 0 X_UIDL_SPAMSIGN
0 0 0 Q_FOR_SELLER
0 0 0 FREEWEBCO_NET_URL
6497 0 6497 UNIFIED_PATCH
0 0 0 RCVD_IN_RFCI
0 0 0 EXCUSE_ES_02
0 0 0 EXCUSE_ES_03
0 0 0 RCVD_IN_OSIRUSOFT_COM
0 0 0 PRINT_OUT_AND_FAX
105 0 105 MAJORDOMO
0 0 0 HUNZA_DIET_BREAD
0 0 0 PORN_2
0 0 0 ANOTHER_NET_AD
0 0 0 FROM_FORGED_HOTMAIL
0 0 0 SPAM_PHRASES_040
0 0 0 PENIS_ENLARGE

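
For anyone who does want to grep the freqs data rule by rule, here is a
minimal sketch of one way to read it. It assumes each freqs row is
whitespace-separated as OVERALL, SPAM, NONSPAM, NAME, as in the header
above. parse_freqs and spam_fraction are hypothetical helpers, not part of
the masses scripts, and the sample rows are copied from the freqs data in
this message.

```python
# Sample rows copied from the freqs data above (one row per line).
FREQS_SAMPLE = """\
OVERALL SPAM NONSPAM NAME
321185 100121 221064 (all messages)
7615 7613 2 FRONTPAGE
1886 1885 1 EARN_PER_WEEK
48 5 43 PORN_8
29 11 18 TRACKER_ID
318 0 318 BUGZILLA_BUG
"""


def parse_freqs(text):
    """Map rule name -> (overall, spam, nonspam) hit counts."""
    rows = {}
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4 or not parts[0].isdigit():
            continue  # header or blank line
        overall, spam, nonspam = (int(p) for p in parts[:3])
        rows[parts[3].strip()] = (overall, spam, nonspam)
    return rows


def spam_fraction(row):
    """Share of a rule's hits that were spam, or None if it never hit."""
    overall, spam, _nonspam = row
    return spam / overall if overall else None


if __name__ == "__main__":
    rows = parse_freqs(FREQS_SAMPLE)
    baseline = spam_fraction(rows["(all messages)"])
    for name in ("FRONTPAGE", "EARN_PER_WEEK", "PORN_8",
                 "TRACKER_ID", "BUGZILLA_BUG"):
        frac = spam_fraction(rows[name])
        side = "above" if frac > baseline else "below"
        print(f"{name:14s} spam share {frac:.3f} ({side} corpus {baseline:.3f})")
```

PORN_8 comes out at about 0.10 against a corpus-wide spam share of about
0.31, which is consistent with the GA treating it as a nonspam indicator on
this corpus.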
COMMON FALSE POSITIVES: (246 total)
-----------------------
0.069 2 -4.2150 TRACKER_ID
0.053 5 0.1980 US_DOLLARS_2
0.024 1 2.4290 US_DOLLARS
0.021 4 0.2480 DATE_MISSING
0.010 3 2.5150 SEE_FOR_YOURSELF
0.009 7 2.4280 SUBJ_MISSING
0.009 34 2.2700 EXCUSE_1
0.009 11 4.7820 DOMAIN_BODY
0.008 48 3.5290 UNSUB_PAGE
0.007 45 -0.8230 WEB_BUGS
0.007 1 2.0700 CHARSET_FARAWAY
0.007 1 -0.5840 RELAYING_FRAME
0.007 7 1.3610 GREAT_OFFER
0.007 4 1.9280 TO_NO_USER
0.006 1 0.9850 PLEASE_READ
0.006 1 2.9300 INCREASE_SALES
0.006 10 1.6580 X_PRIORITY_HIGH
0.006 6 4.0620 HTTP_WITH_EMAIL_IN_URL
0.006 7 1.1640 UNDISC_RECIPS
0.006 1 0.2650 HTML_EMBEDS
0.005 7 0.4200 UNSUB_SCRIPT
0.005 4 1.8410 EXCUSE_13
0.004 1 1.1660 MASS_EMAIL
0.004 4 4.0660 BUGGY_CGI
0.004 3 3.1360 SUBJ_FULL_OF_8BITS
0.004 5 -1.3560 X_MSMAIL_PRIORITY_HIGH
0.004 17 -1.5680 COPYRIGHT_CLAIMED
0.004 2 1.8170 CHARSET_FARAWAY_HEADERS
0.004 22 0.8770 FROM_AND_TO_SAME
0.004 7 2.6950 US_DOLLARS_3
0.004 2 -1.9960 TO_UNSUB_REPLY
0.004 5 3.2950 REALLY_UNSAFE_JAVASCRIPT
0.003 3 1.9160 ORDER_STATUS
0.003 19 1.0520 CALL_FREE
0.003 17 1.1480 EXCUSE_14
0.003 1 -0.7030 RATWARE
0.003 23 -0.4680 DEAR_SOMEBODY
0.003 43 -0.3740 SUPERLONG_LINE
0.003 10 1.7120 JAVASCRIPT
0.002 14 1.3800 EXCUSE_7
0.002 1 0.9820 MSG_ID_ADDED_BY_MTA
0.002 16 1.9300 REMOVE_IN_QUOTES
0.002 1 -0.2740 ALL_CAPS_SUBJECT
0.002 2 0.8690 URI_IS_POUND
0.002 1 0.8320 X_EM_VER_PRESENT
0.002 7 0.0540 WEIRD_PORT
0.002 10 1.3450 EXCUSE_16
0.002 13 -0.5460 HTML_WITH_BGCOLOR
0.002 22 0.5790 LINES_OF_YELLING_2
0.002 16 0.2110 FOR_FREE
0.002 17 1.9330 SUBJ_ALL_CAPS
0.002 6 0.5560 EXCUSE_10
0.002 13 -1.5180 LINES_OF_YELLING_3
0.002 24 2.3450 REMOVE_SUBJ
0.002 10 0.5300 FORGED_HOTMAIL_RCVD
0.002 3 1.8490 HTTP_ESCAPED_HOST
0.002 42 2.7470 EXCUSE_3
0.002 2 0.1570 MIME_NULL_BLOCK
0.002 6 2.3910 FROM_NAME_EQ_FROM_ADDR
0.002 6 0.7790 PORN_11
0.002 53 1.7880 CLICK_HERE_LINK
0.002 2 1.0600 ADVERT_CODE
0.002 4 1.3910 PORN_4
0.002 33 0.4530 LINES_OF_YELLING
0.002 23 2.4050 MSG_ID_ADDED_BY_MTA_2
0.002 8 -1.2370 GAPPY_TEXT
0.002 1 2.2830 DIRECT_EMAIL
0.002 5 1.2600 SMTPD_IN_RCVD
0.002 64 2.0850 BIG_FONT
0.001 83 1.5200 CLICK_BELOW
0.001 3 -0.7940 SLIGHTLY_UNSAFE_JAVASCRIPT
0.001 1 2.6480 ONE_TIME_MAILING
0.001 2 2.0690 DEAR_FRIEND
0.001 6 1.0360 NO_COST
0.001 1 1.2730 MSGID_HAS_NO_AT
0.001 59 3.1540 CTYPE_JUST_HTML
0.001 33 3.0220 NORMAL_HTTP_TO_IP
0.001 1 -1.1540 FORGED_RCVD_FOUND
0.001 12 0.0660 PORN_10
0.001 18 3.5000 REMOVE_PAGE
0.001 7 2.5410 TO_EMPTY
0.001 1 1.6310 AMAZING
0.001 3 0.1820 CASHCASHCASH
0.001 1 3.3580 CHECK_OR_MONEY_ORDER
0.001 12 -0.3100 MAILTO_WITH_SUBJ
0.001 37 0.5440 PLING
0.001 83 0.6320 NO_REAL_NAME
0.001 1 2.3050 SOCIAL_SEC_NUMBER
0.001 2 0.2880 REPLY_REMOVE_SUBJECT
0.001 1 0.7830 FOR_JUST_SOME_AMT
0.001 10 0.0620 SUBJ_REMOVE
0.001 11 1.3410 MAILTO_TO_REMOVE
0.001 5 2.1000 OPT_IN
0.001 1 0.5720 INVALID_DATE_ODD_MONTH
0.001 5 2.3180 DATE_IN_FUTURE
0.001 1 2.1960 CALL_NOW
0.001 14 0.0440 MAILTO_LINK
0.001 3 1.8640 INVALID_DATE_NO_TZ
0.001 11 1.9980 FORGED_YAHOO_RCVD
0.001 7 0.0370 DIFFERENT_REPLY_TO
0.001 30 1.0290 FROM_ENDS_IN_NUMS
0.001 8 1.3410 MAY_BE_FORGED
0.001 4 1.4410 BASE64_ENC_TEXT
0.001 4 2.3970 ONE_HUNDRED_PC_FREE
0.001 3 0.3900 PLING_PLING
0.001 1 2.7460 COPY_DVDS
0.001 19 2.7410 SUBJ_HAS_SPACES
0.001 1 0.3770 EXCUSE_15
0.001 8 1.9520 FROM_HAS_MIXED_NUMS
0.001 1 2.1510 RISK_FREE
0.001 1 1.2880 FROM_STARTS_WITH_NUMS
0.001 2 0.9320 MISSING_HEADERS
0.001 1 0.6050 PORN_3
0.001 2 1.5230 INVALID_MSGID
0.001 1 1.1130 BULK_EMAIL
0.001 3 0.5970 TO_LOCALPART_EQ_REAL
0.001 1 -1.0100 OPPORTUNITY
0.001 1 0.6260 PORN_12
0.001 2 0.0360 ASCII_FORM_ENTRY
0.000 2 1.3080 KNOWN_BAD_DIALUPS
0.000 1 -0.0160 SUSPICIOUS_RECIPS
0.000 5 1.2750 MAILTO_TO_SPAM_ADDR
0.000 3 1.8950 GUARANTEE
0.000 1 0.7150 EMAIL_MARKETING
0.000 7 0.3310 TO_MALFORMED
0.000 1 1.4740 THIS_AINT_SPAM
0.000 1 0.2030 HTTP_USERNAME_USED
0.000 1 1.0040 FREE_MONEY
0.000 1 1.6050 CASINO
0.000 8 -0.0500 SUBJ_ENDS_IN_Q_MARK
0.000 1 2.0400 HOME_EMPLOYMENT
0.000 4 1.0210 SUBJ_HAS_Q_MARK
0.000 2 -0.1100 EXCUSE_6
0.000 1 2.1660 AS_SEEN_ON
0.000 2 2.5510 NO_OBLIGATION
0.000 2 1.8640 MAILTO_WITH_SUBJ_REMOVE
0.000 1 2.4960 SUSPICIOUS_CC_RECIPS
0.000 1 0.3650 WORK_AT_HOME
0.000 1 1.5720 VERY_SUSP_CC_RECIPS
0.000 5 2.0370 SUBJ_HAS_UNIQ_ID
0.000 1 3.4350 FAKED_UNDISC_RECIPS

COMMON FALSE NEGATIVES: (9962 total)
-----------------------
1.000 2 1.0000 EXCUSE_ES_01
0.290 154 1.8170 CHARSET_FARAWAY_HEADERS
0.194 1133 2.3180 DATE_IN_FUTURE
0.154 2 2.6700 GAPPY_SUBJECT
0.138 25 0.2650 HTML_EMBEDS
0.122 3180 3.0220 NORMAL_HTTP_TO_IP
0.082 46 -1.9960 TO_UNSUB_REPLY
0.079 12 -0.5840 RELAYING_FRAME
0.074 7 0.1980 US_DOLLARS_2
0.074 4 -1.6070 JAVASCRIPT_URI
0.077 1 4.4090 MANY_FROMS
0.059 19 -0.7030 RATWARE
0.056 33 0.9010 AOL_USERS_LINK
0.053 40 2.4280 SUBJ_MISSING
0.045 7 0.9850 PLEASE_READ
0.044 102 1.3910 PORN_4
0.043 13 4.3350 REPLY_TO_EMPTY
0.041 310 -0.4680 DEAR_SOMEBODY
0.042 1 4.3150 NIGERIAN_SCAM_2
0.037 166 -1.5680 COPYRIGHT_CLAIMED
0.036 238 -0.8230 WEB_BUGS
0.035 43 4.7820 DOMAIN_BODY
0.034 1305 0.5440 PLING
0.033 9 2.0270 FORGED_JUNO_RCVD
0.034 1 -4.2150 TRACKER_ID
0.033 15 0.8320 X_EM_VER_PRESENT
0.032 30 -100.0000 USER_IN_WHITELIST
0.031 88 0.1820 CASHCASHCASH
0.031 7 1.1660 MASS_EMAIL
0.029 12 -0.2740 ALL_CAPS_SUBJECT
0.028 434 -0.3740 SUPERLONG_LINE
0.027 156 0.5970 TO_LOCALPART_EQ_REAL
0.025 480 0.4530 LINES_OF_YELLING
0.024 15 1.9280 TO_NO_USER
0.024 1 2.4290 US_DOLLARS
0.023 113 -1.2370 GAPPY_TEXT
0.022 24 0.7830 FOR_JUST_SOME_AMT
0.022 1 0.8270 YAHOO_MSGID_ADDED
0.020 54 -2.0950 PGP_SIGNATURE
0.020 39 2.6950 US_DOLLARS_3
0.020 274 2.4050 MSG_ID_ADDED_BY_MTA_2
0.020 208 0.5790 LINES_OF_YELLING_2
0.019 15 -2.1500 TO_BE_REMOVED_REPLY
0.019 128 -1.5180 LINES_OF_YELLING_3
0.019 147 0.2110 FOR_FREE
0.019 21 0.1570 MIME_NULL_BLOCK
0.017 21 1.1640 UNDISC_RECIPS
0.017 5 2.5150 SEE_FOR_YOURSELF
0.017 20 1.0600 ADVERT_CODE
0.017 67 1.8640 INVALID_DATE_NO_TZ
0.017 7 0.9820 MSG_ID_ADDED_BY_MTA
0.017 15 0.8690 URI_IS_POUND
0.017 99 1.0520 CALL_FREE
0.016 60 1.7120 JAVASCRIPT
0.016 14 1.9160 ORDER_STATUS
0.016 145 0.0370 DIFFERENT_REPLY_TO
0.016 1 0.9050 X_MAILER_GIBBERISH
0.015 1318 0.6320 NO_REAL_NAME
0.015 51 0.9320 MISSING_HEADERS
0.015 342 -0.0500 SUBJ_ENDS_IN_Q_MARK
0.015 594 1.0290 FROM_ENDS_IN_NUMS
0.014 76 1.1480 EXCUSE_14
0.014 51 2.2700 EXCUSE_1
0.013 230 0.0440 MAILTO_LINK
0.013 13 4.0620 HTTP_WITH_EMAIL_IN_URL
0.013 159 1.9520 FROM_HAS_MIXED_NUMS
0.012 42 0.7790 PORN_11
0.012 64 0.5300 FORGED_HOTMAIL_RCVD
0.012 53 0.3900 PLING_PLING
0.012 99 1.9330 SUBJ_ALL_CAPS
0.011 16 2.0690 DEAR_FRIEND
0.011 8 0.8410 LIMITED_TIME_ONLY
0.011 70 -0.5460 HTML_WITH_BGCOLOR
0.011 12 1.3610 GREAT_OFFER
0.011 630 1.5200 CLICK_BELOW
0.011 10 4.0660 BUGGY_CGI
0.011 66 0.8770 FROM_AND_TO_SAME
0.010 46 1.3450 EXCUSE_16
0.010 33 2.3910 FROM_NAME_EQ_FROM_ADDR
0.010 94 0.0660 PORN_10
0.009 2 1.0720 BILLION_DOLLARS
0.009 110 1.0210 SUBJ_HAS_Q_MARK
0.009 8 1.6310 AMAZING
0.009 7 -1.1540 FORGED_RCVD_FOUND
0.009 28 1.2600 SMTPD_IN_RCVD
0.009 124 1.9980 FORGED_YAHOO_RCVD
0.008 14 1.6580 X_PRIORITY_HIGH
0.008 5 2.2830 DIRECT_EMAIL
0.008 36 0.3650 WORK_AT_HOME
0.008 1 0.8990 FROM_NO_USER
0.007 15 -0.7940 SLIGHTLY_UNSAFE_JAVASCRIPT
0.007 81 1.2750 MAILTO_TO_SPAM_ADDR
0.007 40 1.4410 BASE64_ENC_TEXT
0.007 2 -0.7730 NO_QS_ASKED
0.007 1 2.0700 CHARSET_FARAWAY
0.007 73 1.3410 MAY_BE_FORGED
0.006 1 2.9300 INCREASE_SALES
0.006 16 1.0040 FREE_MONEY
0.006 181 1.7880 CLICK_HERE_LINK
0.005 9 1.2880 FROM_STARTS_WITH_NUMS
0.005 6 2.8470 EXCUSE_12
0.005 6 0.5720 INVALID_DATE_ODD_MONTH
0.005 10 -1.0100 OPPORTUNITY
0.005 28 3.5290 UNSUB_PAGE
0.005 21 1.0360 NO_COST
0.005 12 0.2030 HTTP_USERNAME_USED
0.004 200 3.1540 CTYPE_JUST_HTML
0.004 3 3.1360 SUBJ_FULL_OF_8BITS
0.004 7 1.8490 HTTP_ESCAPED_HOST
0.004 8 1.1130 BULK_EMAIL
0.004 17 1.3080 KNOWN_BAD_DIALUPS
0.004 15 1.5230 INVALID_MSGID
0.004 22 2.1000 OPT_IN
0.004 3 1.1080 LARGE_HEX
0.004 9 0.7150 EMAIL_MARKETING
0.004 15 0.0360 ASCII_FORM_ENTRY
0.004 12 0.0540 WEIRD_PORT
0.003 143 2.0850 BIG_FONT
0.003 6 1.4890 MONEY_BACK
0.003 9 1.6050 CASINO
0.003 4 2.1960 CALL_NOW
0.003 8 2.4360 VERY_SUSP_RECIPS
0.003 1 2.7440 INCREASE_TRAFFIC
0.003 36 -0.3100 MAILTO_WITH_SUBJ
0.003 2 2.6480 ONE_TIME_MAILING
0.003 1 1.7070 THE_FOLLOWING_FORM
0.003 4 3.2950 REALLY_UNSAFE_JAVASCRIPT
0.003 2 2.4900 MONEY_MAKING
0.003 6 0.2880 REPLY_REMOVE_SUBJECT
0.003 6 -0.0160 SUSPICIOUS_RECIPS
0.003 8 0.5560 EXCUSE_10
0.003 2 2.8460 FULL_REFUND
0.003 3 4.9480 RESISTANCE_IS_FUTILE
0.003 4 2.2480 ROUND_THE_WORLD
0.003 1 -0.2600 BE_AMAZED
0.003 18 1.8950 GUARANTEE
0.003 18 2.5510 NO_OBLIGATION
0.002 15 2.5410 TO_EMPTY
0.002 30 1.3410 MAILTO_TO_REMOVE
0.002 2 -1.0630 NO_EXPERIENCE
0.002 14 -0.1100 EXCUSE_6
0.002 4 0.6050 PORN_3
0.002 27 2.3450 REMOVE_SUBJ
0.002 8 2.3830 VIAGRA
0.002 34 0.3310 TO_MALFORMED
0.002 6 2.0400 HOME_EMPLOYMENT
0.002 15 4.7750 FRONTPAGE
0.002 2 2.3050 SOCIAL_SEC_NUMBER
0.002 1 2.6270 ASKS_BILLING_ADDRESS
0.002 54 2.7410 SUBJ_HAS_SPACES
0.002 3 2.1510 RISK_FREE
0.002 12 1.9300 REMOVE_IN_QUOTES
0.002 2 0.4600 PROFITS
0.002 10 2.3970 ONE_HUNDRED_PC_FREE
0.002 2 0.8640 EXCUSE_4
0.002 3 0.1160 SECTION_301
0.002 5 4.4050 MORTGAGE_RATES
0.002 1 1.5320 WANTS_CREDIT_CARD
0.001 1 4.5360 WE_HONOR_ALL
0.001 15 0.0620 SUBJ_REMOVE
0.001 1 1.2730 MSGID_HAS_NO_AT
0.001 2 2.7460 COPY_DVDS
0.001 1 1.8410 EXCUSE_13
0.001 2 1.0240 ALL_NATURAL
0.001 1 2.5540 FOR_INSTANT_ACCESS
0.001 2 2.5480 FORGED_EUDORAMAIL_RCVD
0.001 1 4.1940 PORN_13
0.001 29 2.0370 SUBJ_HAS_UNIQ_ID
0.001 1 1.7440 SENT_IN_COMPLIANCE
0.001 1 3.3580 CHECK_OR_MONEY_ORDER
0.001 23 2.7470 EXCUSE_3
0.001 2 0.6260 PORN_12
0.001 13 2.1260 INVALID_DATE_TZ_ABSURD
0.001 1 4.8270 BILL_1618
0.001 5 1.3800 EXCUSE_7
0.001 1 -1.3560 X_MSMAIL_PRIORITY_HIGH
0.001 1 2.4170 NUMERIC_HTTP_ADDR
0.001 1 0.4200 UNSUB_SCRIPT
0.001 1 0.3770 EXCUSE_15
0.000 1 1.4740 THIS_AINT_SPAM
0.000 1 4.3990 ONE_HUNDRED_PC_GUAR
0.000 3 1.8640 MAILTO_WITH_SUBJ_REMOVE
0.000 3 3.4350 FAKED_UNDISC_RECIPS
0.000 5 3.5000 REMOVE_PAGE
0.000 24 -4.4310 IN_REP_TO
0.000 1 2.4960 SUSPICIOUS_CC_RECIPS
0.000 1 1.5720 VERY_SUSP_CC_RECIPS
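
The TRACKER_ID arithmetic in the message (2 FPs, 1 FN, 16 true negatives,
10 true positives) can be reproduced by combining a freqs row with the two
analysis rows for the same rule. A minimal sketch follows. Confusion,
confusion and hit_ratio are hypothetical names, not part of
post-ga-analysis.pl. Reading the first analysis column as "count divided by
the rule's OVERALL hits" is an inference that matches the rows checked here,
not something confirmed from the script.

```python
from typing import NamedTuple


class Confusion(NamedTuple):
    tp: int  # spam messages hitting the rule that were caught
    fp: int  # nonspam messages hitting the rule that were flagged
    fn: int  # spam messages hitting the rule that were missed
    tn: int  # nonspam messages hitting the rule that were passed


def confusion(spam_hits: int, nonspam_hits: int,
              fp_hits: int, fn_hits: int) -> Confusion:
    """Split a rule's freqs counts using its FP and FN counts from analysis."""
    return Confusion(tp=spam_hits - fn_hits, fp=fp_hits,
                     fn=fn_hits, tn=nonspam_hits - fp_hits)


def hit_ratio(count: int, overall_hits: int) -> float:
    """ASSUMED meaning of the first analysis column: count / OVERALL hits."""
    return count / overall_hits


if __name__ == "__main__":
    # freqs: 29 11 18 TRACKER_ID; analysis: 2 FPs and 1 FN for TRACKER_ID.
    print("TRACKER_ID ", confusion(11, 18, 2, 1))
    # freqs: 1238 1199 39 DOMAIN_BODY; analysis: 11 FPs and 43 FNs.
    print("DOMAIN_BODY", confusion(1199, 39, 11, 43))
    print(round(hit_ratio(2, 29), 3), round(hit_ratio(43, 1238), 3))
```

For TRACKER_ID this gives 10 true positives and 16 true negatives, matching
the figures quoted in the message.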