Last month I offered some header rules for possible inclusion in a future distribution. Those that passed muster have been formally submitted via bugzilla.
I've now completed review of my "body phrase" rule set, and feel it is ready for similar review. Please look over and test the following rules, and let me know if they work for you.

Use your own scoring -- my scores tend to be high, since I use a 9.0 spam threshold. With a 5.0 threshold you may want to cut my scores in half.

Note to those who wonder: this is only a small extract of the rules I use and have posted to http://www.exit0.us/index.php/RM_PhraseRules -- per discussion with the developers, I'm submitting here only those rules that hit at least 1% of all emails, and where at least 97% of the hits are spam.

# !! RM_BPC -- Commerce and Marketing-related Spam Phrasing
body RE_bpc_BestOnline /best online/i
describe RE_bpc_BestOnline Found Phrase 'best online'
score RE_bpc_BestOnline 1.990 # 99s/0h of 74869 corpus
# "best online mortgage" would also hit MORTGAGE_BEST

body RM_bpc_LowCost /low cost/i
describe RM_bpc_LowCost mentions low cost
score RM_bpc_LowCost 1.650 # 520s/7h of 74869 corpus
# ham: autoweb.com May 2001, resume, valid emails (2), valid email newsletter (3)

body RM_bpc_OrderToday /order today/i
describe RM_bpc_OrderToday says you should order today
score RM_bpc_OrderToday 1.832 # 749s/8h of 74869 corpus
# ham: valid marketing newsletters

body RE_bpc_WithYourPurch /with your purchase/i
describe RE_bpc_WithYourPurch Contains phrase With Your Purchase
score RE_bpc_WithYourPurch 1.595 # 119s/1h of 74869 corpus; ham: drugstore.com

# !! RM_BPE -- Education-related Spam Phrasing
body RM_bpe_HighSchlDip /high school diploma/i
describe RM_bpe_HighSchlDip mentions a high school diploma
score RM_bpe_HighSchlDip 1.347 # 104s/2h of 74869 corpus; ham: resumes

# !! RM_BPF -- Finance and Money-related Spam Phrasing
body RM_bpf_DebtElim /D.?e.?b.?t\WE.?l.?i.?m.?i.?n.?a.?t.?i.?o.?n/i
describe RM_bpf_DebtElim Debt Elimination
score RM_bpf_DebtElim 3.000 # 382s/0h of 74869 corpus; may also match distrib CONSOLIDATE_DEBT

body RE_bpf_DebtGetOutOf /out of debt/i
describe RE_bpf_DebtGetOutOf Get out of debt!
score RE_bpf_DebtGetOutOf 1.770 # 77s/0h of 74869 corpus

body RM_bpf_FwdLkngStmts /forward[ -]looking statements/i
describe RM_bpf_FwdLkngStmts Contains phrasing used by stock market spammers
score RM_bpf_FwdLkngStmts 6.000 # 424s/0h of 74869 corpus

body RM_bpf_LotsOfLenders /hundreds of [LI]enders/i
describe RM_bpf_LotsOfLenders mentions lots of (mortgage) lenders
score RM_bpf_LotsOfLenders 3.000 # 272s/0h of 39989 corpus; 161s/0h of 74869 corpus

body RM_bpf_MillBucks2 /Million (?:USD|United States Dollars)/i
describe RM_bpf_MillBucks2 mentions several million dollars
score RM_bpf_MillBucks2 3.000 # 204s/0h of 74869 corpus

body RM_bpf_MillBucks3 /Million.{1,30}Dollars/i
describe RM_bpf_MillBucks3 mentions several million dollars
score RM_bpf_MillBucks3 1.618 # 927s/14h of 74869 corpus
# ham: valid emails & news reports

body RM_bpf_NoTurnDown /no one is turned down/i
describe RM_bpf_NoTurnDown Mortgage, Loan, or Insurance qualification
score RM_bpf_NoTurnDown 3.000 # 334s/0h of 74869 corpus

# !! RM_BPI -- Insurance, Warranty, and similar Spam Phrasing
body RM_bpi_LifeInsur /\blife\W{0,3}ins/i
describe RM_bpi_LifeInsur mentions life insurance
score RM_bpi_LifeInsur 2.013 # 304s/2h of 74869 corpus; ham: valid emails

body RM_bpi_LowestRates /lowest rates/i
describe RM_bpi_LowestRates Contains spammer phrasing - insurance or mortgage
score RM_bpi_LowestRates 3.000 # 654s/1h of 74869 corpus; ham: Marriott Rewards

body RM_bpi_LowestRatesa /lowest rates available/i
describe RM_bpi_LowestRatesa Contains spammer phrasing - insurance or mortgage
score RM_bpi_LowestRatesa 1.110 # 110s/0h of 74869 corpus; add to LowestRates

body RM_bpi_NoObligQuotei /n.?o.?o.?b.?l.?i.?g.?a.?t.?i.?o.?n.?q.?u.?o.?t.?e/i
describe RM_bpi_NoObligQuotei Offers a no-obligation quote
score RM_bpi_NoObligQuotei 3.000 # 216s/0h of 74869 corpus

body RM_bpi_ProtectFam /Protect your family/i
describe RM_bpi_ProtectFam Spammer phrasing -- insurance
score RM_bpi_ProtectFam 2.050 # 105s/0h of 74869 corpus

# !! RM_BPM -- Medical or Biological Spam Phrasing
body RM_bpm_FreeMedConsult /Free medical consultation/i
describe RM_bpm_FreeMedConsult offers a free medical consultation.
score RM_bpm_FreeMedConsult 1.900 # 90s/0h of 74869 corpus

body RM_bpm_MagicLubricant /"Magic Lubricant"/i
describe RM_bpm_MagicLubricant Spammer phrasing in body of email
score RM_bpm_MagicLubricant 7.000 # 198s/0h of 74869 corpus

body RM_bpm_MoreEnergy /More energy/i
describe RM_bpm_MoreEnergy talks about having or generating more energy
score RM_bpm_MoreEnergy 2.610 # 161s/0h of 74869 corpus

body RM_bpm_MultipleOrgasms /multiple orgasms/i
describe RM_bpm_MultipleOrgasms Spammer phrasing in body of email
score RM_bpm_MultipleOrgasms 3.000 # 224s/0h of 74869 corpus

body RM_bpm_NoEmbarrassing /no embarrassing/i # From Emporium
describe RM_bpm_NoEmbarrassing Wow, I won't be embarrassed anymore!
score RM_bpm_NoEmbarrassing 3.000 # 227s/0h of 68055 corpus; 215s/0h of 74869 corpus

body RM_bpm_PowerBottle /"Power Bottle"/i
describe RM_bpm_PowerBottle Spammer phrasing in body of email
score RM_bpm_PowerBottle 7.000 # 198s/0h of 74869 corpus

body RM_bpm_PrescrMeds /Prescription Medications/i
describe RM_bpm_PrescrMeds seems to discuss prescription medications
score RM_bpm_PrescrMeds 3.000 # FP: 1293s/2h of 74869 corpus; ham: email to employer health insur, drugstore.com

body RM_bpm_SideEffects /side effects/i # From Emporium
describe RM_bpm_SideEffects Has Side Effects
score RM_bpm_SideEffects 1.984 # 984s/9h of 74869 corpus; ham: valid emails, drugstore.com, howstuffworks.com

body RM_bpm_USDoctors /[Uü].?S.? (?:Licensed)?.?(?:Doctors?|Physicians?|Pharmac(?:y|ies))/i
describe RM_bpm_USDoctors mentions U.S. doctor(s) or pharmacy(s)
score RM_bpm_USDoctors 3.00 # 2824s/3h of 74869 corpus; ham: valid emails and newsletters

# !! RM_BPN -- Nigerian (and other) Scam-related Spam Phrasing
body RM_bpn_AsAForeigner /\b(?:who was a|as a|an? honest|you being a|to any) foreigner/i
describe RM_bpn_AsAForeigner contains apparent spammer reference to a foreigner
score RM_bpn_AsAForeigner 1.790 # 79s/0h of 74869 corpus

body RM_bpn_Confidential /(?:total(?:ly)?|VERY|strictly|high(?:est|ly)?|utmost) CONFIDEN(?:ce|T(?:AI|IA)L)/i
describe RM_bpn_Confidential says this is very confidential
score RM_bpn_Confidential 1.616 # 431s/6h of 74869 corpus; ham: membership list, survey confidentiality

body RM_bpn_ForeignAcct /foreign (?:offshore )?(?:bank|account)/i
describe RM_bpn_ForeignAcct mentions a foreign account
score RM_bpn_ForeignAcct 2.800 # 180s/0h of 74869 corpus

body RM_bpn_FreeCableTV /Free Cable.{0,4}TV/i
describe RM_bpn_FreeCableTV Spammer phrasing or subject found in body of email
score RM_bpn_FreeCableTV 3.000 # 360s/0h of 74869 corpus

body RM_bpn_PercentageSpam /(?:(?:negotiate|reasonable|acc?or?ding|certain|agg?ree).{1,10}percentage|percentage.{1,10}(?:indicat|previous|involved)|your percentage will)/i
describe RM_bpn_PercentageSpam mentions percentage(s) in a spam-like way
score RM_bpn_PercentageSpam 1.257 # 145 spam, 0 ham, Sep 1 2003; 94s/0h of 39989 corpus; 84s/1h of 63143 corpus; 77s/2h of 74869 corpus

body RM_bpn_SecurityComp /security (?:company|storage house)/i
describe RM_bpn_SecurityComp mentions a security company
score RM_bpn_SecurityComp 1.850 # 255s/2h of 74869 corpus; ham: valid emails

body RM_bpn_TotalSum /The total sum/i
describe RM_bpn_TotalSum mentions some total sum
score RM_bpn_TotalSum 2.520 # 152s/0h of 74869 corpus

body RM_bpn_UrgentReply /(?:urgent reply|reply urgent)/i
describe RM_bpn_UrgentReply requests an "urgent" reply
score RM_bpn_UrgentReply 2.210 # 121s/0h of 74869 corpus

# !! RM_BPP -- Porn-related and Adult-related Spam Phrasing
body RM_bpp_AdultMovie /[EMAIL PROTECTED] ?m[o0]vie/i
describe RM_bpp_AdultMovie mentions adult movie(s)
score RM_bpp_AdultMovie 6.000 # 1841s/0h of 74869 corpus

# !! RM_BPQ -- Privacy, Identity Theft, Copyright, Online Security-related Spam Phrasing
body RM_bpq_BannedCD /b\s?a\s?n\s?n\s?e\s?d\s?c\s?d/i
describe RM_bpq_BannedCD mentions the supposedly banned CD
score RM_bpq_BannedCD 3.000 # 910s/0h of 74869 corpus

body RM_bpq_CopyDVD /c[o0]py dvd/i
describe RM_bpq_CopyDVD seems to mention copying DVDs
score RM_bpq_CopyDVD 3.000 # 216s/0h of 74869 corpus

body RM_bpq_DVDPro /\bDVD\W{0,3}pro\b/i
describe RM_bpq_DVDPro mentions DVD Pro -- DVD copying software
score RM_bpq_DVDPro 1.760 # 69 spam, 0 ham, Aug 9 2003; 120s/0h of 39989 corpus; 98s/0h of 63143 corpus; 76s/0h of 74869 corpus

body RM_bpq_SpyOnAnyone /Spy on Anyone/i
describe RM_bpq_SpyOnAnyone suggests you can spy on anyone using spam product
score RM_bpq_SpyOnAnyone 2.020 # 102s/0h of 74869 corpus

body RM_bpq_SpySoftware /Spy Software/
describe RM_bpq_SpySoftware mentions known spam product
score RM_bpq_SpySoftware 1.990 # 99s/0h of 74869 corpus

# !! RM_BPS -- Spam-related Spam Phrasing
body RM_bps_RemoveMail /Remove mail/i
describe RM_bps_RemoveMail seems to offer a "Remove mail" link
score RM_bps_RemoveMail 1.750 # 79s/0h of 63143 corpus; 75s/0h of 74869 corpus

body RM_bps_SpamRemedy /Spam.?Remedy/
describe RM_bps_SpamRemedy mentions known spam product
score RM_bps_SpamRemedy 9.000 # 194s/0h of 74869 corpus

body RM_bps_WeHonor /we honor/i # From Emporium
describe RM_bps_WeHonor It says they honor.
score RM_bps_WeHonor 3.000 # 967s/3h of 74869 corpus

# !! RM_BP -- Phrases used within body of spam (Miscellaneous)
body RM_bp_CheckOnYour /Check up on your/i
describe RM_bp_CheckOnYour suggests you Check up on your something
score RM_bp_CheckOnYour 1.960 # 96s/0h of 74869 corpus

body RM_bp_RiskFree /\%?100\%? risk free/i
describe RM_bp_RiskFree suggests that something is 100% risk free
score RM_bp_RiskFree 2.250 # 125s/0h of 74869 corpus

body RM_bp_SelfEsteem /self esteem/i # From Emporium
describe RM_bp_SelfEsteem Talks about self esteem
score RM_bp_SelfEsteem 1.800 # 280s/2h of 39989 corpus; 269s/2h of 63143 corpus; 240s/2h of 74869 corpus

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
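
A minimal sketch of the score adjustment suggested near the top of this post, for anyone running a threshold other than 9.0. It is not part of the rule set. The function names rescale_scores and spam_fraction and the default factor of 0.5 are assumptions made for illustration, and the script only understands the "score NAME VALUE # NNNs/NNh of NNNNN corpus" layout used in the rules above.

```python
import re

# Matches a SpamAssassin "score" line such as:
#   score RM_bpc_LowCost 1.650 # 520s/7h of 74869 corpus
# Group 1 is everything up to the number, group 2 the number, group 3 the rest.
SCORE_LINE = re.compile(r"^(\s*score\s+\S+\s+)(\d+(?:\.\d+)?)(.*)$")

# Matches the "520s/7h of 74869 corpus" hit counts used in the comments.
HITS = re.compile(r"(\d+)s/(\d+)h of (\d+) corpus")


def rescale_scores(rules_text, factor=0.5):
    """Return rules_text with every score value multiplied by factor.

    The default of 0.5 follows the "cut my scores in half" suggestion;
    it is an assumption, not a tuned value. Non-score lines are untouched.
    """
    out = []
    for line in rules_text.splitlines():
        m = SCORE_LINE.match(line)
        if m:
            new_score = float(m.group(2)) * factor
            line = f"{m.group(1)}{new_score:.3f}{m.group(3)}"
        out.append(line)
    return "\n".join(out)


def spam_fraction(comment):
    """Return spam hits / (spam + ham hits) for the first hit count found.

    Returns None when the text carries no "NNNs/NNh of NNNNN corpus" count
    or when the rule had no hits at all.
    """
    m = HITS.search(comment)
    if not m:
        return None
    spam, ham = int(m.group(1)), int(m.group(2))
    total = spam + ham
    return spam / total if total else None
```

Passing the RM_bpc_LowCost score line through rescale_scores with the default factor yields a score of 0.825. Use factor=5.0/9.0 instead if you would rather scale strictly by the ratio of the two thresholds. spam_fraction can be used to check a rule's hit counts against the 97% spam criterion before adopting it.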