Update of the OBFU chars rule:

rawbody  __FVGT_rb_ATTACHMENT  /Content-Disposition: attachment/i
body     __FVGT_b_OBFU_J       /j(b|c|f|g|w)/i
body     __FVGT_b_OBFU_OTHER   /(vj|vk|xj|xk|yy|zf|zj)/i
body     __FVGT_b_OBFU_Q0      /(j|k|p|q|t|v|w|z)q/i
body     __FVGT_b_OBFU_Q1      /q(a|f|h|j|k|m|n|s|y)/i
body     __FVGT_b_OBFU_V       /(f|g|q|w)v/i
body     __FVGT_b_OBFU_X       /(c|g|j|k|q|s|v|z)x/i
body     __FVGT_b_OBFU_Z       /(f|j|k|p|q|x)z/i
# Compare the sum of pair-rule hits against the threshold first, then
# exclude messages with an attachment. With the "&& !" inside the sum's
# parentheses, Perl's && yields only the value of the negation (1 or
# empty), which can never be > 1.
meta     FVGT_m_MULTI_ODD      (((__FVGT_b_OBFU_J + __FVGT_b_OBFU_OTHER + __FVGT_b_OBFU_Q0 + __FVGT_b_OBFU_Q1 + __FVGT_b_OBFU_V + __FVGT_b_OBFU_X + __FVGT_b_OBFU_Z) > 1) && !__FVGT_rb_ATTACHMENT)
describe FVGT_m_MULTI_ODD      FVGT - contains multiple odd letter combinations
score    FVGT_m_MULTI_ODD      1.4
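For anyone who wants to try the logic outside SpamAssassin, here is a minimal Python sketch of what the meta rule is meant to compute: count how many of the pair rules hit, require more than one, and skip messages with an attachment. The function name `multi_odd` and its two arguments are hypothetical, and passing the whole text as one string is a simplification of how SpamAssassin feeds body and rawbody text to rules.

```python
import re

# Pair patterns copied from the __FVGT_b_OBFU_* body rules above.
PAIR_RULES = {
    "OBFU_J": re.compile(r"j(b|c|f|g|w)", re.I),
    "OBFU_OTHER": re.compile(r"(vj|vk|xj|xk|yy|zf|zj)", re.I),
    "OBFU_Q0": re.compile(r"(j|k|p|q|t|v|w|z)q", re.I),
    "OBFU_Q1": re.compile(r"q(a|f|h|j|k|m|n|s|y)", re.I),
    "OBFU_V": re.compile(r"(f|g|q|w)v", re.I),
    "OBFU_X": re.compile(r"(c|g|j|k|q|s|v|z)x", re.I),
    "OBFU_Z": re.compile(r"(f|j|k|p|q|x)z", re.I),
}

# Same pattern as the __FVGT_rb_ATTACHMENT rawbody rule.
ATTACHMENT = re.compile(r"Content-Disposition: attachment", re.I)


def multi_odd(body: str, raw_body: str) -> bool:
    """Hypothetical helper, not part of SpamAssassin.

    True when more than one pair rule matches the body text and the
    raw body shows no attachment header.
    """
    # Each rule contributes at most 1, like a body rule in a meta sum.
    hits = sum(1 for rx in PAIR_RULES.values() if rx.search(body))
    return hits > 1 and not ATTACHMENT.search(raw_body)
```

A text such as "vxiagra qjuick" hits two rules (the X and Q1 patterns), while a single odd pair or an attachment header keeps the result false.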
This version is less likely to cause false positives when a message contains a double-forwarded attachment. That's the only issue I've seen here.

Is this rule syntax legal? I didn't know I could combine an additive rule with a ! and have it all work ;)

Frederic Tarasevicius
Internet Information Services, Inc.
http://www.i-is.com/

Daniel Quinlan wrote:
> "Fred I-IS.COM" <[EMAIL PROTECTED]> writes:
>
>> I created a list which might be helpful. Using a dictionary, I
>> searched for letter pairs which did not exist. I created the
>> following meta rule to search for these nonexistent pairs; it might
>> do just what you are looking for.
>
> Your meta rule seems to work pretty well.
>
> Some issues that might need to be worked out:
>
> - getting it to work in an internationalized fashion: we could just
>   write a rule to be used when the message specifies that it is
>   English, when "ok_languages en" is set, or something like that,
>   but that is non-optimal
>
> - false positives are still a bit high:
>   - PGP signatures
>   - some "legitimate" URLs (Network Solutions unsubscribe URL for
>     renewal notices)
>
> Another thing that might work well is instead using an eval test that
> counts non-existent pairs. There are also the triplets and N-gram files
> used by the language testing in TextCat.pm -- we could test N-gram
> frequency and if the advertized language is well off the language model
> for that language, then score a hit.
>
> Some quick results:
>
> OVERALL%    SPAM%     HAM%    S/O  RANK  SCORE  NAME
>     9810     4814     4996  0.491  0.00   0.00  (all messages)
>  100.000  49.0724  50.9276  0.491  0.00   0.00  (all messages as %)
>    5.902  11.8612   0.1601  0.987  0.90   1.00  T_FVGT_M_MULTI_ODD_3
>    9.521  19.0278   0.3603  0.981  0.89   1.00  T_FVGT_M_MULTI_ODD_2
>   15.821  30.1413   2.0216  0.937  0.80   1.00  T_FVGT_M_MULTI_ODD_1
>
> slightly revised rule definitions:
>
> ------- start of cut text --------------
> # Frederic Tarasevicius
> # Internet Information Services, Inc.
> # From: "Fred I-IS.COM" <[EMAIL PROTECTED]>
> # Message-ID: <[EMAIL PROTECTED]>
> # Subject: Re: [SAtalk] Consonant and Vowel Pairs or Sequences
> # To: <[EMAIL PROTECTED]>
> # Date: Mon, 13 Oct 2003 17:13:31 -0400
>
> body __OBFU_J /j[bcfgw]/i
> body __OBFU_OTHER /(?:vj|vk|xj|xk|yy|zf|zj)/i
> body __OBFU_Q0 /[jkpqtvwz]q/i
> body __OBFU_Q1 /q[afhjkmnsy]/i
> body __OBFU_V /[fgqw]v/i
> body __OBFU_X /[cgjkqsvz]x/i
> body __OBFU_Z /[fjkpqx]z/i
> meta T_FVGT_M_MULTI_ODD_1 ((__OBFU_J + __OBFU_OTHER + __OBFU_Q0 + __OBFU_Q1 + __OBFU_V + __OBFU_X + __OBFU_Z) > 1)
> meta T_FVGT_M_MULTI_ODD_2 ((__OBFU_J + __OBFU_OTHER + __OBFU_Q0 + __OBFU_Q1 + __OBFU_V + __OBFU_X + __OBFU_Z) > 2)
> meta T_FVGT_M_MULTI_ODD_3 ((__OBFU_J + __OBFU_OTHER + __OBFU_Q0 + __OBFU_Q1 + __OBFU_V + __OBFU_X + __OBFU_Z) > 3)
> ------- end ----------------------------
>
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
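The eval-test idea from the quoted message (counting non-existent pairs instead of counting how many rules hit) could be prototyped roughly as below. This is only a sketch in Python, not a SpamAssassin eval test: `count_odd_pairs` is a hypothetical name, the pattern merely merges the pairs from the quoted `__OBFU_*` rules, and any threshold applied to the count would need tuning against a corpus.

```python
import re

# One pattern combining the letter pairs from the quoted __OBFU_* rules.
ODD_PAIR = re.compile(
    r"j[bcfgw]|vj|vk|xj|xk|yy|zf|zj"
    r"|[jkpqtvwz]q|q[afhjkmnsy]|[fgqw]v|[cgjkqsvz]x|[fjkpqx]z",
    re.I,
)


def count_odd_pairs(text: str) -> int:
    """Hypothetical helper: count two-character windows that are a listed pair.

    Windows overlap, so "qjx" counts both "qj" and "jx".
    """
    return sum(
        1
        for i in range(len(text) - 1)
        if ODD_PAIR.fullmatch(text, i, i + 2)
    )
```

Unlike the meta rules, which add at most one per pattern, this counts every occurrence, so a long PGP signature or an encoded URL would still need separate handling.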