-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Theo Van Dinter writes:
> On Fri, Jul 15, 2005 at 01:40:03PM -0400, Matt Kettler wrote:
> > Those should both trip HTML_FONT_SIZE_TINY.
> > Unfortunately, that's a low scoring rule due to some FPs and a limited number
> > of spam hits in the 3.0 corpus. The FPs may or may not be corpus pollution
> > based. *shrug*
>
> Legit senders use tiny fonts.  Looking at some of my rule FPs: CNET,
> Hershey's, L.L. Bean, Adidas, etc.

yep, in my corpus it's not pollution either.  legit senders certainly
write some spammy-looking HTML, and there's nothing we can do about that :(

> Based on my last run, it's actually more of a ham rule apparently:
>
> OVERALL%    SPAM%     HAM%    S/O   RANK  SCORE  NAME
>   93579     82015    11564  0.876   0.00   0.00  (all messages)
> 100.000   87.6425  12.3575  0.876   0.00   0.00  (all messages as %)
>   0.093    0.0805   0.1816  0.307   0.33   0.00  HTML_FONT_SIZE_TINY
>
> > The stripping doesn't work with the font-size trick, as SA's body rules will
> > see "VIwhateverAGRA" for "VI<font size=0>whatever</font>AGRA".
>
> FYI, there's a BZ ticket open about that.  Basically the code considers
> that stuff "invisible", but currently the body rules don't differentiate
> between visible vs invisible.  Bayes does, though, iirc.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFC2An1MJF5cimLx9ARAktjAJ91gwBO0QM4QAkSJgT1qmUeIsKcXgCgmDCh
v7WinIyPhTTVbgpDhEloMA4=
=/XyE
-----END PGP SIGNATURE-----
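
For anyone following the visible vs invisible point in the thread, here is a
rough illustration in Python. It is only a toy sketch and not SpamAssassin's
actual HTML parser (which is written in Perl); the class name TextExtractor,
the function render, and the skip_invisible flag are all made up for this
example, and it assumes that <font size=0> is the only way text gets hidden.
It shows why a rule matching on all rendered text misses the obfuscated word,
while a pass that drops "invisible" text would see it.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Toy text extractor (hypothetical; not SpamAssassin's parser)."""

    def __init__(self, skip_invisible):
        super().__init__()
        self.skip_invisible = skip_invisible
        # One entry per open <font> tag: True if it was size=0.
        self.font_stack = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            size = dict(attrs).get("size")
            self.font_stack.append(size == "0")

    def handle_endtag(self, tag):
        if tag == "font" and self.font_stack:
            self.font_stack.pop()

    def handle_data(self, data):
        # Text counts as hidden if any enclosing <font> has size=0.
        hidden = any(self.font_stack)
        if not (self.skip_invisible and hidden):
            self.chunks.append(data)


def render(html, skip_invisible):
    """Return the text of html, optionally dropping size=0 text."""
    parser = TextExtractor(skip_invisible)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


sample = "VI<font size=0>whatever</font>AGRA"
print(render(sample, skip_invisible=False))  # VIwhateverAGRA
print(render(sample, skip_invisible=True))   # VIAGRA
```

The first call mirrors what the body rules are described as seeing today; the
second is what a visible-only rendering would give them instead.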