-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Theo Van Dinter writes:
> On Fri, Jul 15, 2005 at 01:40:03PM -0400, Matt Kettler wrote:
> > Those should both trip  HTML_FONT_SIZE_TINY.
> > Unfortunately, that's a low scoring rule due to some FPs and limited number 
> > of spam hits in the 3.0 corpus. The FPs may or may not be corpus pollution 
> > based. *shrug*
> 
> Legit senders use tiny fonts.  Looking at some of my rule FPs: CNET,
> Hershey's, L.L. Bean, Adidas, etc.

yep, in my corpus it's not pollution either.  legit senders certainly
write some spammy-looking HTML, and there's nothing we can do about
that :(  

> Based on my last run, it's actually more of a ham rule apparently:
> 
> OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
>   93579    82015    11564    0.876   0.00    0.00  (all messages)
> 100.000  87.6425  12.3575    0.876   0.00    0.00  (all messages as %)
>   0.093   0.0805   0.1816    0.307   0.33    0.00  HTML_FONT_SIZE_TINY
> 
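
For reference, the S/O column in that table is consistent with spam% divided by (spam% + ham%). A minimal sketch of the arithmetic, assuming that definition holds; the function name s_o_ratio is made up for illustration and is not part of any SpamAssassin tool:

```python
def s_o_ratio(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hits that land on spam, using per-corpus hit rates.

    Assumption: S/O = spam% / (spam% + ham%), which reproduces the
    figures quoted in the table above.
    """
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("rule never hit, S/O is undefined")
    return spam_pct / total


# HTML_FONT_SIZE_TINY hits 0.0805% of spam and 0.1816% of ham.
print(round(s_o_ratio(0.0805, 0.1816), 3))
```

Anything below 0.5 means the rule fires proportionally more often on ham than on spam, which is why it reads as a ham rule here.
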
> > The stripping doesn't work with the font-size trick, as SA's body rules will 
> > see "VIwhateverAGRA"  for "VI<font size=0>whatever</font>AGRA".
> 
> FYI, There's a BZ ticket open about that.  Basically the code considers
> that stuff "invisible", but currently the body rules don't differentiate
> between visible vs invisible.

Bayes does, though, iirc.
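
To illustrate the visible/invisible distinction, here is a rough sketch using Python's standard html.parser. It is not SpamAssassin's code, and it only treats <font size=0> as invisible; the class and function names are hypothetical, and the input is modeled on the obfuscation discussed above.

```python
from html.parser import HTMLParser


class VisibleTextSplitter(HTMLParser):
    """Collect text inside <font size=0> separately from all other text."""

    def __init__(self):
        super().__init__()
        self._font_stack = []  # one bool per open <font>: True if size is 0
        self.visible = []
        self.invisible = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            self._font_stack.append(dict(attrs).get("size") == "0")

    def handle_endtag(self, tag):
        if tag == "font" and self._font_stack:
            self._font_stack.pop()

    def handle_data(self, data):
        # Text is invisible if any enclosing <font> has size 0.
        target = self.invisible if any(self._font_stack) else self.visible
        target.append(data)


def split_text(html: str):
    """Return (visible_text, invisible_text) for an HTML fragment."""
    parser = VisibleTextSplitter()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.visible), "".join(parser.invisible)


visible, invisible = split_text("VI<font size=0>whatever</font>AGRA")
print(visible, invisible)
```

A body rule run against only the visible string would see the obfuscated word whole, while one run against the concatenation of both would not.
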

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFC2An1MJF5cimLx9ARAktjAJ91gwBO0QM4QAkSJgT1qmUeIsKcXgCgmDCh
v7WinIyPhTTVbgpDhEloMA4=
=/XyE
-----END PGP SIGNATURE-----
