> -----Original Message-----
> From: Bob Apthorpe [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 13, 2003 1:14 AM
> To: [EMAIL PROTECTED]
> Subject: Re: [SAtalk] filter catching excel files
> 
> 
> Hi,
> 
> On 12 Aug 2003 15:50:12 -0700 Chris Bradfield 
> <[EMAIL PROTECTED]> wrote:
> 
> > So you want 50 points for any file with 
> > 
> > Content-Type: application/x-msexcel;
> >                              ^^^
> > ???
> > 
> > Looking through a mailbox full of mime-encoded attachments (and only
> > attachments) I found several occurrences of "sex" in the encoded data.
> 
> `egrep -ci sex /usr/dict/words` yields 19 words including the following
> non-gender, non-copulatory terms:
> 
> Essex
> Middlesex
> Sextans
> sextet
> sextillion
> sexton
> sextuple
> sextuplet
> Sussex
>  
> > That can't possibly be what you're after.  There's a real danger in
> > getting overzealous going after "bad" words.  Your HR department is
> > bound to get communications involving "sex"ual harassment, "sex"
> > discrimination, etc.
> 
> Learn from Prodigy's mistakes. If you're going to wander down that
> slippery path of filtering 'naughty' content, read and understand
> "Mastering Regular Expressions" (by Jeffrey Friedl, from O'Reilly), read
> `perldoc perlre`, and score your homebrew rules at 0.01 until you're
> comfortable they rarely flag false positives, if ever. Better to test
> these out on your own account for a while before inflicting them on your
> users. Better still to train Bayes to flag these. Improved accuracy and
> less work for you.
> 
> > I think you really need to take a breath and think rationally about
> > these rules before you implement them.  You're taking an extremely smart
> > content filtering system and turning it into a really dumb one.
> 
> Scunthorpe.
> 
> -- Bob
> 
> 
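
The substring problem described in the quoted message can be reproduced in a
few lines. This is only a rough sketch in Python, not an actual SpamAssassin
rule: the sample strings are taken from the message above, and the names
SAMPLES, NAIVE, BOUNDED and hits() are made up for illustration.

```python
import re

# Strings taken from the quoted message: dictionary words, the MIME
# subtype from the Content-Type example, and a legitimate HR phrase.
SAMPLES = [
    "Essex",
    "Middlesex",
    "sextet",
    "Sussex",
    "application/x-msexcel",
    "sex discrimination",
]

# Naive rule: fires on the three letters anywhere in the text.
NAIVE = re.compile(r"sex", re.IGNORECASE)

# Word-bounded rule: fires only when "sex" stands alone as a word.
BOUNDED = re.compile(r"\bsex\b", re.IGNORECASE)


def hits(pattern, texts):
    """Return the texts in which the pattern matches at least once."""
    return [t for t in texts if pattern.search(t)]


if __name__ == "__main__":
    print("naive:  ", hits(NAIVE, SAMPLES))
    print("bounded:", hits(BOUNDED, SAMPLES))
```

The naive pattern matches every sample, including the Excel attachment header.
The word-bounded pattern drops the place names and the MIME type, but it still
matches the legitimate HR phrase, which is why a low score on such a rule
matters even after the regex is tightened.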


This just goes back to the same tip:

One rule should not make a huge difference!

Let several rules make the difference. 

Why score a single rule at 50.0 when a message is marked as spam at 5.0?
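
To make the arithmetic concrete, here is a small sketch in Python. It assumes
the simple model discussed in this thread: every rule that fires adds its score,
and the message is tagged once the total reaches the 5.0 threshold. The rule
names and the 2.5 and 2.6 scores are hypothetical, and this is not how
SpamAssassin is implemented internally.

```python
THRESHOLD = 5.0  # the spam threshold mentioned above


def total_score(rule_scores, fired):
    """Sum the scores of the rules that fired on a message."""
    return sum(rule_scores[name] for name in fired)


def is_spam(rule_scores, fired, threshold=THRESHOLD):
    """A message is tagged when its total reaches the threshold (assumed >=)."""
    return total_score(rule_scores, fired) >= threshold


# Hypothetical rule sets: one heavy-handed rule versus several modest ones.
HEAVY = {"LOCAL_SEX_WORD": 50.0}
LIGHT = {"LOCAL_SEX_WORD": 0.01, "OTHER_RULE_A": 2.5, "OTHER_RULE_B": 2.6}

if __name__ == "__main__":
    # A single false positive on the 50-point rule tags the message by itself.
    print(is_spam(HEAVY, ["LOCAL_SEX_WORD"]))
    # The same false positive at 0.01 is harmless on its own.
    print(is_spam(LIGHT, ["LOCAL_SEX_WORD"]))
    # It takes several independent rules agreeing to cross the threshold.
    print(is_spam(LIGHT, ["LOCAL_SEX_WORD", "OTHER_RULE_A", "OTHER_RULE_B"]))
```

With the 50-point rule, one bad match on an Excel attachment is enough to tag
the mail. With modest scores, no single rule can do that.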

--chris


