> -----Original Message-----
> From: Bob Apthorpe [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 13, 2003 1:14 AM
> To: [EMAIL PROTECTED]
> Subject: Re: [SAtalk] filter catching excel files
>
> Hi,
>
> On 12 Aug 2003 15:50:12 -0700 Chris Bradfield <[EMAIL PROTECTED]> wrote:
>
> > So you want 50 points for any file with
> >
> > Content-Type: application/x-msexcel;
> >                              ^^^
> >                              ???
> >
> > Looking through a mailbox full of mime-encoded attachments (and only
> > attachments) I found several occurrences of "sex" in the encoded data.
>
> `egrep -ci sex /usr/dict/words` yields 19 words including the following
> non-gender, non-copulatory terms:
>
> Essex
> Middlesex
> Sextans
> sextet
> sextillion
> sexton
> sextuple
> sextuplet
> Sussex
>
> > That can't possibly be what you're after. There's a real danger in
> > getting overzealous going after "bad" words. Your HR department is
> > bound to get communications involving "sex"ual harassment, "sex"
> > discrimination, etc.
>
> Learn from Prodigy's mistakes. If you're going to wander down that
> slippery path of filtering 'naughty' content, read and understand
> "Mastering Regular Expressions" (by Jeffrey Friedl from O'Reilly), read
> `perldoc perlre`, and score your homebrew rules at 0.01 until you're
> comfortable they rarely flag false positives, if ever. Better to test
> these out on your own account for a while before inflicting them on your
> users. Better still to train Bayes to flag these. Improved accuracy and
> less work for you.
>
> > I think you really need to take a breath and think rationally about
> > these rules before you implement them. You're taking an extremely smart
> > content filtering system and turning it into a really dumb one.
>
> Scunthorpe.
>
> -- Bob
>
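To make the substring problem in the quoted message concrete, here is a minimal Python sketch. It is not SpamAssassin rule syntax, only an illustration of the matching and the additive scoring. The rule name LOCAL_SEX, the sample lines, and the helper functions are hypothetical; the 50.0 and 0.01 scores and the 5.0 threshold are the figures mentioned in this thread.

```python
import re

# Hypothetical patterns for illustration; these are not real SpamAssassin rules.
NAIVE = re.compile(r"sex", re.IGNORECASE)        # bare substring match
BOUNDED = re.compile(r"\bsex\b", re.IGNORECASE)  # whole word only

# Made-up sample lines based on the cases raised in the thread.
SAMPLES = [
    "Content-Type: application/x-msexcel;",
    "Meeting moved to the Essex office",
    "Updated sex discrimination policy from HR",
]

THRESHOLD = 5.0  # the spam threshold mentioned in this thread


def hits(pattern, lines):
    """Return the lines that the pattern matches anywhere."""
    return [line for line in lines if pattern.search(line)]


def total_score(rule_scores, fired):
    """Add up the scores of the rules that fired on one message."""
    return sum(rule_scores[name] for name in fired)


naive_hits = hits(NAIVE, SAMPLES)      # fires on every sample line
bounded_hits = hits(BOUNDED, SAMPLES)  # fires on the HR line only

# One rule scored at 50.0 marks the message as spam all by itself.
alone_high = total_score({"LOCAL_SEX": 50.0}, ["LOCAL_SEX"]) >= THRESHOLD

# The same rule at 0.01 cannot cross the threshold without other rules.
alone_low = total_score({"LOCAL_SEX": 0.01}, ["LOCAL_SEX"]) >= THRESHOLD

print(len(naive_hits), len(bounded_hits), alone_high, alone_low)
```

Note that even the word-bounded pattern still fires on the legitimate HR message, so a tighter regex alone does not remove false positives. That is the case for keeping a homebrew rule's score low while it is being tested.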
This just goes back to the same tip: one rule should not make a huge
difference! Let several rules make the difference. Why score something at
50.0 if you mark it spam at 5.0?

--chris

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk