On Sun, May 26, 2002 at 10:55:34PM +0200, Tony L. Svanstrom wrote:
> On Sun, 26 May 2002 the voices made Bart Schaefer write:
> 
> > On Sun, 26 May 2002, Jason Rimmer wrote:
> >
> > >      Why is that?  It would appear to me that a web_bug is an excellent
> > > indicator, and a nasty one at that, of spam.
> >
> > A web bug is an indicator of a commercial site that is trying to track
> > your activity.  However, not all commercial email is unsolicited, and web
> > bugs are increasingly in use by legitimate [*] commercial sites.  Also,
> > the web bug criterion is a bit loose -- any image URI with query parameters
> > is a match, not just those with some kind of ID tag as implied by the
> > description.
> 
>  Should rules, clearly involving nasty things used by spammers, be removed when
> the scores go negative?

I think so. A rule designed to catch spam that ends up scored negatively,
because it hits non-spam more often than spam, is NOT a good indicator
of spam. It is merely a bad/false indicator of spam, and its regexp
should be changed to make it a better spam indicator.

If we want to have negative-scoring rules, we should try to put
together regexps that are actually non-spam indicators. The
DEAR_SOMEBODY rule is a perfect example. "Dear Sir/Madam" is a sign
of spam, "Dear Duncan" is not. I think we should add:

body DEAR_SIR /Dear (?:Sir|Madam|IT\b|friend\W|Internet)/i
describe DEAR_SIR       How dear am I? You don't know my name!

body DEAR_EMAIL /Dear [A-Za-z0-9_-]+\@/
describe DEAR_EMAIL     How dear am I? You call me by my e-mail address!

This would significantly lower the score for DEAR_SOMEBODY and give
high scores for DEAR_SIR and DEAR_EMAIL if I'm not mistaken. I've just
filed bug 352 for this matter.
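
As a rough sanity check of the two patterns above, here is a minimal
sketch in Python. It assumes Python's re module behaves like the Perl
engine for these particular patterns; the helper name matched_rules and
the sample greetings are made up for illustration and are not part of
SpamAssassin.

```python
import re

# Patterns copied from the proposed rules. Python's re stands in for
# Perl here, which is an assumption that holds for this simple syntax.
DEAR_SIR = re.compile(r"Dear (?:Sir|Madam|IT\b|friend\W|Internet)", re.IGNORECASE)
DEAR_EMAIL = re.compile(r"Dear [A-Za-z0-9_-]+\@")


def matched_rules(body):
    """Return the names of the proposed rules that hit this body text."""
    hits = []
    if DEAR_SIR.search(body):
        hits.append("DEAR_SIR")
    if DEAR_EMAIL.search(body):
        hits.append("DEAR_EMAIL")
    return hits


# Hypothetical greetings, not taken from any mail corpus.
for greeting in ("Dear Sir/Madam,", "Dear friend,", "Dear duncan@example.com,", "Dear Duncan,"):
    print(greeting, matched_rules(greeting))
```

The generic greetings hit DEAR_SIR, the address greeting hits
DEAR_EMAIL, and "Dear Duncan," hits neither.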

>  My thinking is that when it involves such a bad thing as tracking the user it
> might be better to allow other rules to do a better job (giving the GA-based
> scores a clearer black/white-situation).
>  Besides, whitelisting is an important part of all filtering...

One of SpamAssassin's great features is that it requires more than one
thing to classify something as spam.
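
To illustrate what I mean, here is a small sketch of additive scoring
against a threshold. The scores, the threshold value and the function
names are all hypothetical, chosen only to show that no single hit is
enough on its own; real scores come out of the GA run.

```python
# Hypothetical per-rule scores, for illustration only.
SCORES = {"DEAR_SIR": 2.0, "DEAR_EMAIL": 2.5, "WEB_BUG": 1.0}

# Assumed threshold for this sketch; a message at or above it is spam.
THRESHOLD = 5.0


def total_score(hits):
    """Sum the scores of every rule that matched; unknown rules count as 0."""
    return sum(SCORES.get(name, 0.0) for name in hits)


def is_spam(hits):
    """A message is flagged only when the combined score reaches the threshold."""
    return total_score(hits) >= THRESHOLD


print(is_spam(["WEB_BUG"]))                            # one hit alone
print(is_spam(["DEAR_SIR", "DEAR_EMAIL", "WEB_BUG"]))  # several hits together
```

With these made-up numbers a lone web bug stays well under the
threshold, while the three hits together cross it.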

-- 
Duncan Findlay

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
