RE: [SAtalk] Bayes mis-learning problem

Larry Gilson Mon, 19 Jan 2004 22:15:26 -0800

Thanks for clarifying, Justin!

--Larry


> -----Original Message-----
> From: [EMAIL PROTECTED]
> Sent: Monday, January 19, 2004 11:35 PM
> To: Larry Gilson
> Cc: 'Ross Vandegrift'; [EMAIL PROTECTED]
> Subject: Re: [SAtalk] Bayes mis-learning problem


> Larry Gilson writes:
> > > In a broader sense though, shouldn't fields like To: be excluded by
> > > default?  It seems like if I receive more than 50% spam, this is a
> > > recipe for disaster.  Of course, some spam won't have a valid To:
> > > field, but it seems like constant things like this will be very bad
> > > arbiters.
> >
> > That seems like a reasonable assumption in that specific case.  However, it
> > may not be a good assumption with the large audience that SA serves.  Some
> > people get more spam than ham, others get more ham than spam, and yet others
> > get roughly even amounts.  It sounds like your experience is more on the
> > extreme upper-end of spam/ham ratio.  I would think that at either end of
> > the spectrum though, the To: field is not a good indicator of either spam or
> > ham regardless of the numbers.
> 
> Actually, it works quite well.  Some people get more spam than ham to
> specific To addrs, so those become spam signs -- but once a ham arrives
> at those addrs, the ham signs outweigh the To spam-sign and redeem
> the mail.
> 
> At least, that's how it worked out in our testing; initially we did
> not tokenise these headers, but in testing, we found that they did
> increase accuracy.
> 
> --j.



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk