Re: *****SPAM***** [SAtalk] Troubling new scores in 2.1 release

2002-02-28 Thread Olivier Nicole
Hi Bart, >When I installed SA on my ISP's mailserver, I also set up a cron job to >mail me a condensed report of the spams it had diverted. I had to put a >special rule in .procmailrc to avoid invoking SA on the spam report, as >I found that a large number of SA's rules will match their own nam

Re: [SAtalk] Troubling new scores in 2.1 release

2002-02-28 Thread Rob McMillin
Michael Moncur wrote: >>body CORRECT_FOR_EXCHANGE /This message is in MIME format/ >>describe CORRECT_FOR_EXCHANGE Correct for MIME 'null block' >> > >FYI, I seem to recall SA already having a test like this. You might want to >double-check. > Yes, it's called MIME_NULL_BLOCK. (I'm lookin

Re: *****SPAM***** [SAtalk] Troubling new scores in 2.1 release

2002-02-28 Thread Bart Schaefer
On Thu, 28 Feb 2002, Gunter Ohrner wrote: > On Thursday, 28 February 2002 00:39, Bart Schaefer wrote: > > SPAM: Hit! (6.5 points) BODY: Link to a URL containing "remove" > > Where did Bart's message hit the test? That's certainly a false positive. :-) The A_HREF_TO_REMOVE rule matched the lit

RE: [SAtalk] Troubling new scores in 2.1 release

2002-02-28 Thread Michael Moncur
> To me, -ve scores on tests can also be used to "offset" spammy messages in > clean email. I have several of these of my own creation: Well, yes, that's true - SpamAssassin already includes a bunch of these, such as COPYRIGHT_CLAIMED and PHP_SIGNATURE. What I was talking about was the fact that

RE: [SAtalk] Troubling new scores in 2.1 release

2002-02-28 Thread Andrew Kohlsmith
> I know there are theoretical reasons why this might make sense, but I don't > see any benefit in the real world for scores like these. The high scores > increase the chance of a random false positive - regardless of the size of > the existing corpus - and if the negative ones indicate that the r

Re: *****SPAM***** [SAtalk] Troubling new scores in 2.1 release

2002-02-28 Thread Gunter Ohrner
Hi! Interesting: On Thursday, 28 February 2002 00:39, Bart Schaefer wrote: > SPAM: Start SpamAssassin results > -- SPAM: This email most likely contains > unsolicited advertising (SPAM). SPAM: The emai

Re: [SAtalk] Troubling new scores in 2.1 release

2002-02-27 Thread Andrew Kohlsmith
> SPAM: Hit! (4.9 points) BODY: URL of page called "remove" > SPAM: Hit! (6.5 points) BODY: Link to a URL containing "remove" No, not impressive. Those two scores would put a whole lot of honest opt-in web "flyers" and likely many mailing lists in the spam bucket. I'm strongly opposed to any
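To make the arithmetic behind this objection concrete, here is a minimal sketch of how the two quoted "remove" scores add up. The 4.9 and 6.5 values come from the report quoted above; the threshold of 5.0 and the "greater than or equal" comparison are assumptions about a typical SpamAssassin setup, not something stated in this thread, and the function name is hypothetical.

```python
# Scores taken from the quoted SpamAssassin report.
REMOVE_RULE_SCORES = [4.9, 6.5]

# ASSUMPTION: a spam threshold of 5.0; the thread does not state the value in use.
ASSUMED_THRESHOLD = 5.0


def is_flagged(scores, threshold=ASSUMED_THRESHOLD):
    """Return True when the summed rule scores reach the threshold."""
    return sum(scores) >= threshold


total = sum(REMOVE_RULE_SCORES)
print(f"combined score: {total:.1f}")
print("flagged by both rules together:", is_flagged(REMOVE_RULE_SCORES))
print("flagged by the 6.5 rule alone:", is_flagged([6.5]))
print("flagged by the 4.9 rule alone:", is_flagged([4.9]))
```

Under that assumed threshold, the 6.5-point rule is enough on its own to flag a message, and the 4.9-point rule needs only another 0.1 points from any other test.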

Re: [SAtalk] Troubling new scores in 2.1 release

2002-02-27 Thread Daniel Rogers
On Wed, Feb 27, 2002 at 05:15:20PM -0800, Craig R Hughes wrote: > I meant single score, but yet, that message is pretty impressive. I assume it > was not a false-positive :) Uh, yeah, it was real spam. :) I just found a 47.1 hits one, even though it had two -ve scores (HTTP_USERNAME_USED and

Re: [SAtalk] Troubling new scores in 2.1 release

2002-02-27 Thread Craig R Hughes
> On Wed, Feb 27, 2002 at 05:00:29PM -0800, Craig R Hughes wrote: > > Yes, the large rule scores probably do make the system more sensitive to minor > > variations in input. How

Re: [SAtalk] Troubling new scores in 2.1 release

2002-02-27 Thread Daniel Rogers
On Wed, Feb 27, 2002 at 05:00:29PM -0800, Craig R Hughes wrote: > Yes, the large rule scores probably do make the system more sensitive to minor > variations in input. However, they also apparently lead to more accurate > scores. It is interesting that even running unconstrained over 50,000 >

Re: [SAtalk] Troubling new scores in 2.1 release

2002-02-27 Thread Craig R Hughes
> On Wed, 27 Feb 2002, Craig R Hughes wrote: > > > This isn't really a problem. It can actually be helpful too to allow > >

Re: [SAtalk] Troubling new scores in 2.1 release

2002-02-27 Thread Bart Schaefer
On Wed, 27 Feb 2002, Craig R Hughes wrote: > This isn't really a problem. It can actually be helpful too to allow > the GA to do its own thing [...] On Wed, 27 Feb 2002, Tom Lipkis wrote: > With large scores like this (positive or negative), very small > perturbations in input can cause wildly
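The sensitivity concern quoted here can be illustrated with a small sketch. It counts how many single additional rule hits would flip a message's verdict, once for a set of large-magnitude scores and once for a set of small ones. Everything in it is hypothetical: the baseline of 3.0, the threshold of 5.0, both score sets, and the function names are made up for illustration and are not taken from any real scores file.

```python
# ASSUMPTION: spam threshold of 5.0; not stated in the thread.
ASSUMED_THRESHOLD = 5.0


def verdict(score, threshold=ASSUMED_THRESHOLD):
    """True means the message would be treated as spam."""
    return score >= threshold


def count_flips(baseline, rule_scores, threshold=ASSUMED_THRESHOLD):
    """Count rules whose single extra hit would change the baseline verdict."""
    before = verdict(baseline, threshold)
    return sum(
        1 for s in rule_scores if verdict(baseline + s, threshold) != before
    )


# Hypothetical message sitting below the threshold before any extra hit.
baseline = 3.0

# Hypothetical score sets: one with large magnitudes, one with small ones.
large_scores = [6.5, -7.0, 4.9]
small_scores = [1.5, -1.0, 1.2]

print("flips with large scores:", count_flips(baseline, large_scores))
print("flips with small scores:", count_flips(baseline, small_scores))
```

With these made-up numbers, two of the three large-score rules flip the verdict on a single hit, while none of the small-score rules do. That is the kind of perturbation the quoted objection describes. It says nothing about overall accuracy, which is the counterpoint raised elsewhere in the thread.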

Re: [SAtalk] Troubling new scores in 2.1 release

2002-02-27 Thread Craig R Hughes
ert. I think you'll probably end up with worse results though. C Bart Schaefer wrote: > I'

[SAtalk] Troubling new scores in 2.1 release

2002-02-27 Thread Bart Schaefer
I've diffed the r1.37 and r1.38 rules/50_scores.cf and some of the changes are so unbelievable that I've decided not to install the new scores file. Here's just a sampling: r1.37 r1.38 ---- score 25FREEMEGS_U