I'm leaning toward doing some extra sanity checking in 2.11 or 2.2 -- I was 
just having too much fun stirring up a lively discussion to come out and say 
so earlier :)

On the other hand, though, I think it's possible that Justin is a little *too* 
conservative about letting the GA go crazy.

Primarily, what I'm planning to play with for 2.11 or 2.2 is:

1. Limit scores to -5..5, possibly narrower.
2. Do some more robust analysis of the results of score-setting, in particular 
checking which rules are most commonly tripped by false positives and false 
negatives.
3. Get nonspam corpus submitters to manually verify every false positive 
from their submissions.
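
For what it's worth, items 1 and 2 could be sketched roughly like this. It is 
only an illustration, not the actual GA or mass-check code: the function names, 
the rule names in the usage note, the (is_spam, rules_hit) message format, and 
the threshold of 5.0 are all assumptions made for the example.

```python
from collections import Counter

# Assumed bounds from item 1; "possibly narrower" would just change these.
SCORE_MIN, SCORE_MAX = -5.0, 5.0


def clamp_scores(scores, lo=SCORE_MIN, hi=SCORE_MAX):
    """Return a copy of the rule -> score map with each score forced into [lo, hi]."""
    return {rule: max(lo, min(hi, score)) for rule, score in scores.items()}


def misfire_counts(messages, scores, threshold=5.0):
    """Count how often each rule fires on misclassified messages.

    messages: iterable of (is_spam, rules_hit) pairs (hypothetical format).
    threshold: total score at which a message is flagged as spam (assumed).
    Returns (fp_counts, fn_counts) as Counters keyed by rule name.
    """
    fp_counts, fn_counts = Counter(), Counter()
    for is_spam, rules_hit in messages:
        total = sum(scores.get(rule, 0.0) for rule in rules_hit)
        flagged = total >= threshold
        if flagged and not is_spam:
            fp_counts.update(rules_hit)   # nonspam flagged as spam
        elif is_spam and not flagged:
            fn_counts.update(rules_hit)   # spam that slipped through
    return fp_counts, fn_counts
```

Usage would be something like clamped = clamp_scores(ga_scores) followed by 
fp, fn = misfire_counts(corpus, clamped); fp.most_common(10) then lists the 
rules that show up most often in false positives, which are the first ones 
worth checking by hand.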

I may also add some manual verification to that.  I should probably put it on 
record that I did spend about 4 or 5 days cleaning the corpus out, running 
the GA, manually tweaking, rerunning, tweaking, rerunning, etc.  And if there 
are any problems in the scoreset, it's Duncan's fault for forcing me to do a 
release so he could get in before the Debian freeze :)

C

Justin Mason wrote:

> Date: Thu, 28 Feb 2002 06:23:42 -0000 (GMT)
> From: Justin Mason <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: [SAtalk] new, larger, GA scores
> 
> > I haven't installed 2.1, but I agree that the new scores are worrisome.
> > With large scores like this (positive or negative), very small
> > perturbations in input can cause wildly different results, which seems
> > undesirable.  I'd like to hear Justin's take on this, if he's not
> > incommunicado.
> 
> I -- personally -- prefer to limit score ranges quite tightly, and do some
> manual sanity checking of the GA scores, even if this damages the hit 
> percentages.
> 
> but at the mo' it's up to Craig, as I'm off doing other stuff ;)
> 
> --j.
> 
> 
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
> 
> 
> 


