On Wed, Feb 27, 2002 at 03:30:32AM -0700, Michael Moncur wrote:
> I might be wrong, but I think there's something seriously amiss with the new
> GA-evolved scores - they don't seem to have an upper boundary (many are 9-10 or
> so) or a lower (some are negative). Some examples that can't be right:
> 
> score 25FREEMEGS_URL                 -4.606
> score BE_AMAZED                      -4.581
> score CASHCASHCASH                   -3.700
> score CYBER_FIRE_POWER               -4.020
> score DEAR_SOMEBODY                  -4.412
> score EXCUSE_5                       13.447
> score FREE_CONSULTATION              15.263
> score GAPPY_TEXT                     -3.667
> score IN_REP_TO                      -13.472
> score MONSTERHUT                     -8.280
> score ONCE_IN_LIFETIME               -4.604
> score PORN_8                         -5.452
> score TRACKER_ID                     -4.899
> 
> Aside from the boundary issue, is there perhaps something odd about the corpus?
> It doesn't include messages from this list, does it? I would think there would
> be no trouble calculating a positive score for something like "Monsterhut"...
> 

Ummm... I'd be heavily inclined to set the scores for these rules to 0.01.
It's not that I don't trust the GA; it's just that if these are the scores
it comes up with, the rules aren't needed in the first place.
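
A minimal sketch of that override, assuming "these" means every rule in the
quoted list and that the overrides would go into a local rules file. The
helper name `neutralize` and its `floor` parameter are hypothetical, made up
for illustration, and only a subset of the quoted lines is used as input:

```python
import re

# A few of the score lines quoted above, copied verbatim.
QUOTED = """\
score 25FREEMEGS_URL                 -4.606
score EXCUSE_5                       13.447
score IN_REP_TO                      -13.472
score MONSTERHUT                     -8.280
"""

# Matches "score RULE_NAME <number>" and captures the name and the value.
SCORE_LINE = re.compile(r"^score\s+(\S+)\s+(-?\d+(?:\.\d+)?)\s*$")


def neutralize(text, floor=0.01):
    """Return one 'score NAME <floor>' override line per score line in text.

    Hypothetical helper: it ignores the original value entirely and
    pins every listed rule to the same near-zero score.
    """
    overrides = []
    for line in text.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            overrides.append(f"score {match.group(1)} {floor:.2f}")
    return overrides


overrides = neutralize(QUOTED)
print("\n".join(overrides))
```

Using 0.01 rather than 0 follows the suggestion above; the rule then adds
almost nothing to a message's total score in either direction.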

-- 
Duncan Findlay

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
