Now, having just said that, I've realized that one thing Justin didn't give me access to (I don't think) is the corpus before it's been passed through mass-check! Hopefully you're still there, Justin, and we can figure something out.
C
On Wed, 2002-01-30 at 18:33, Greg Ward wrote:
On 31 January 2002, Olivier Nicole said:
> I wonder if/how I should/could update the weights that are given
> by the genetic algorithm.
>
> I know little about GA, but I think I remember (some 12 or 15 years
> ago) that it needed quite big samples.
>
> So I believe I should keep all incoming messages, mark them as spam or
> not spam and run GA on it.

You don't run SpamAssassin's genetic algorithm -- I gather that only
Justin Mason, the prime developer, does that currently. He has a big
huge pile ("the corpus") of mail, spam and non-spam, that is used to
feed the GA and generate the scores in everyone's
/usr/share/spamassassin/*.cf files. Clever, eh?

I'm sure it would be possible for everyone to have their own corpus of
mail, and if Justin released the GA code (or has he already?) then we
could all run the GA ourselves and come up with our own score sets.
But why bother?

        Greg
--
Greg Ward - software developer    [EMAIL PROTECTED]
MEMS Exchange                     http://www.mems-exchange.org

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
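
For anyone wondering what "feed the corpus to the GA and get scores out" might look like, here is a toy sketch in Python. It is not SpamAssassin's actual GA code, which I haven't seen. The rule names, the seven-message corpus, the 5.0 threshold and the mutation settings are all made up for illustration; the only idea taken from the thread is that per-rule scores are evolved so that hand-labelled spam lands above a threshold and non-spam lands below it.

```python
import random

# Hypothetical toy corpus: (names of rules that fired, hand-assigned is_spam label).
# The rule names are invented for this sketch, not real SpamAssassin rules.
CORPUS = [
    ({"SHOUTING_SUBJECT", "MENTIONS_FREE_MONEY"}, True),
    ({"MENTIONS_FREE_MONEY"}, True),
    ({"SHOUTING_SUBJECT", "HTML_ONLY_BODY"}, True),
    ({"HTML_ONLY_BODY"}, False),
    (set(), False),
    ({"FROM_KNOWN_LIST"}, False),
    ({"FROM_KNOWN_LIST", "HTML_ONLY_BODY"}, False),
]
RULES = sorted({rule for hits, _ in CORPUS for rule in hits})
THRESHOLD = 5.0  # assumed cutoff: a message scoring at or above this is called spam


def errors(scores, corpus=CORPUS):
    """Count messages that a given score set misclassifies."""
    wrong = 0
    for hits, is_spam in corpus:
        total = sum(scores[rule] for rule in hits)
        if (total >= THRESHOLD) != is_spam:
            wrong += 1
    return wrong


def evolve(generations=200, pop_size=30, seed=0):
    """Evolve a score set: keep the better half, breed and mutate the rest."""
    rng = random.Random(seed)
    # Start with one all-zero score set plus random ones.
    pop = [{rule: 0.0 for rule in RULES}]
    pop += [{rule: rng.uniform(-5, 5) for rule in RULES} for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=errors)               # fewest misclassifications first
        survivors = pop[: pop_size // 2]   # survivors are carried over unchanged
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.sample(survivors, 2)
            # Uniform crossover per rule, plus a small Gaussian mutation.
            children.append(
                {rule: rng.choice((a[rule], b[rule])) + rng.gauss(0, 0.5) for rule in RULES}
            )
        pop = survivors + children
    return min(pop, key=errors)


if __name__ == "__main__":
    best = evolve()
    print("misclassified:", errors(best))
    for rule in RULES:
        print(f"score {rule} {best[rule]:.2f}")
```

Because the better half of each generation survives unchanged, the best score set never gets worse from one generation to the next. With a corpus this small the result says nothing about real mail, which is Olivier's point about needing quite big samples.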