Ok, well I now understand how the mass check and GA work.

What I would like to do is play around with removing some rules and then
re-running the GA to get new scores/statistics and see how they compare.  I
know that I can just set the scores for these rules to zero, but
theoretically re-generating the scores without those rules should produce
much better results.

So I'd like to get access to the mass check logs that were used to generate
the scores in the current release (so that I can re-run the GA and compare
my results to make sure I did it right).  Assuming that I get that far, I'll
pull all the rules I don't like from the logs, run the GA again, then
compare the statistics.
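
To make the "pull the rules from the logs" step concrete, here is a rough
Python sketch of what I have in mind. It is only an illustration under an
assumption I have not verified: that each log line is whitespace-separated
and that the fourth field is a comma-separated list of the rules that hit.
The function names and the rules_field parameter are my own, not anything
from the "masses" directory.

```python
def strip_rules(line, unwanted, rules_field=3):
    """Drop unwanted rule names from one mass-check log line.

    ASSUMPTION: fields are whitespace-separated and field number
    `rules_field` (0-based) holds a comma-separated list of rule names.
    Adjust rules_field if the real log layout differs.
    """
    # Pass comment lines and blank lines through untouched.
    if line.startswith("#") or not line.strip():
        return line
    fields = line.split()
    # Lines with too few fields have no rule list to edit.
    if len(fields) <= rules_field:
        return line
    kept = [r for r in fields[rules_field].split(",")
            if r and r not in unwanted]
    fields[rules_field] = ",".join(kept)
    # Note: runs of spaces are collapsed to a single space on output.
    return " ".join(fields) + "\n"


def filter_log(lines, unwanted):
    """Apply strip_rules to every line of a log (any iterable of str)."""
    return [strip_rules(line, unwanted) for line in lines]
```

Run over both the spam and the non-spam log, this would give the GA input
that no longer mentions the removed rules. Their entries would also have to
come out of the rules files, which this sketch does not touch.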

I read through the stuff in "masses", but I don't see a way to access the
mass check result logs.  Are they public, and if so, how can I access them?

Bob Dickinson

----- Original Message ----- 
From: "Theo Van Dinter" <[EMAIL PROTECTED]>
To: "Bob Dickinson (BSL)" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Thursday, February 20, 2003 8:30 PM
Subject: Re: [SAtalk] The "Big" Corpus

On Thu, Feb 20, 2003 at 07:02:15PM -0800, Bob Dickinson (BSL) wrote:
> I'm not entirely clear on whether this means what I think it means, but I
> would very much like to get the "big" corpus to test against (the roughly
> 250,000 messages that seem to be used to generate the mass-check results in
> the distro).  I've been using the SpamAssassin public corpus, but I'd really
> like something bigger.  Is this possible, and if so, how do I get access?

I don't know about making corpora available (although Justin did make
some messages available), but the mass-check results used are available.
You just need to get an rsync account and then transfer the logs to your
machine. :) There's info in the masses directory about rsync.

The mass-check results are generated by many different individuals'
corpora, and their personal mails aren't going to be available (not
surprising).

-- 
Randomly Generated Tagline:
HELP! It's dark in here...oh, my eyes were closed; sorry.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk