At 02:54 AM 10/23/2003, Yuan-Chung Hsiao wrote:
        I am now researching SpamAssassin's genetic algorithm, but I don't
understand how SpamAssassin works with the GA and the Bayes classifier.
        Does anybody know?

The Genetic Algorithm is not a part of SpamAssassin itself. The Genetic Algorithm is an external program used by the developers to assign scores to all the rules prior to release.


Basically, the GA is given three things as input:
1) a list of rules
2) the match results of those rules against a large sample of non-spam email
3) the match results of those rules against a large sample of spam email
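
As a rough picture of those three inputs, here is a small Python sketch. The rule names and match data are invented for illustration and are not taken from a real mass-check run; the real tools use their own file formats, which this does not try to reproduce.

```python
# Hypothetical toy data: rule names and hits are invented for illustration.
rules = ["RULE_A", "RULE_B", "RULE_C"]

# Each entry is the set of rules that matched one message.
ham_hits = [set(), {"RULE_C"}, {"RULE_A"}]
spam_hits = [{"RULE_A", "RULE_B"}, {"RULE_B"}, {"RULE_A", "RULE_B", "RULE_C"}]


def hit_rate(rule, corpus):
    """Fraction of messages in the corpus that the given rule matched."""
    return sum(rule in hits for hits in corpus) / len(corpus)


# Show how often each rule fires on non-spam versus spam.
for rule in rules:
    print(rule, hit_rate(rule, ham_hits), hit_rate(rule, spam_hits))
```

A rule that fires often on spam and rarely on non-spam is a natural candidate for a high score, which is the kind of pattern the GA is left to discover from the match results.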


The GA then starts assigning scores to all the rules, and iteratively adjusts them so that as many emails as possible end up in the correct pile (spam or non-spam). The results of this can be seen in the STATISTICS.txt files that are included with SpamAssassin.
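
To make the iterative adjustment concrete, here is a minimal Python sketch of a generic evolutionary search over rule scores. It is not the algorithm in craig-evolve.c: the population size, mutation step, fitness function (a plain count of misclassified messages) and the toy data are all assumptions made for the example. The only SpamAssassin-specific value is the default spam threshold of 5.0.

```python
import random

THRESHOLD = 5.0  # SpamAssassin's default spam threshold


def total(scores, hits):
    """Sum of the scores of the rules that matched one message."""
    return sum(scores[rule] for rule in hits)


def errors(scores, ham_hits, spam_hits):
    """Count non-spam flagged as spam plus spam that slipped through."""
    false_pos = sum(total(scores, h) >= THRESHOLD for h in ham_hits)
    false_neg = sum(total(scores, h) < THRESHOLD for h in spam_hits)
    return false_pos + false_neg


def evolve(rules, ham_hits, spam_hits, generations=50, pop_size=20, seed=0):
    """Toy evolutionary search: keep the best half, breed and mutate the rest."""
    rng = random.Random(seed)
    # Start with one all-zero individual plus random score sets.
    population = [{r: 0.0 for r in rules}]
    population += [{r: rng.uniform(-5, 5) for r in rules}
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda s: errors(s, ham_hits, spam_hits))
        survivors = population[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.sample(survivors, 2)
            # Crossover: take each rule's score from one of the two parents.
            child = {r: rng.choice((a[r], b[r])) for r in rules}
            # Mutation: nudge one randomly chosen rule's score.
            child[rng.choice(rules)] += rng.gauss(0, 1)
            children.append(child)
        population = survivors + children
    return min(population, key=lambda s: errors(s, ham_hits, spam_hits))


# Hypothetical match results, invented for illustration.
rules = ["RULE_A", "RULE_B"]
ham_hits = [set(), {"RULE_A"}]
spam_hits = [{"RULE_B"}, {"RULE_A", "RULE_B"}]

best = evolve(rules, ham_hits, spam_hits)
print(best, errors(best, ham_hits, spam_hits))
```

Because the best half of each generation is carried over unchanged, the error count of the best score set never gets worse from one generation to the next.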

From the perspective of the GA, the Bayes subsystem is just a group of rules. Each Bayes rule represents a range of Bayes probabilities. For example, BAYES_30 represents Bayes probabilities between 30% and 40%. Since they are just rules to the GA, it assigns them scores the same way it does for other rules.
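
A short Python sketch of that idea follows. Only the BAYES_30 range comes from the description above; the two neighbouring buckets, their names, and the choice of which end of each range is inclusive are placeholders assumed for the example, not SpamAssassin's real rule set.

```python
# Only BAYES_30 (0.30 to 0.40) is taken from the text; the EXAMPLE_* buckets
# are hypothetical placeholders so that every probability maps somewhere.
BUCKETS = [
    ("EXAMPLE_LOW", 0.00, 0.30),
    ("BAYES_30", 0.30, 0.40),
    ("EXAMPLE_HIGH", 0.40, 1.00),
]


def bayes_rule_for(probability):
    """Return the name of the bucket rule that a Bayes probability falls in."""
    for name, low, high in BUCKETS:
        # Assumed convention: lower bound inclusive, upper bound exclusive,
        # except that the last bucket also accepts exactly 1.0.
        if low <= probability < high or (high == 1.0 and probability == 1.0):
            return name
    raise ValueError("probability must be between 0 and 1")


print(bayes_rule_for(0.35))
```

Once a message's Bayes probability has been turned into a rule name like this, the GA sees that name as one more matched rule and scores it like any other.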

If you need more information on the GA and related tools, you can look in the masses subdirectory of the SpamAssassin tarball:
mass-check is the tool used to generate the match results for 2) and 3) above.
craig-evolve.c is the source code for the GA score evolver.
runGA is a script that helps automate the process of running the GA.





