On Thu, 23 Dec 2004, Matt Kettler wrote:
> At 12:06 PM 12/23/2004, John wrote:
> >Matt,
> >I appreciate this info! Is there a place where I can go to find more about
> >how this all works?
>
> Not that I'm aware of. There's some bits of information in the wiki, but
> there's no "one general source" of information...
>
> http://wiki.apache.org/spamassassin/HowScoresAreAssigned?action=highlight&value=perceptron
>
> Probably the best block of information is in the readme for the perceptron
> itself:
>
> http://spamassassin.apache.org/full/3.0.x/dist/masses/README.perceptron
>
> The rest of it tends to come from having a feel for statistics and how
> statistical systems operate, and a little bit of watching the devs discuss
> concepts over the years.
>

Matt,

I checked out these sources earlier, but as you say, they are just bits and pieces. I guess there is no single place that has everything needed to generate scores.

I am currently testing some of the rulesets found in rules_du_jour and I am running into a problem. These rules hit my spam corpus, and when I run mass-check the spam.log file shows entries for them. But when I run the perceptron, perceptron.scores has no entries for these custom rules. Also, when I run hit-frequencies, none of these rules show up.

I placed the custom rules in Mail-SpamAssassin-3.0.1/masses/spamassassin/local.cf. I also placed them in Mail-SpamAssassin-3.0.1/rules/local.cf. It seems that mass-check finds them but perceptron doesn't. http://wiki.apache.org/spamassassin/MassCheck suggests running mass-check on custom rules, but it doesn't really describe where these rules should be placed.

In going through the code of hit-frequencies (which also cannot find these rules), I noticed that it calls parse-rules-for-masses, which apparently only checks the rules directory for files matching "[0-9]*.cf". A file named local.cf would not match that pattern. I haven't looked through perceptron.c yet, but before I start investing a lot of time I was wondering if you have run into this problem.

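
As a quick sanity check on that glob, here is a small sketch in Python (not the Perl that parse-rules-for-masses actually uses). It assumes the pattern is exactly "[0-9]*.cf" as I read it in the script; 99_custom.cf is a made-up file name used only for illustration:

```python
from fnmatch import fnmatchcase

# Pattern as I read it in parse-rules-for-masses (an assumption, not verified
# against every release).
RULE_GLOB = "[0-9]*.cf"


def picked_up(filenames, pattern=RULE_GLOB):
    """Return the file names that the shell-style glob would match."""
    return [name for name in filenames if fnmatchcase(name, pattern)]


# Illustrative directory listing; 99_custom.cf is hypothetical.
candidates = ["10_misc.cf", "20_body_tests.cf", "local.cf", "99_custom.cf"]
matched = picked_up(candidates)
print(matched)  # ['10_misc.cf', '20_body_tests.cf', '99_custom.cf']
```

If the pattern really is what the tools use, that would explain why rules kept in local.cf never reach hit-frequencies, whereas the same rules in a file whose name starts with a digit might.
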
One more thing: I run hit-frequencies and perceptron without any options that change the rules directory, so they should be running with the default of ../rules.

Thanks,
John