I got a bit of curiosity in my brain about neural networks, and someone suggested I take a look at how SpamAssassin trains itself. I have been looking into .../masses and have come across some things which set off warning bells. I don't think I have actually found any bugs, but it isn't clear to me what is going on, there are some unused variables, and I pathetically justify my intrusion on your time with the thought that there *might* be a bug ... :-)

The code generated in tmp/scores.h by logs-to-c includes these three variables:

    ny_hit[$num_mutable]
    yn_hit[$num_mutable]
    lookup[$num_mutable]

which appear to never be used by either perceptron.c or any generated code.

It also looks like $num_mutable has almost no use; besides setting the size of these unused arrays, it governs the weight decay loop, which looks to be bypassed under default conditions.

A bit more poking shows that num_scores in perceptron.c, set from $size in logs-to-c, is used for all other array sizes, including the weights, and for all related loops, including scaling and printing the weights.

What puzzles me is the print loop at the end of write_weights():

    for (i = 0; i < num_scores; i++) {
        if ( is_mutable[i] ) {
            fprintf(fp, "score %-30s %2.3f # [%2.3f..%2.3f]\n",
                    score_names[i],
                    weight_to_score(weights[i]),
                    range_lo[i], range_hi[i]);
        } else {
            fprintf(fp, "score %-30s %2.3f # not mutable\n",
                    score_names[i], range_lo[i]);
        }
    }

The weight decay loop operates only on the first num_mutable entries of the weights array, implying that it, and presumably all other arrays sized by num_scores, are set up with mutable scores first, followed by non-mutable scores. Thus this loop could be rewritten like this:

    for (i = 0; i < num_scores; i++) {
        if ( i < num_mutable ) {
            fprintf(fp, "score %-30s %2.3f # [%2.3f..%2.3f]\n",
                    score_names[i],
                    weight_to_score(weights[i]),
                    range_lo[i], range_hi[i]);
        } else {
            fprintf(fp, "score %-30s %2.3f # not mutable\n",
                    score_names[i], range_lo[i]);
        }
    }

or even like this:

    for (i = 0; i < num_mutable; i++) {
        fprintf(fp, "score %-30s %2.3f # [%2.3f..%2.3f]\n",
                score_names[i],
                weight_to_score(weights[i]),
                range_lo[i], range_hi[i]);
    }
    for (; i < num_scores; i++) {
        fprintf(fp, "score %-30s %2.3f # not mutable\n",
                score_names[i], range_lo[i]);
    }

Is this right? I have been doing so much Perl recently that C is beginning to look funny, like reading Mark Twain after too much Charles Dickens.
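
The rewrites above are only equivalent to the original loop if is_mutable[i] is true exactly when i < num_mutable. A rough sketch of how that could be checked before trusting either rewrite follows. mutable_first() is a hypothetical helper of my own, not anything in perceptron.c, and I am assuming is_mutable can be read as an array of int flags, which I have not confirmed against the generated tmp/scores.h.

```c
#include <assert.h>

/* Hypothetical helper, NOT part of perceptron.c.
 * Assumption: is_mutable is an array of int flags (nonzero = mutable).
 * Returns 1 when is_mutable[i] is set exactly for i < num_mutable,
 * i.e. all mutable scores come first and there are num_mutable of them.
 * Returns 0 as soon as one index disagrees. */
int mutable_first(const int *is_mutable, int num_scores, int num_mutable)
{
    int i;

    /* The count of mutable scores cannot exceed the total. */
    assert(num_mutable >= 0 && num_mutable <= num_scores);

    for (i = 0; i < num_scores; i++) {
        int flagged  = (is_mutable[i] != 0);  /* what the original loop tests */
        int by_index = (i < num_mutable);     /* what the rewrites test */
        if (flagged != by_index)
            return 0;
    }
    return 1;
}
```

If a call like mutable_first(is_mutable, num_scores, num_mutable) made at the top of write_weights() ever returned 0, the rewritten loops would print different output from the original, and the weight decay loop would be touching the wrong entries.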

What I am really trying to do is understand the neural network part of SpamAssassin, and I seem to have gotten sidetracked, as with all fun projects :-) I have gotten hung up on what mutable means for the code in .../masses/, and it does not seem particularly clear yet.

-- 
            ... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
     Felix Finch: scarecrow repairman & rocket surgeon / [EMAIL PROTECTED]
  GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o