Alex Woick wrote:
> ...very nice analysis of rule trimmed...

Thank you very much for taking the time to look so closely at that rule. I still think it is not behaving as it was originally intended and as such is scoring too heavily. I filed a bug on this issue so that it would not get lost.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5716

> Since these rules were assigned such a high score, only very few ham
> from the score-generating corpus (if any) seem to contain this
> misspelling.

Very likely the case. I think the typical email has mostly correctly spelled normal words with a splatter of text strings that are not in any dictionary.

> If I understand this process correctly, the scores are not manually
> determined but by a lengthy automatic analysis process for a big
> message corpus that tries to minimize scores for known ham and
> maximize scores for known spam as a whole.

Correct. It is machine scored.
http://wiki.apache.org/spamassassin/HowScoresAreAssigned

> What you can do:
> - lower the score for these rules manually

Already done. I reduced those to 0.5 each so that the combined score for a single misspelling would be only 1.0 points.

> - and perhaps give the SA developers your FP to include it into their
> corpus.

Sure, though such a message is also very easily created on the fly.

Thanks
Bob
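P.S. For anyone finding this in the archives, the manual override mentioned above would look roughly like the sketch below in local.cf. The rule names were trimmed from the quoted analysis, so EXAMPLE_RULE_A and EXAMPLE_RULE_B are only placeholders; substitute the two rules that actually fire on the misspelling.

```
# local.cf (sketch; rule names below are hypothetical placeholders)
# Two rules fire on the same misspelling, so 0.5 + 0.5 = 1.0 point total.
score EXAMPLE_RULE_A 0.5
score EXAMPLE_RULE_B 0.5
```

Running "spamassassin --lint" after the change should catch any syntax mistakes, and spamd needs a restart to pick up the new scores.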