Linda Walsh wrote:
> I have been receiving a spate of short messages that don't seem
> to trigger enough default rules to be knocked out. I was
> investigating and noticed a discrepancy [bug?] in the rules.
>
> One particular email refers to the uniquely Male-Body-Part starting
> w/"P", let's call MBP for purposes discussion.
>
> It gets hit by a '20' rule for body parts in the message body,
> but I noticed it doesn't get anything for the subject:

Yes it does. The text of the subject line will match against any body
rule. SA prepends the subject to the body text so we don't have to have
a massive duplication of rules to cover both body and subject.

> "Want a Bigger MBP?" A '25_replace' rule is present for "fuzzy"
> MBP's, but doesn't seem to catch unfuzzy ones.
> So I guess questions might be:
> 1) should 'fuzzy' rules match non-fuzzy targets as well
> as fuzzy ones?

IMHO, no. I think there should be two rules with separate scores. In the
above example the scores would be pretty much the same. However,
consider the word viagra: an obfuscated spelling is a clear sign of
spam. The un-obfuscated word is a weaker sign of spam, because it could
come from a joke or from a medical discussion of some form.

> 2) Should there be some "normalization" adjustment for
> short messages?
> I'm thinking a "scale factor" rather than an absolute score
> to add, -- reflecting the general idea that short messages
> are not bad, but if you are scoring on the "bad" side, a
> multiplier (ex. 1.1 or 1.2) would increase the score of a message
> that is already being sized up as "bad".
>
> Does SA support any multiplier type rules?

No.

> Should it, or
> rather, do people feel this is a good idea?

I don't feel that would be a good idea. Bear in mind this would also
make a "good" message (i.e. one at -1.0) "more good". It just doesn't
make sense to me to have something which merely acts as a "score
amplifier" instead of a score adjustment.

Performing any kind of GA to establish a reasonable multiplier value for
these would be a logistical nightmare.

You also get into an issue of order of operations. Does this multiplier
apply to the current score as of the moment the rule hits? Or, after the
total message score is calculated, do you make a second pass and factor
in all the multipliers, taking a slight performance hit for the extra
calculation run?
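
To make the order-of-operations question concrete, here is a rough
Python sketch. It is not SpamAssassin code, and SA has no multiplier
rules. The hit list, the scores and the 1.2 factor are all made up for
illustration, and the function names are hypothetical.

```python
# Hypothetical illustration only: SpamAssassin has no multiplier rules.
# The hits, scores, and the 1.2 factor are invented for this sketch.

def score_inline(hits):
    """Apply each multiplier to the running total at the moment it hits."""
    total = 0.0
    for kind, value in hits:
        if kind == "add":
            total += value
        else:  # "mul"
            total *= value
    return total


def score_two_pass(hits):
    """Sum all additive scores first, then apply multipliers in a second pass."""
    total = sum(value for kind, value in hits if kind == "add")
    for kind, value in hits:
        if kind == "mul":
            total *= value
    return total


# Same three hits, with the multiplier landing between two additive rules.
hits = [("add", 2.0), ("mul", 1.2), ("add", 3.0)]
inline = score_inline(hits)      # (0 + 2.0) * 1.2 + 3.0
two_pass = score_two_pass(hits)  # (2.0 + 3.0) * 1.2

# A "good" message at -1.0 is pushed further negative by the same multiplier.
ham = score_two_pass([("add", -1.0), ("mul", 1.2)])

print(inline, two_pass, ham)
```

The same set of hits gives a different total depending on which
interpretation you pick, and the inline version additionally depends on
the order in which rules happen to fire.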