On Fri, 2016-06-10 at 18:26 -0400, Bill Cole wrote:
> It will be interesting to see the stats on scantimes this week to see
> if my tightening up on sloppy rules has an impact. I expect it will,
> since I now have a concrete theory to explain that long tail out to 2
> minutes, which before now I've ignored as pure noise.

Thanks for the heads-up.
It prodded me into scanning my local rule set for unguarded '.*'. I found a few - more or less what I expected, and virtually all in my older rules - and limited them all to runs of up to 32 characters before running my spam corpus against them. That showed only two instances of a rule failing to fire where it previously had. Finding that rule and changing it to allow up to 64 characters in the 'don't care' parts of the regex fixed it.

FWIW, this is a rule that looks for two consecutive URLs in body text. I knew that some of these URLs can be quite long, but in all the spam I've inspected they were used as a more or less neat final line in the message, so they probably don't exceed 32 characters by much. So yes, while I know that 64 characters is probably overkill, it's not so much overkill that it's worth further fiddling.

Martin
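
P.S. Purely as an illustration of the kind of change involved - the rule name and exact pattern below are made up, not my actual rule:

  # Before (hypothetical): an unguarded .* lets the regex engine scan and
  # backtrack across the whole body line whenever the second URL is absent.
  #body    LOCAL_TWO_URLS   /https?:\/\/\S+.*https?:\/\/\S+/

  # After: bound the 'don't care' run so a failed match gives up quickly.
  body     LOCAL_TWO_URLS   /https?:\/\/\S+.{0,64}https?:\/\/\S+/
  describe LOCAL_TWO_URLS   Two consecutive URLs in body text
  score    LOCAL_TWO_URLS   1.0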