Paul R. Ganci wrote:
> I am using DCC-1.3.23 to do greylisting (dccd greylisting server with
> the dccm sendmail milter). As a result DCC checksums are performed prior
> to the Spamassassin 3.1.0 scan. Therefore rather than repeat the DCC
> checks in Spamassassin I have constructed a custom ruleset to perform
> the check off of the X-DCC header added by DCC. Unfortunately the
> actual header is of the form X-DCC-*-Metrics: where the "*" can be one
> of a myriad of server names (e.g. X-DCC-EATSERVER-Metrics:,
> X-DCC-NIET-Metrics:, etc.). As I can not know all the possible public
> DCC servers I can not just enumerate the servers by listing specific
> rules for each unique header. The best I could come up with is a generic
> header rule:
>
> header X_DCC_SCORE ALL =~ /^.*bulk Body=/s
> describe X_DCC_SCORE DCC bulk score indicates spam
> score X_DCC_SCORE 4.0

Regex suggestion: the ^.* at the beginning is pointless. Why anchor the
match to the start of the text, only to allow any number of any character
immediately after it? /bulk Body=/s will have the exact same matches, and
do it faster with less memory.

For optimization's sake, perhaps this would be better:

  header X_DCC_SCORE ALL =~ /^X-DCC-.{1,64}-Metrics:.{1,100}bulk Body=/ms

Note the /m modifier: without it, ^ anchors only at the very start of the
whole header block rather than at the start of each header line, so the
rule would fire only if the X-DCC header happened to come first. With /m
you at least get a "fast out" on most header lines, because a match has
to begin with X-DCC as the first 5 characters of a line.

Note that when you use ALL, the actual header names are in the text, so
you can match on the header name itself. Basically, ALL dictates that you
get the whole header block, complete with header names and everything.
Any other header name will only give you the text of that header, sans
header name. After all, it would be redundant to do something like
"Received =~ /^Received:/i".
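
If you want to convince yourself how these patterns behave before putting
one in a rules file, here is a rough sketch in Python. Its re.M and re.S
flags behave like Perl's /m and /s for this purpose. The header blocks are
made-up samples, not real DCC output, and the variable names are purely
illustrative. Treat it as a quick sanity check, not as a description of how
SpamAssassin evaluates rules internally.

```python
import re

# Made-up header blocks for illustration only; real DCC headers will differ.
BULK = (
    "Received: from mail.example.com by mx.example.org\n"
    "X-DCC-EATSERVER-Metrics: mx.example.org 1166; bulk Body=many Fuz1=many\n"
    "Subject: hello\n"
)
CLEAN = (
    "Received: from mail.example.com by mx.example.org\n"
    "X-DCC-EATSERVER-Metrics: mx.example.org 1166; Body=1 Fuz1=1\n"
    "Subject: hello\n"
)

ANCHORED = r"^X-DCC-.{1,64}-Metrics:.{1,100}bulk Body="

PATTERNS = {
    # The original rule: ^ is the start of the block, .* may cross newlines.
    "original": re.compile(r"^.*bulk Body=", re.S),
    # Same matches with no leading ^.* to scan and backtrack over.
    "simplified": re.compile(r"bulk Body="),
    # Anchored per line: re.M lets ^ match at the start of every line.
    "anchored_m": re.compile(ANCHORED, re.M | re.S),
    # Without re.M, ^ only matches at the very start of the block.
    "anchored_no_m": re.compile(ANCHORED, re.S),
}


def check(block):
    """Return {pattern name: True/False} for one header block."""
    return {name: bool(rx.search(block)) for name, rx in PATTERNS.items()}


bulk_results = check(BULK)
clean_results = check(CLEAN)

if __name__ == "__main__":
    print("bulk: ", bulk_results)
    print("clean:", clean_results)
```

On the bulk sample, the original, simplified, and per-line anchored
patterns all match, while the anchored pattern without multi-line mode
does not, because the block starts with Received: rather than X-DCC. None
of the patterns match the clean sample.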