D.J. wrote:
> On 8/24/06, Bowie Bailey <[EMAIL PROTECTED]> wrote:
> > D.J. wrote:
> > > On 8/24/06, Bart Schaefer <[EMAIL PROTECTED]> wrote:
> > > > On 8/24/06, D. J. <[EMAIL PROTECTED] > wrote:
> > > > >
> > > > > I'm expecting these type of strings for sure:
> > > > >
> > > > > cat
> > > > > dog
> > > > > cat dog
> > > > > dog cat
> > > > >
> > > > > But I may get something like this too:
> > > > >
> > > > > cat cat dog
> > > > > dog dog
> > > > >
> > > > > Essentially I want it to match if anything other than cat or
> > > > > dog is in the string.
> > > >
> > > > That constraint means you have to construct a regex that can be
> > > > anchored at both beginning and end of string, e.g.
> > > > /\A(\s*(cat|dog)\s*)+\Z/. I'm not sure that ever makes sense in
> > > > the context of a spamassassin rule, except maybe one matching
> > > > against a specific header.
> > >
> > > That's the idea... I've got the RELAY_COUNTRIES plugin that I want
> > > it to place a small score if the relay server is not in the US or
> > > Canada. However, I'm not sure if the plugin will list the same
> > > country multiple times, which is where my uncertainty in the "cat
> > > cat dog" scenario came in. So far my original rule ( !~ /cat|dog/)
> > > seems to be working well, but if I have a spammer smart enough to
> > > manage to bounce his spam originating in China off of somewhere in
> > > the US before it hits my MX, then that rule will fail. Am I
> > > possibly too paranoid?
> >
> > Ok. Try this one:
> >
> > $value =~ /\b(?!cat\b|dog\b)\w+\b/i
> >
> > This will match any word in the string as long as that word is not
> > "cat" or "dog".
>
> OK, we're actually really close. That actually matched everything I
> didn't want to match... we just have to get it to do the opposite of
> that. I have 6 test strings I tested against in a test script:
>
> cat
> dog
> cat dog
> dog cat
> bird
> cat bird
>
> It matched the top four (incorrectly).

Are you sure you used it correctly? This is a positive match (=~), not
a negative match (!~).

Test program:

    @strings = ( "cat", "dog", "cat dog", "dog cat", "bird", "cat bird",
                 "caterwaul" );

    for $str (@strings) {
        if ($str =~ /\b(?!cat\b|dog\b)\w+\b/i) {
            print "$str -- MATCHED\n";
        }
        else {
            print "$str -- no match\n";
        }
    }

Output:

    cat -- no match
    dog -- no match
    cat dog -- no match
    dog cat -- no match
    bird -- MATCHED
    cat bird -- MATCHED
    caterwaul -- MATCHED

--
Bowie
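
One possible explanation for the reversed result, sketched under the
assumption that the earlier test script reused the !~ operator from the
original rule: with !~ the same pattern is true exactly when every word
is "cat" or "dog", which would flag the top four strings and skip the
last two. The helper name has_other_word is hypothetical, used only for
illustration, and is not part of SpamAssassin.

```perl
use strict;
use warnings;

# Pattern from the message above: succeeds when at least one word
# in the string is neither "cat" nor "dog" (case-insensitive).
my $other_word = qr/\b(?!cat\b|dog\b)\w+\b/i;

# Hypothetical helper (an assumption for this sketch): returns 1 when
# the string contains a word other than "cat" or "dog", 0 otherwise.
sub has_other_word {
    my ($str) = @_;
    return ($str =~ $other_word) ? 1 : 0;
}

# Show both operators side by side for the six strings from the thread.
for my $str ("cat", "dog", "cat dog", "dog cat", "bird", "cat bird") {
    my $positive = ($str =~ $other_word) ? "true" : "false";
    my $negative = ($str !~ $other_word) ? "true" : "false";
    printf "%-8s  =~ %-5s  !~ %s\n", $str, $positive, $negative;
}
```

For the first four strings the =~ column is false and the !~ column is
true; for "bird" and "cat bird" it is the other way around. If the rule
is meant to fire on a foreign relay, the =~ form is the one to use.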