D.J. wrote:
> On 8/24/06, Bowie Bailey <[EMAIL PROTECTED]> wrote:
> > D.J. wrote:
> > > On 8/24/06, Bart Schaefer <[EMAIL PROTECTED]> wrote:
> > > > On 8/24/06, D. J. <[EMAIL PROTECTED] > wrote:
> > > > > 
> > > > > I'm expecting these type of strings for sure:
> > > > > 
> > > > > cat
> > > > > dog
> > > > > cat dog
> > > > > dog cat
> > > > > 
> > > > > But I may get something like this too:
> > > > > 
> > > > > cat cat dog
> > > > > dog dog
> > > > > 
> > > > > Essentially I want it to match if anything other than cat or
> > > > > dog is in the string.
> > > > 
> > > > That constraint means you have to construct a regex that can be
> > > > anchored at both beginning and end of string, e.g.
> > > > /\A(\s*(cat|dog)\s*)+\Z/.  I'm not sure that ever makes sense in
> > > > the context of a spamassassin rule, except maybe one matching
> > > > against a specific header.
> > > 
> > > That's the idea... I've got the RELAY_COUNTRIES plugin that I want
> > > it to place a small score if the relay server is not in the US or
> > > Canada.  However, I'm not sure if the plugin will list the same
> > > country multiple times, which is where my uncertainty in the "cat
> > > cat dog" scenario came in.  So far my original rule ( !~ /cat|dog/)
> > > seems to be working well, but if I have a spammer smart enough to
> > > manage to bounce his spam originating in China off of somewhere in
> > > the US before it hits my MX, then that rule will fail.  Am I
> > > possibly too paranoid?
> > 
> > Ok.  Try this one:
> > 
> >    $value =~ /\b(?!cat\b|dog\b)\w+\b/i
> > 
> > This will match any word in the string as long as that word is not
> > "cat" or "dog".
> 
> OK, we're actually really close.  That actually matched everything I
> didn't want to match... we just have to get it to do the opposite of
> that.  I have 6 test strings I tested against in a test script:  
> 
> cat
> dog
> cat dog
> dog cat
> bird
> cat bird
> 
> It matched the top four (incorrectly).

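Are you sure you used it correctly?  The pattern is meant for a positive
match (=~), not a negative match (!~).  With !~ the result is inverted, and
you would see exactly the four "wrong" matches you describe.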

Test program:
    use strict;
    use warnings;

    my @strings = ( "cat", "dog", "cat dog", "dog cat", "bird",
                    "cat bird", "caterwaul" );
    for my $str (@strings) {
        # Positive match: true if any word is something other than cat/dog.
        if ($str =~ /\b(?!cat\b|dog\b)\w+\b/i) {
            print "$str -- MATCHED\n";
        }
        else {
            print "$str -- no match\n";
        }
    }

Output:
    cat -- no match
    dog -- no match
    cat dog -- no match
    dog cat -- no match
    bird -- MATCHED
    cat bird -- MATCHED
    caterwaul -- MATCHED
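
If it helps to see the two operators side by side, here is a minimal sketch
that runs the same pattern under both =~ and !~ against your six test
strings.  The helper names has_other_word and only_cat_dog are made up for
this illustration; they are not part of SpamAssassin or any plugin.

```perl
use strict;
use warnings;

# Same pattern as in the test program above.
my $re = qr/\b(?!cat\b|dog\b)\w+\b/i;

# Hypothetical helper: true when the string contains at least one word
# other than "cat" or "dog" (the positive match, =~).
sub has_other_word {
    my ($str) = @_;
    return $str =~ $re ? 1 : 0;
}

# Hypothetical helper: true when the pattern finds nothing, i.e. every
# word is "cat" or "dog" (the negative match, !~).
sub only_cat_dog {
    my ($str) = @_;
    return $str !~ $re ? 1 : 0;
}

for my $str ("cat", "dog", "cat dog", "dog cat", "bird", "cat bird") {
    printf "%-8s  =~ %-8s  !~ %s\n", $str,
        has_other_word($str) ? "MATCHED" : "no match",
        only_cat_dog($str)   ? "MATCHED" : "no match";
}
```

The !~ column should come out true for the first four strings only, which
is what you reported seeing.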

-- 
Bowie
