Re: Calling Regex Experts

D. J. Thu, 24 Aug 2006 12:12:55 -0700

On 8/24/06, Bowie Bailey <[EMAIL PROTECTED]> wrote:

D.J. wrote:
> On 8/24/06, Bart Schaefer <[EMAIL PROTECTED]> wrote:
> > On 8/24/06, D. J. <[EMAIL PROTECTED] > wrote:
> > >
> > > I'm expecting these type of strings for sure:
> > >
> > > cat
> > > dog
> > > cat dog
> > > dog cat
> > >
> > > But I may get something like this too:
> > >
> > > cat cat dog
> > > dog dog
> > >
> > > Essentially I want it to match if anything other than cat or dog is
> > > in the string.
> >
> > That constraint means you have to construct a regex that can be
> > anchored at both beginning and end of string, e.g.
> > /\A(\s*(cat|dog)\s*)+\Z/.  I'm not sure that ever makes sense in the
> > context of a spamassassin rule, except maybe one matching against a
> > specific header.
>
> That's the idea... I've got the RELAY_COUNTRIES plugin that I want it
> to place a small score if the relay server is not in the US or
> Canada.  However, I'm not sure if the plugin will list the same
> country multiple times, which is where my uncertainty in the "cat cat
> dog" scenario came in.  So far my original rule ( !~ /cat|dog/) seems
> to be working well, but if I have a spammer smart enough to manage to
> bounce his spam originating in China off of somewhere in the US
> before it hits my MX, then that rule will fail.  Am I possibly too
> paranoid?

Ok.  Try this one:

   $value =~ /\b(?!cat\b|dog\b)\w+\b/i

This will match any word in the string as long as that word is not
"cat" or "dog".
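
A minimal sketch of how the negative lookahead behaves, written with Python's `re` module rather than Perl purely for illustration. The pattern is the one above, unchanged; the helper name `foreign_words` is made up for this example.

```python
import re

# Same pattern as above: a word boundary, then a lookahead that rejects
# the whole words "cat" and "dog", then the word itself.
NOT_CAT_OR_DOG = re.compile(r"\b(?!cat\b|dog\b)\w+\b", re.IGNORECASE)

def foreign_words(value):
    """Return every word in value that is not exactly 'cat' or 'dog'."""
    return NOT_CAT_OR_DOG.findall(value)

print(foreign_words("cat cat dog"))   # []
print(foreign_words("cat bird"))      # ['bird']
print(foreign_words("catalog dog"))   # ['catalog']
```

The `\b` inside the lookahead is what keeps a longer word such as "catalog" from being treated as "cat"; repeated words ("cat cat dog") make no difference because each word is checked on its own.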

--
Bowie

OK, we're really close. That matched everything I didn't want to match, so we just need to get it to do the opposite. I ran it against six test strings in a test script:

cat
dog
cat dog
dog cat
bird
cat bird

It matched the top four (incorrectly).
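
For comparison, here is a small sketch that runs the same pattern over the same six strings using Python's `re` module (not Perl, so treat it as an illustration only). The names `PATTERN`, `TEST_STRINGS`, and `classify` are made up for this example.

```python
import re

# The pattern from the message above, unchanged.
PATTERN = re.compile(r"\b(?!cat\b|dog\b)\w+\b", re.IGNORECASE)

# The six test strings listed above.
TEST_STRINGS = ["cat", "dog", "cat dog", "dog cat", "bird", "cat bird"]

def classify(strings):
    """Map each string to True if the pattern finds a word other than cat/dog."""
    return {s: PATTERN.search(s) is not None for s in strings}

for text, matched in classify(TEST_STRINGS).items():
    print(f"{text!r:12} {'match' if matched else 'no match'}")
```

Under Python's `re`, this reports a match only for "bird" and "cat bird". If a test script reports the reverse, one thing worth checking (an assumption, since the script is not shown here) is whether the test negates the result, for example by using `!~` where `=~` was intended.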
