Re: regex matching conditionally

Jay Savage Tue, 18 Apr 2006 17:24:26 -0700

On 4/18/06, Jenda Krynicky <[EMAIL PROTECTED]> wrote:
> From: "Jay Savage" <[EMAIL PROTECTED]>
> > On 4/14/06, JupiterHost.Net <[EMAIL PROTECTED]> wrote:
> > >
> > >
> > > Timothy Johnson wrote:
> > > > Will the string always have the two quotes in the same place and
> > > > only one per string?  What about something like this?
> > > >
> > > > /.*?\{([^\}]*)\}(?=.*")/gi
> > >
> > > I tested it out and it appears to be perfect! Thanks, Mr. Johnson :)
> > >
> > > I love when I learn a new tidbit!
> >
> > A couple of things here.
> >
> > * You're not matching anything alphabetic here, so the "i" modifier is
> > superfluous.
> > * Using "^" in a class to limit a search is usually less
> > efficient than doing a "non-greedy" search.
>
> Is it?
>
> I think it depends on the regexp and the data, but I'd expect the
> exact opposite.
>
> In this particular case you seem to be right, but if for example you
> wanted to skip the last character before the } the results are
> reversed.
>


[snip]

>
>
> Benchmark: timing 10000 iterations of withGroup, withNonGreedy...
>  withGroup:  1 wallclock secs ( 0.63 usr +  0.00 sys =  0.63 CPU) @
> 16000.00/s (n=10000)
> withNonGreedy:  0 wallclock secs ( 0.55 usr +  0.00 sys =  0.55 CPU)
> @ 18281.54/s (n=10000)
>
> and with the added . before the \}
>
> Benchmark: timing 10000 iterations of withGroup, withNonGreedy...
>  withGroup:  1 wallclock secs ( 0.77 usr +  0.00 sys =  0.77 CPU) @
> 13054.83/s (n=10000)
> withNonGreedy:  1 wallclock secs ( 1.03 usr +  0.00 sys =  1.03 CPU)
> @ 9699.32/s (n=10000)
>
> And actually if I add a few {groups} at the end of the script with no
> " following them the group becomes quicker as well.
>
> So, I'd say, if you don't care about the speed, use whichever makes
> most sense. Otherwise benchmark.
>

Hence "usually". You might also try using a greedy quantifier and a
negative look-ahead; you've designed a regex that by definition must
backtrack, which defeats some of the purpose of the '?'. It always
depends on the data, and the longer, more random and more complex the
data, the more it depends. "Benchmark" is always good advice.

But as a general rule of thumb, classes are expensive.
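
For anyone who wants to try this at home, here is a minimal sketch of
such a comparison using the core Benchmark module. The sample string
and the names with_class, with_non_greedy and with_lookahead are
made up for illustration (the original test script was snipped), and
the look-ahead variant is only one possible reading of "a greedy
quantifier and a negative look-ahead". No timings are shown because
they will differ with the data and the perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data, not taken from the snipped script.
my $data = 'foo {one} bar {two} "quoted" baz {three}';

my %re = (
    # negated character class between the braces
    with_class      => qr/\{([^}]*)\}(?=.*")/,
    # non-greedy quantifier between the braces
    with_non_greedy => qr/\{(.*?)\}(?=.*")/,
    # greedy quantifier guarded by a negative look-ahead (an assumption)
    with_lookahead  => qr/\{((?:(?!\}).)*)\}(?=.*")/,
);

# Return every captured {...} body that is followed by a double quote.
sub captures {
    my ($re) = @_;
    my @found = $data =~ /$re/g;
    return @found;
}

# Show that all variants agree before timing them.
print "$_: @{[ captures($re{$_}) ]}\n" for sort keys %re;

# Run each variant for at least one CPU second and print a comparison.
my %tests = map {
    my $r = $re{$_};
    ( $_ => sub { my @m = captures($r) } );
} keys %re;
cmpthese(-1, \%tests);
```

Swap in a sample of your real data for $data before drawing any
conclusions; the ranking can flip, as the numbers quoted above show.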

Also, since this is a beginner list, let's be clear about the
terminology: groups (capturing and non-capturing) are formed with
parentheses, and are used for logically separating chunks of regex.
Square brackets ('[]') are used for forming character classes.
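
A short illustration of the difference, as a sketch only; the sample
string and variable names are invented for the example:

```perl
use strict;
use warnings;

my $text = 'colour color';

# Parentheses form a capturing group: here it captures the optional "u".
my @groups = $text =~ /colo(u?)r/g;

# (?:...) is a non-capturing group: it separates a chunk of the regex
# (here an alternation) without saving what it matched.
my $count = () = $text =~ /(?:colou|colo)r/g;

# Square brackets form a character class: exactly one character
# from the listed set.
my @vowels = $text =~ /[aeiou]/g;

print "captured: ", join(',', map { "'$_'" } @groups), "\n";
print "words matched: $count\n";
print "vowels: @vowels\n";
```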

-- j
--------------------------------------------------
This email and attachment(s): [  ] blogable; [ x ] ask first; [  ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.dpguru.com  http://www.engatiki.org
