On 4/18/06, Jenda Krynicky <[EMAIL PROTECTED]> wrote:
> From: "Jay Savage" <[EMAIL PROTECTED]>
> > On 4/14/06, JupiterHost.Net <[EMAIL PROTECTED]> wrote:
> > >
> > > Timothy Johnson wrote:
> > > > Will the string always have the two quotes in the same place and
> > > > only one per string? What about something like this?
> > > >
> > > > /.*?\{([^\}]*)\}(?=.*")/gi
> > >
> > > I tested it out and it appears to be perfect! Thanks, Mr. Johnson :)
> > >
> > > I love when I learn a new tidbit!
> >
> > A couple of things here.
> >
> > * You're not matching anything alphabetic here, so the "i" modifier is
> >   superfluous.
> > * Using "^" in a class to limit a search is usually less
> >   efficient than doing a "non-greedy" search.
>
> Is it?
>
> I think it depends on the regexp and the data, but I'd expect the
> exact opposite.
>
> In this particular case you seem to be right, but if for example you
> wanted to skip the last character before the } the results are
> reversed.
>
[snip]
>
> Benchmark: timing 10000 iterations of withGroup, withNonGreedy...
>  withGroup:  1 wallclock secs ( 0.63 usr +  0.00 sys =  0.63 CPU) @ 16000.00/s (n=10000)
>  withNonGreedy:  0 wallclock secs ( 0.55 usr +  0.00 sys =  0.55 CPU) @ 18281.54/s (n=10000)
>
> and with the added . before the \}
>
> Benchmark: timing 10000 iterations of withGroup, withNonGreedy...
>  withGroup:  1 wallclock secs ( 0.77 usr +  0.00 sys =  0.77 CPU) @ 13054.83/s (n=10000)
>  withNonGreedy:  1 wallclock secs ( 1.03 usr +  0.00 sys =  1.03 CPU) @ 9699.32/s (n=10000)
>
> And actually if I add a few {groups} at the end of the script with no
> " following them the group becomes quicker as well.
>
> So, I'd say, if you don't care about the speed, use whichever makes
> most sense. Otherwise benchmark.
>

Hence "usually". You might also try using a greedy quantifier and a
negative look-ahead; you've designed a regex that by definition must
backtrack, which of course defeats some of the purpose of the '?'.

Of course it always depends on the data, and the longer, more random
and more complex the data, the more it depends. "Benchmark" is always
good advice. But as a general rule of thumb, classes are expensive.

Also, since this is a beginner list, let's be clear about the
terminology: groups (capturing and non-capturing) are formed with
parentheses, and are used for logically separating chunks of regex.
Square brackets ('[]') are used for forming character classes.

-- j
--------------------------------------------------
This email and attachment(s): [ ] blogable; [ x ] ask first; [ ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.dpguru.com  http://www.engatiki.org
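
For anyone who wants to repeat the comparison, here is a minimal sketch of how the two styles could be timed with the Benchmark module. The original test script was snipped, so the sample string, the variable names, the captures() helper and the non-greedy pattern are all assumptions of mine, not the code that produced the numbers above. The class pattern is the one from this thread minus the "i" modifier.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical sample data; the string used in the thread was not posted.
my $text = 'say "{alpha} and {beta}" then {gamma}';

# Negated character class: [^\}]* can never run past the closing brace.
my $with_class = qr/.*?\{([^\}]*)\}(?=.*")/;

# Non-greedy quantifier (assumed variant): .*? grows one character at a
# time until \} can match.
my $with_nongreedy = qr/.*?\{(.*?)\}(?=.*")/;

# Hypothetical helper: return every capture found in the string.
sub captures {
    my ($re, $str) = @_;
    my @found = $str =~ /$re/g;   # list context with /g returns all captures
    return @found;
}

my @by_class     = captures($with_class, $text);
my @by_nongreedy = captures($with_nongreedy, $text);
print "class:      @by_class\n";
print "non-greedy: @by_nongreedy\n";

# Small iteration count so the sketch finishes quickly; raise it for
# timings you intend to trust.
timethese(10_000, {
    withClass     => sub { my @m = captures($with_class, $text) },
    withNonGreedy => sub { my @m = captures($with_nongreedy, $text) },
});
```

On this sample both patterns should capture the same two strings, and {gamma} is skipped because no double quote follows it. Which one is faster will depend on your own data, so swap in a string that looks like your real input before drawing conclusions.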