Thanks Andrew. I was also of the same view that perl handled this via some special cases.
On Wed, Apr 1, 2009 at 8:32 PM, andrew cooke <and...@acooke.org> wrote: > > more exactly, my guess is perl has a special case for this that avoids > doing a search over all possible matchers via the pushdown stack. > > andrew cooke wrote: > > > > ".*?" is a "not greedy" match, which is significantly more difficult to > > handle than a normal ".*". so the performance will depend on quite > > complex details of how the regular expression engine is implemented. it > > wouldn't surprise me if perl was better here, because it comes from a > > background with a much stronger emphasis on regular expressions. > > > > i know that not an exact answer, but unless you find the author of the re > > library i am not sure you will get a much better explanation. it comes > > down to whether the regular expression can be implemented using a > > deterministic finite automaton (rather than an indeterministic one). if > > you look at the table at the bottom of > > http://en.wikipedia.org/wiki/Finite_state_machine then i believe (i am > not > > 100% sure) that the ".*?" match requires at least a pushdown automota, > > while ".*" can be handled with a simple finite automaton. > > > > disclaimer: this is all fairly new to me as i just recently implemented a > > regular expression matcher myself, and i may be wrong on some of the > > details. > > > > andrew > > > > > > akshat agarwal wrote: > >> Hi, > >> > >> I am trying to use the following snippet of code to print a regex match. > >> > >> s = '01234567890123456789x012' > >> > >> pat = r'(.*?x|[^a]+)*y' > >> > >> mo = re.search(pat, s) > >> if mo is not None: > >> print mo.group(0) > >> > >> By adding one character before the 'x' in the input string, the time > >> taken > >> to print the match doubles. This behaviour is not observed in perl. I am > >> curious to know about the difference the in regex implementations of > >> perl > >> and python which causes this. > >> > >> Thanks > >> -- > >> http://mail.python.org/mailman/listinfo/python-list > >> > > > > > > >
-- http://mail.python.org/mailman/listinfo/python-list