On Feb 24, 10:15 am, [EMAIL PROTECTED] wrote: > On Feb 21, 10:34 am, [EMAIL PROTECTED] wrote: > > > On Feb 20, 6:14 pm, Pop User <[EMAIL PROTECTED]> wrote: > > >http://swtch.com/~rsc/regexp/regexp1.html > > Going back a bit on a tangent, the author of this citation states that > any regex can be expressed as a DFA machine. However, while > investigating this more I appear to have found one example of a regex > which breaks this assumption. > > "ab+c|abd" > > Am I correct?
No. > Can you think of a deterministic method of computing > this expression? Firstly rewrite a bit: ab+c|abd a(b+c|bd) a(bb*c|bd) ab(b*c|d) ab(b+c|c|d) Here's a DFA that recognises that: State 0: a -> 1 State 1: b -> 2 State 2: b -> 3 # start of the b+c branch c -> 4 # the c branch d -> 4 # the d branch State 3: b -> 3 c -> 4 State 4: accepting state > It would be easier with a NFA machine, What is "It"? > but given that > the Python method of computing RE's involves pre-compiling a re > object, AFAIK all serious regex engines precompile a regex into an internal representation which can be saved for reuse. > optimizing the matching engine What does that mean? > would make the most sense to > me. Compared to what alternatives? > > Here's what I have so far: [big snip] > >>> a = compile("ab+c|abd") [snip] > >>> a.match("abbd") > Match! Bzzzt. Neither "ab+c" nor "abd" match a prefix of "abbd". HTH, John -- http://mail.python.org/mailman/listinfo/python-list