Brent Dax : > Okay, this bunch of ops is a serious attempt at regular expressions. I > had a discussion with japhy on this in the Monastery > (http://www.perlmonks.org/index.pl?node_id=122784), and I've come up > with something flexible enough to actually (maybe) work. Attached is a > patch to modify core.ops and add re.h (defines data structures and such) > and t/op/re.t (six tests). All tests, including the new ones, pass.
Hi Brent, Since your ops are much complete and better documented that the ones I sent, I was trying to adapt my previous regex compiler to your ops, but I found what i think might be a limitation of your model. It looks to me that for compiling down regexp to usual opcodes there is the need of having a generic backtrack, insted of a $backtrack label for each case. I have been uncapable of expressing nested groups or alternation with your model, and I would say that this is because the engine needs some way to save not only the index into the string, but also the point of the regex where it can branch on a backtack. You solve this in your examples, by having a "$bactrack" address for each case, but this looks to me as a bad solution. In particular, i would say that cannot be aplied for complex regular expressions. In my previous experimental patch, there was a way to save the string index _plus_ the "regex index". Writing this with your syntax, it would mean to be able to add a parametrer in rePushindex that saves the "regex index". Your example: RE: reFlags "" reMinlength 4 $advance: rePopindex reAdvance $fail $start: rePushindex reLiteral "f", $advance $findo: literal "o", $findbar rePushindex branch $findo $findbar: reLiteral "bar", $backtrack set I0, 1 #true reFinished $backtrack: rePopindex $advance branch $findbar <<<<<<< backtrack needs to know where to branch $fail: set I0, 0 #false reFinished Your example tweaked by me: RE: reFlags "" reOnFail $fail reMinlength 4 $start: rePushindex $advance reLiteral "f" $findo: rePushindex $findbar literal "o" branch $findo $findbar: reLiteral "bar" set I0, 1 #true reFinished $fail: set I0, 0 #false reFinished $advance: reAdvance branch $start So it is not the reLiteral, reAdvance, etc.. ops that need to know were they have to branch on failing, but when failing they always: -pop the last index on the stack and then branch to the last saved destination. -or branch to the address previously set in reOnFail op if there are no pending indexes. There is no $bactrack label, but the backtracking action is called each time a submatch fails. I am not sure that this is the only solution, but is the one that come to my mind mind seeing your proposal and I find it quite elegant. It is quite possible that nested groups and alternation can be implemented with your model. If that is the case, ¿could you please post an example so I can understand?. What do you think about it? -angel