Angel Faus:
# Since your ops are much more complete and better documented than
# the ones I sent, I was trying to adapt my previous regex compiler
# to your ops, but I found what I think might be a limitation of
# your model.
#
# It looks to me that compiling regexps down to the usual opcodes
# needs a generic backtrack, instead of a $backtrack label for
# each case.
#
# I have been unable to express nested groups or alternation with
# your model, and I would say that this is because the engine needs
# some way to save not only the index into the string, but also the
# point in the regex where it can branch on a backtrack.

I've been a bit worried that this might be the case. The best solution
I've been able to think of is to push a "mark" onto the stack, like
what Perl 5 does with its call stack to indicate the end of the current
function's arguments. If a call to rePopindex popped a mark, it would be
considered to have failed, so it would branch to $1 if it had a
parameter.

# You solve this in your examples by having a "$backtrack" address
# for each case, but this looks to me like a bad solution. In
# particular, I would say that it cannot be applied to complex
# regular expressions.
#
# In my previous experimental patch, there was a way to save the
# string index _plus_ the "regex index". Written in your syntax, that
# would mean being able to add a parameter to rePushindex that saves
# the "regex index".
#
# Your example:
#
# RE:
#         reFlags ""
#         reMinlength 4
# $advance:
#         rePopindex
#         reAdvance $fail
# $start:
#         rePushindex
#         reLiteral "f", $advance
# $findo:
#         reLiteral "o", $findbar
#         rePushindex
#         branch $findo
# $findbar:
#         reLiteral "bar", $backtrack
#         set I0, 1 #true
#         reFinished
# $backtrack:
#         rePopindex $advance
#         branch $findbar   <<<<<<< backtrack needs to know where to branch
# $fail:
#         set I0, 0 #false
#         reFinished
#
# Your example tweaked by me:
#
# RE:
#         reFlags ""
#         reOnFail $fail
#         reMinlength 4
# $start:
#         rePushindex $advance
#         reLiteral "f"
# $findo:
#         rePushindex $findbar
#         reLiteral "o"
#         branch $findo
# $findbar:
#         reLiteral "bar"
#         set I0, 1 #true
#         reFinished
# $fail:
#         set I0, 0 #false
#         reFinished
# $advance:
#         reAdvance
#         branch $start
#
# So it is not the reLiteral, reAdvance, etc. ops that need to know
# where they have to branch on failing; when failing they always:
#
#  - pop the last index on the stack and then branch to the last
#    saved destination,
#  - or branch to the address previously set by the reOnFail op if
#    there are no pending indexes.
#
# There is no $backtrack label, but the backtracking action is
# called each time a submatch fails.
#
# I am not sure that this is the only solution, but it is the one
# that came to my mind on seeing your proposal, and I find it quite
# elegant.

Actually, on further examination that model does appear quite elegant.
It also has its problems:

        /a(?:foo)?b/

With your model:

RE:
        reOnFail $fail
        reFlags ""
        reMinlength 2
$start:
        rePushindex $advance
        reLiteral "a"
        rePushindex $continue
        reLiteral "foo"
$continue:
        reLiteral "b"           #and what if this fails?
        set I0, 1
        reFinished
$advance:
        reAdvance
        branch $start
$fail:
        set I0, 0
        reFinished

Mine:

RE:
        reFlags ""
        reMinlength 2
$start:
        rePushindex
        reLiteral "a", $advance
        reLiteral "foo", $continue      #i may implement zero-argument versions of this
$continue:
        reLiteral "b", $advance
        set I0, 1
        reFinished
$advance:
        rePopindex
        reAdvance
        branch $start
$fail:
        set I0, 0
        reFinished

Hmm, I expected to see it be much shorter. Perhaps your idea has even
more merit than I thought...

# It is quite possible that nested groups and alternation can be
# implemented with your model. If that is the case, could you please
# post an example so I can understand?
#
# What do you think about it?

I think the mark solution may be more flexible:

RE:
        reFlags ""
        reMinlength 4
$advance:
        rePopindex
        reAdvance $fail
$start:
        rePushindex
        reLiteral "f", $advance
        rePushmark
$findo:
        reLiteral "o", $findbar
        rePushindex
        branch $findo
$findbar:
        reLiteral "bar", $backtrack
        set I0, 1 #true
        reFinished
$backtrack:
        rePopindex $advance
        branch $findbar
$fail:
        set I0, 0 #false
        reFinished

However, this may not be a good example, as I'm seriously looking at the
possibility of making reAdvance independent of the stack
(cur_re->startindex or something) to ease implementation of reSubst
(substitution) and related nonsense.
Here's a better example:

#/xa*.b*[xb]/
        branch $start
$advance:
        reAdvance $fail         #no longer using stack in this example
$start:
        reLiteral "x", $advance
$finda:
        reLiteral "a", $findany
        rePushindex
        branch $finda
$findany:
        reAnything $backa
        rePushmark
$findb:
        reLiteral "b", $findxb
        rePushindex
        branch $findb
$findxb:
        reOneof "xb", $backb
        set I0, 1
        reFinished
$backb:
        rePopindex $backa
        branch $findxb
$backa:
        rePopindex $advance
        branch $findany
$fail:
        set I0, 0
        reFinished

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

When I take action, I’m not going to fire a $2 million missile at a $10
empty tent and hit a camel in the butt.
    --Dubya
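
For anyone who wants to poke at the "generic backtrack" rule from Angel's tweaked example (on failure, pop the last saved index and branch to its saved destination, or go to the reOnFail address when nothing is pending), here is a rough Python sketch. It is not Parrot code and not either author's implementation: the tuple encoding, the op names ("onfail", "push", "literal", "advance", "branch", "finish") and the run() helper are all made up for illustration, and reFlags/reMinlength are simply left out. The rule that advancing past the end of the string counts as a failure is also an assumption.

```python
def run(program, text):
    """Interpret a toy regex program; return True on match, False otherwise.

    `program` is a list mixing label strings ("$start") and instruction
    tuples. This encoding is hypothetical, not the real Parrot op format.
    """
    labels, code = {}, []
    for item in program:
        if isinstance(item, str):
            labels[item] = len(code)   # label points at the next instruction
        else:
            code.append(item)

    pc, pos = 0, 0
    stack = []       # pending (string index, label to resume at) pairs
    on_fail = None   # label set by "onfail", used when the stack is empty

    while True:
        op, *args = code[pc]
        pc += 1
        ok = True
        if op == "onfail":
            on_fail = args[0]
        elif op == "push":             # save string index plus "regex index"
            stack.append((pos, args[0]))
        elif op == "literal":
            if text.startswith(args[0], pos):
                pos += len(args[0])
            else:
                ok = False
        elif op == "advance":          # assumed to fail past end of string
            pos += 1
            ok = pos <= len(text)
        elif op == "branch":
            pc = labels[args[0]]
        elif op == "finish":
            return args[0]

        if not ok:                     # the single, generic backtrack action
            if stack:
                pos, target = stack.pop()
                pc = labels[target]
            else:
                pc = labels[on_fail]


# /fo*bar/, following the "tweaked" listing
FO_STAR_BAR = [
    ("onfail", "$fail"),
    "$start", ("push", "$advance"), ("literal", "f"),
    "$findo", ("push", "$findbar"), ("literal", "o"), ("branch", "$findo"),
    "$findbar", ("literal", "bar"), ("finish", True),
    "$fail", ("finish", False),
    "$advance", ("advance",), ("branch", "$start"),
]

# /a(?:foo)?b/, following the "With your model" listing
A_OPT_FOO_B = [
    ("onfail", "$fail"),
    "$start", ("push", "$advance"), ("literal", "a"),
    ("push", "$continue"), ("literal", "foo"),
    "$continue", ("literal", "b"), ("finish", True),
    "$advance", ("advance",), ("branch", "$start"),
    "$fail", ("finish", False),
]

if __name__ == "__main__":
    for subject in ("xxfoobar", "fbar", "foobaz"):
        print(subject, run(FO_STAR_BAR, subject))
    for subject in ("afoob", "ab", "afooab", "afoo"):
        print(subject, run(A_OPT_FOO_B, subject))
```

In this sketch, at least, the "and what if this fails?" case takes care of itself: when reLiteral "b" fails, whatever entry is still pending gets popped, either the $continue entry (retrying "b" without the "foo") or the $advance entry (moving the start position forward). Whether the real ops should behave this way is a separate question.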