Angel Faus:
# Since your ops are much more complete and better documented than
# the ones I sent, I was trying to adapt my previous regex compiler
# to your ops, but I found what I think might be a limitation of
# your model.
#
# It looks to me that compiling regexps down to the usual opcodes
# requires a generic backtrack, instead of a $backtrack label for
# each case.
#
# I have been unable to express nested groups or alternation with
# your model, and I would say that this is because the engine needs
# some way to save not only the index into the string, but also the
# point of the regex where it can branch on a backtrack.

I've been a bit worried that this might be the case.  The best solution
I've been able to think of is to push a "mark" onto the stack, like what
Perl 5 does with its call stack to indicate the end of the current
function's arguments.  If a call to rePopindex popped a mark, it would
be considered to have failed, so it would branch to $1 if it had a
parameter.
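
To make the mark idea concrete, here is a rough Python sketch of the stack behaviour only, not of the actual ops. The names IndexStack, push_index, push_mark and pop_index are made up for illustration, and treating an empty stack the same way as a mark is my assumption.

```python
MARK = object()  # sentinel standing in for the proposed "mark" (hypothetical)


class IndexStack:
    """Stack of saved string indexes, with marks separating groups of them."""

    def __init__(self):
        self._items = []

    def push_index(self, index):
        # Roughly what rePushindex would do: remember a string position.
        self._items.append(index)

    def push_mark(self):
        # Roughly what rePushmark would do: fence off the entries below.
        self._items.append(MARK)

    def pop_index(self):
        """Return the saved index, or None when the pop is a failure.

        Popping a mark counts as failure, which is where the op would
        branch to its label. An empty stack is treated the same way
        (an assumption, the original does not say).
        """
        if not self._items:
            return None
        top = self._items.pop()
        return None if top is MARK else top
```

With indexes 3 and 5 pushed on either side of a mark, the first pop yields 5, the second hits the mark and fails, and the third yields 3.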

# You solve this in your examples by having a "$backtrack" address
# for each case, but this looks to me like a bad solution. In
# particular, I would say that it cannot be applied to complex
# regular expressions.
#
# In my previous experimental patch, there was a way to save the
# string index _plus_ the "regex index". Written with your syntax,
# it would mean being able to add a parameter to rePushindex that
# saves the "regex index".
#
# Your example:
#
# RE:
#         reFlags ""
#         reMinlength 4
# $advance:
#         rePopindex
#         reAdvance $fail
# $start:
#         rePushindex
#         reLiteral "f", $advance
# $findo:
#         reLiteral "o", $findbar
#         rePushindex
#         branch $findo
# $findbar:
#         reLiteral "bar", $backtrack
#         set I0, 1     #true
#         reFinished
# $backtrack:
#         rePopindex $advance
#         branch $findbar     <<<<<<< backtrack needs to know where to branch
# $fail:
#         set I0, 0     #false
#         reFinished
#
# Your example tweaked by me:
#
# RE:
#         reFlags ""
#         reOnFail $fail
#         reMinlength 4
# $start:
#         rePushindex $advance
#         reLiteral "f"
# $findo:
#         rePushindex $findbar
#         reLiteral "o"
#         branch $findo
# $findbar:
#         reLiteral "bar"
#         set I0, 1     #true
#         reFinished
# $fail:
#         set I0, 0     #false
#         reFinished
# $advance:
#         reAdvance
#         branch $start
#
# So it is not the reLiteral, reAdvance, etc. ops that need to know
# where they have to branch on failing; when failing they always:
#
#   - pop the last index on the stack and then branch to the last
#     saved destination,
#   - or branch to the address previously set in the reOnFail op if
#     there are no pending indexes.
#
# There is no $backtrack label, but the backtracking action is
# called each time a submatch fails.
#
# I am not sure that this is the only solution, but it is the one
# that came to my mind on seeing your proposal, and I find it quite
# elegant.

Actually, on further examination that model does appear quite elegant.
It also has its problems:

        /a(?:foo)?b/

With your model:

RE:
        reOnFail $fail
        reFlags ""
        reMinlength 2
$start:
        rePushindex $advance
        reLiteral "a"
        rePushindex $continue
        reLiteral "foo"
$continue:
        reLiteral "b"   #and what if this fails?
        set I0, 1
        reFinished
$advance:
        reAdvance
        branch $start
$fail:
        set I0, 0
        reFinished


Mine:

RE:
        reFlags ""
        reMinlength 2
$start:
        rePushindex
        reLiteral "a", $advance
        reLiteral "foo", $continue      #I may implement zero-argument versions of this
$continue:
        reLiteral "b", $advance
        set I0, 1
        reFinished
$advance:
        rePopindex
        reAdvance $fail
        branch $start
$fail:
        set I0, 0
        reFinished

Hmm, I expected to see it be much shorter.  Perhaps your idea has even
more merit than I thought...
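
As a sanity check on the quoted failure rule (pop the last saved index and branch to its saved destination, or go to the reOnFail address when nothing is pending), here is a small Python sketch that runs the /a(?:foo)?b/ program from your model under that rule. The op names, the tuple encoding and the run function are hypothetical stand-ins; reFlags and reMinlength are left out, and reAdvance is assumed to fail once it steps past the end of the string.

```python
def run(program, labels, text):
    """Run a tiny op list where every failure triggers one generic backtrack."""
    stack = []      # pending (string index, program counter) pairs
    pos = 0         # current index into text
    pc = 0          # index of the next op to execute
    on_fail = None  # address set by the "onfail" op

    while True:
        op, arg = program[pc]
        pc += 1
        failed = False
        if op == "onfail":
            on_fail = labels[arg]
        elif op == "push":
            # Save the string index together with the "regex index".
            stack.append((pos, labels[arg]))
        elif op == "literal":
            if text.startswith(arg, pos):
                pos += len(arg)
            else:
                failed = True
        elif op == "advance":
            # Only reached through a pop, so pos is the start of the
            # attempt that just failed; move one character on.
            pos += 1
            failed = pos > len(text)
        elif op == "branch":
            pc = labels[arg]
        elif op == "finish":
            return arg
        if failed:
            if stack:
                pos, pc = stack.pop()   # generic backtrack
            else:
                pc = on_fail            # nothing pending


# /a(?:foo)?b/ written in the tweaked model (encoding is illustrative)
PROGRAM = [
    ("onfail", "fail"),      # 0
    ("push", "advance"),     # 1  $start
    ("literal", "a"),        # 2
    ("push", "continue"),    # 3
    ("literal", "foo"),      # 4
    ("literal", "b"),        # 5  $continue
    ("finish", True),        # 6
    ("advance", None),       # 7  $advance
    ("branch", "start"),     # 8
    ("finish", False),       # 9  $fail
]
LABELS = {"start": 1, "continue": 5, "advance": 7, "fail": 9}
```

Under these assumptions, a failing "b" after a successful "foo" pops the $continue entry and retries "b" right after the "a"; if that fails too, the $advance entry is popped and the match restarts one character later.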

# It is quite possible that nested groups and alternation can be
# implemented with your model. If that is the case, could you please
# post an example so I can understand?
#
# What do you think about it?

I think the mark solution may be more flexible:

RE:
    reFlags ""
    reMinlength 4
$advance:
    rePopindex
    reAdvance $fail
$start:
    rePushindex
    reLiteral "f", $advance
    rePushmark
$findo:
    reLiteral "o", $findbar
    rePushindex
    branch $findo
$findbar:
    reLiteral "bar", $backtrack
    set I0, 1  #true
    reFinished
$backtrack:
    rePopindex $advance
    branch $findbar
$fail:
    set I0, 0  #false
    reFinished

However, this may not be a good example, as I'm seriously looking at the
possibility of making reAdvance independent of the stack
(cur_re->startindex or something) to ease implementation of reSubst
(substitution) and related nonsense.  Here's a better example:

        #/xa*.b*[xb]/
        branch $start
$advance:
        reAdvance $fail #no longer using stack in this example
$start:
        reLiteral "x", $advance
$finda:
        reLiteral "a", $findany
        rePushindex
        branch $finda
$findany:
        reAnything $backa
        rePushmark
$findb:
        reLiteral "b", $findxb
        rePushindex
        branch $findb
$findxb:
        reOneof "xb", $backb
        set I0, 1
        reFinished
$backb:
        rePopindex $backa
        branch $findxb
$backa:
        rePopindex $advance
        branch $findany
$fail:
        set I0, 0
        reFinished

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

When I take action, I’m not going to fire a $2 million missile at a $10
empty tent and hit a camel in the butt.
    --Dubya
