I think you've understood correctly.  Back references mostly aren't
there.  Greedy operators aren't there.  For back references, this may
be due to philosophical reservations; I have a few myself.  For greedy
operators, I suspect it's more because no one has cared enough to do
it.  It wouldn't be too hard, as Russ's article says.  If someone is
going to do this I would suggest going all the way and implementing
tags.  See http://laurikari.net/ville/spire2000-tnfa.ps.
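
To make "not too hard" a little more concrete, here is a rough sketch in
Python of a Thompson-style simulation that tracks one submatch.  It is
not the Plan 9 code, and the instruction names (char, any, split, jmp,
save, match) and the helpers build, run and submatch are made up for
illustration.  The pattern is compiled by hand, and matching is anchored
at the start of the text to keep it short.  The only difference between
the greedy and non-greedy versions is the order of the two targets of
the split instruction.

```python
# Hypothetical instruction set for illustration; not Plan 9's regexp code.
def build(greedy):
    """Hand-compiled program for a(.*)b (greedy) or a(.*?)b (non-greedy)."""
    loop, leave = 3, 5
    split = ('split', loop, leave) if greedy else ('split', leave, loop)
    return [
        ('char', 'a'),   # 0
        ('save', 0),     # 1: start of the submatch
        split,           # 2: the first target has priority
        ('any',),        # 3
        ('jmp', 2),      # 4
        ('save', 1),     # 5: end of the submatch
        ('char', 'b'),   # 6
        ('match',),      # 7
    ]

def addthread(prog, threads, seen, pc, caps, pos):
    """Follow jmp/split/save from pc, appending threads in priority order."""
    if pc in seen:       # a higher-priority thread already reached this pc
        return
    seen.add(pc)
    op = prog[pc]
    if op[0] == 'jmp':
        addthread(prog, threads, seen, op[1], caps, pos)
    elif op[0] == 'split':
        addthread(prog, threads, seen, op[1], caps, pos)
        addthread(prog, threads, seen, op[2], caps, pos)
    elif op[0] == 'save':
        new = list(caps)
        new[op[1]] = pos
        addthread(prog, threads, seen, pc + 1, tuple(new), pos)
    else:                # char, any, match: wait for the next step
        threads.append((pc, caps))

def run(prog, text, ncap=2):
    """Return the saved positions of the preferred match, or None."""
    clist = []
    addthread(prog, clist, set(), 0, (None,) * ncap, 0)
    matched = None
    for pos in range(len(text) + 1):
        nlist, seen = [], set()
        for pc, caps in clist:
            op = prog[pc]
            if op[0] == 'match':
                matched = caps
                break    # drop all lower-priority threads
            if pos < len(text) and (
                    op[0] == 'any' or (op[0] == 'char' and op[1] == text[pos])):
                addthread(prog, nlist, seen, pc + 1, caps, pos + 1)
        clist = nlist
        if not clist:
            break
    return matched

def submatch(text, greedy):
    caps = run(build(greedy), text)
    return None if caps is None else text[caps[0]:caps[1]]

print(submatch("axbxb", greedy=True))    # xbx
print(submatch("axbxb", greedy=False))   # x
```

Each character of the text is looked at once and the thread list never
holds more than one entry per instruction, so there is no backtracking.
The position of a thread in the list stands in for its priority.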

> > well reading the code would be a travesty.  it's curious
> > that neither the sam paper nor regexp(6) mentions
> > submatches.  maybe i missed them.
> >
> > sed -n 's:.*(KRAK[A-Z]+*) +([a-zA-Z]+).*:\2, \1:gp' </lib/volcanoes
> > - erik
> 
> Ok, so despite the documentation, some submatch tracking is there.
> But in all (?) your examples, as well as in the scripts you mentioned,
> this tracking is exclusively used with the s command (which is said to
> be unnecessary at least in sam/acme). If I try something like
> /( b(.)b)/a/\1\2/
> on
> bla blb 56
> I get
> bla blb\1\2 56
> which is not quite what I want... How then? (I'd like to get
> 'bla blblblb 56')
> 
> Further, in R. Cox's text (http://swtch.com/~rsc/regexp/regexp1.html)
> he claims that all nice features except for backreferences can be
> implemented with Thompson's NFA algorithm. And even the backreferences
> can be handled gracefully somehow. That is: ALL of non-greedy operators,
> generalized assertions, counted repetitions, character classes CAN be
> processed using the fast algorithm. Why then don't we have it? I once
> wrote a program in python and was pretty happy to have non-greedy
> operators and lookahead assertions on hand. Had I not had those,
> I probably wouldn't have been able to write it (nicely).
> 
> Ruda
> 
-- 
John Stalker
School of Mathematics
Trinity College Dublin
tel +353 1 896 1983
fax +353 1 896 2282
