Re: Str.trans implementation

Moritz Lenz Thu, 03 Jun 2010 02:13:36 -0700

David Green wrote:
> On 2010-05-31, at 5:32 pm, Chris Fields wrote:
>> I think, in order to get regexes to work we will need a way of getting the 
>> name of the matching regex from the Match object somehow.  Any idea how to 
>> do that?


afaict there's no direct way, only workarounds (as David showed).

> I had started working on this too, and ended up going on quite an
> adventure... my starting idea was something like s/ @from / %to{$/} / to
> look up the replacement based on what was matched, but $/ isn't available
> in a subst,

It is, but currently only as the first positional argument of the closure.

Example:

10:54 <@moritz_> rakudo: say 'a b c'.subst(/<alpha>/, -> $m { uc $m }, :g)
10:54 <+p6eval> rakudo a1140c: OUTPUT«A B C␤»


> and anyway, as you indicate, $/ would only get the matched text, not the 
> regex's name or other identifier.
> 
> So I thought of what seemed like a good trick: have each pattern set its own 
> replacement through the magic of embedded closures:
>       / @from[$n] { $replacement = $to[$n] } /
> 
> As long as $replacement had the right scope, it would get the correct value
> as a side-effect of matching, and then could be substituted for the right
> thing.  Except it turns out .subst does all the matching first, and then
> the replacement, so everything got replaced with whatever the final match
> happened to be.

... unless you push all the replacement markers onto an array, and
traverse the array during the substitution phase.
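
A minimal sketch of that workaround, assuming a reasonably recent Rakudo.
The patterns a+ and b+, the replacements 'X' and 'Y', and the name @queue
are made up for illustration; this is not the actual .trans code:

```perl6
# Hypothetical example: each alternative records its own replacement
# as a side effect of matching.
my @queue;
my $pattern = / a+ { @queue.push('X') } | b+ { @queue.push('Y') } /;

# The closure gets the Match as its first positional argument but ignores
# it here; it simply takes the next recorded replacement, so replacements
# are consumed in the same order in which the matches were found.
my $result = 'aab bba'.subst($pattern, -> $m { @queue.shift }, :g);
say $result;
```

Since the array is used first-in, first-out, this should not depend on
whether .subst finds all matches up front or interleaves matching and
replacing.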

Maybe I should explain why .subst matches eagerly, and only then
replaces:

The :x modifier takes the number of matches to replace (a number or a
range), for example

'a b c d e'.subst(/<alpha>/, 'Z', :x(3))

is supposed to return

'Z Z Z d e'.

If fewer than $x matches are present, no substitution is carried out at
all, so all matches have to be found before the first one is replaced.
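
The match-first, replace-later approach can be sketched roughly like this,
assuming a reasonably recent Rakudo. The helper subst-x is hypothetical, it
only handles a plain integer count and a string replacement, and it is not
how .subst is actually written:

```perl6
# Hypothetical helper, not Rakudo's actual .subst code.
sub subst-x(Str $s, Regex $re, Str $replacement, Int $x) {
    # Find every match first; only then is the total count known.
    my @matches = $s.match($re, :g);
    return $s if @matches.elems < $x;

    my $out = '';
    my $pos = 0;
    for @matches[^$x] -> $m {
        # Copy the unmatched text before this match, then the replacement.
        $out ~= $s.substr($pos, $m.from - $pos) ~ $replacement;
        $pos = $m.to;
    }
    # Append whatever follows the last replaced match.
    return $out ~ $s.substr($pos);
}

say subst-x('a b c d e', /<alpha>/, 'Z', 3);   # Z Z Z d e
say subst-x('a b c d e', /<alpha>/, 'Z', 9);   # a b c d e
```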


> However, I was stubborn enough to look at the code for .subst, and 
> saw that instead of passing off :g to self.match, I could loop through
> the individual matches manually, and replace each one as it came up,
> thus using the correct value of $replacement.  So I did. 
> 
> Of course, I was using :c to continue matching each successive occurrence
> from where the previous one left off, so I had to handle any :c option 
> manually too.  And :p.  And :x and :n... well, it started to feel a bit 
> silly, especially since .subst is going to change to make $/ work anyway. 

I don't like the code duplication in .subst and .match. I'd rather see
.trans working with the workarounds I mentioned above than rewrite
.subst to re-implement much of .match.

> But... by then, I had it almost working.  So I've also included my mangled 
> version of .subst. (It passes the same spectests from subst.t that it did 
> before, except for a couple that I'm not sure are right.)

I'll look through the tests and weed out the old, wrong tests.

> And once .subst was able to re-evaluate the RHS for each match, my .trans 
> worked.  Almost.  Interpolating a literal string doesn't seem to work yet, so 
> I had to escape all the chars and interpolate it as a rule.  And then there 
> was a weird bug where creating regexes in a loop returned a list of copies of 
> the same regex.  But it worked spelling them all out without using a loop, so 
> I made a long string with all the regexes I needed and then eval()ed, and 
> that *did* work.  
> 
> But at least the error messages were better, and just since last week!  
> (Hooray for line numbers!)  

Aye, kudos to Jonathan Worthington for these lovely backtraces!

> As you can see, I didn't pay attention to optimisation, but it does pass the 
> spectests that use .trans (rather than tr//); and I didn't do any adverbs.  
> (I reckon the P5 modifiers were dropped because they aren't as useful now: :c 
> and :s can be handled with regexes as the search key, and :d is now done 
> simply by using an empty string as the replacement value.)

Fine by me. The spec doesn't mention them, just the tests.

Cheers,
Moritz
