Re: perl6-regex: retaining $/.pos after an unsuccessful match without a temporary variable?

William Michels via perl6-users Wed, 21 Aug 2019 17:02:26 -0700

Hi Raymond, Wow that's exciting! I'm sure others will chime in with
their thoughts.


I wrote two more test cases for your "incremental P5-like parser",
that can be appended to the code you posted yesterday (personally I
think of incremental matching as being important for matching the
linear order of words in a string). One thing I've noticed is the
tests don't necessarily return in the order in which they appear in
the code. (Hence the short textual descriptions, to disambiguate).
Cheers, Bill.

####

# test linear order of words in a string
# (whitespace separated)

my $test1 = "      dewey defeats truman ";
my $mp1 = MatchProxy.new: subject => $test1;

#note below: per Raymond Dresens' example,
#use "die if" to test against "willnotmatch" strings

die unless $mp1 ~~~ / \s+ /;
die unless $mp1 ~~~ / dewey \s+/;
die unless $mp1 ~~~ / defeats \s+ /;
die unless $mp1 ~~~ / truman \s+ /;

say "published headline (nchar)= " ~ $mp1.P; # yields "27"


my $test2 = "      truman defeats dewey ";
my $mp2 = MatchProxy.new: subject => $test2;

#note below: per Raymond Dresens' example,
#use "die if" to test against "willnotmatch" strings

die unless $mp2 ~~~ / \s+ /;
die unless $mp2 ~~~ / dewey \s+/;  # Dies here
die unless $mp2 ~~~ / defeats \s+ /;
die unless $mp2 ~~~ / truman \s+ /;

say "actual election result (nchar)= " ~ $mp2.P; # never reached: the script has already died at the "dewey" match above

####
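
The same word-order check can also be sketched as a standalone sub, for anyone who wants to try it without the MatchProxy class. This is only a minimal sketch: `words-in-order` is a hypothetical helper that is not part of Raymond's code, and it assumes the built-in `Str.match` method with the `:pos` adverb in place of the `~~~` operator.

```raku
use v6;

# Hypothetical helper (not part of Raymond's MatchProxy class): try each word,
# followed by whitespace, starting exactly where the previous match ended.
# Returns the final position, or Nil as soon as one word is out of order.
sub words-in-order(Str $subject, *@words) {
    my $pos = 0;

    # Skip optional leading whitespace; a failed match leaves $pos untouched.
    with $subject.match(/ \s+ /, :pos($pos)) { $pos = .to }

    for @words -> $word {
        # A Str variable inside a regex is matched literally.
        my $m = $subject.match(/ $word \s+ /, :pos($pos));
        return Nil unless $m;
        $pos = $m.to;
    }
    return $pos;
}

say words-in-order("      dewey defeats truman ", <dewey defeats truman>);          # 27
say words-in-order("      truman defeats dewey ", <dewey defeats truman>).defined;  # False
```

Returning Nil instead of dying makes it possible to test several candidate word orders in one run.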



On Tue, Aug 20, 2019 at 12:00 PM Raymond Dresens
<raymond.dres...@gmail.com> wrote:
>
> Hello everyone, and thanks to everyone for their comments and code snippets, 
> full of syntax that I haven't discovered yet.
>
> Today I managed to figure out how my provided example code could be rewritten 
> in Perl 6 almost 'verbatim': see the program below in this message. I 
> implemented a '~~~' operator; overloading '~~' or implementing '=~' was 
> problematic ;) see the comments in the code below. Any insights or 'tricks 
> around this' are of course welcome...
>
> ...apart from the question if it is possible to plug this/my "apropos" or 
> "apos" kind of approach into the regex system by implementing my own 
> adverb-kind-of-extension thing? Like: adding "apos" alongside "pos" and 
> "continue"?
>
> Anyway: the 'MatchProxy' class seems to enable me to easily rewrite my 
> various 'incremental Perl 5 parsers' that I've concocted in the past, but my 
> next step will be diving into grammars. My 'incremental parsers' are actually 
> mostly "if ... elsif ... elsif ... else { die 'parser error' }" trees in a 
> while loop with a 'charsLeft'-check (see the class below) as its condition. 
> "tree switching" (or: "mode switching") can be done by logic intermingled 
> with (and steered by) the "tokenization process". This allows me to "shift 
> languages/syntaxes", so to speak, possibly steered by kind-of 
> preprocessor-like directives.
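
The loop described above might look roughly like the following. It is a minimal sketch under stated assumptions: the sub name `tokenize` and the three token rules (whitespace, number, word) are hypothetical and chosen only for illustration, and the position check is done directly against the subject's length rather than through the MatchProxy class.

```raku
use v6;

# Hypothetical token rules, purely illustrative.
sub tokenize(Str $subject) {
    my $pos = 0;
    my @tokens;
    my $m;

    while $pos < $subject.chars {          # the 'charsLeft' style check
        if $m = $subject.match(/ \s+ /, :pos($pos)) {
            # whitespace: skipped, nothing recorded
        }
        elsif $m = $subject.match(/ \d+ /, :pos($pos)) {
            @tokens.push: (number => ~$m);
        }
        elsif $m = $subject.match(/ \w+ /, :pos($pos)) {
            @tokens.push: (word => ~$m);
        }
        else {
            die "parser error at position $pos";
        }
        $pos = $m.to;                      # continue where this token ended
    }
    return @tokens;
}

.say for tokenize("foo 42 bar");   # prints one "type => text" pair per token
```

"Mode switching" would then amount to choosing a different if/elsif tree depending on a state variable that the token handlers update.
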
>
> Perhaps I shouldn't ask this (I should dive into grammars first).... but I 
> was semi-aware of the fact that I could insert code inside Perl 6 regexes, 
> but mainly in order to interpolate those "evaluation results" into the 
> regex... but using that for control flow in grammars, controlling the class 
> that processes the tokens, is that possible/wise?
>
> I hope to find out soon. ;)
>
> Regards,
>
> Raymond
>
>
> #!/usr/bin/env perl6
>
> # ..............................................................................
>
> use v6.d;
>
> class MatchProxy
> {
>     has $.P is rw = 0;
>     has $.matches is rw;
>     has $.subject;
>
>     method apropos(Regex $rx)
>     {
>         if ($.subject ~~ m:pos($.P)/ $0 = $rx /)
>         {
>             $.matches = $/[0]; $.P = $/.pos
>         }
>     }
>
>     method charsLeft
>     {
>         $.P < $.subject.chars
>     }
> }
>
> # Overloading '~~' is not possible: ``Cannot override infix operator '~~', as
> # it is a special form handled directly by the compiler'', and...
>
> # Overloading/implementing '=~' //seems// possible... but using it seems
> # prohibited: ``Unsupported use of =~ to do pattern matching; in Perl 6
> # please use ~~'', so...
>
> sub infix:<~~~>(MatchProxy $mp, Regex $rx)
> {
>     $mp.apropos: $rx
> }
>
>
>
> my $test = "      foo bar";
>
> my $mp = MatchProxy.new: subject => $test;
>
> # The '^' and '\G' zero-width assertions e.g. anchors from my Perl 5 snippet
> # are removed since they aren't needed any more in this rewrite.
>
> die unless $mp ~~~ / \s+ /;
>
> die unless $mp ~~~ / foo \s+/;
>
> die if     $mp ~~~ / willnotmatch /;
>
> die unless $mp ~~~ / bar /;
>
> say $mp.P; # yields "13"
>
> # ..............................................................................
>
> On Tue, 20 Aug 2019 at 01:13, William Michels <w...@caa.columbia.edu> wrote:
>>
>> Thanks to Brad Gilbert's code contribution in this thread, I re-wrote
>> a small snippet of his code (code that incrementally checks a series
>> of regex matches), to have it return the last position of each match.
>> Testing with three 'matches' and one 'willnotmatch' returns three
>> positional values, as expected:
>>
>> use v6;
>>   my $test = "      foo bar";
>>
>> sub foo($x) {
>>   state @a = 0;
>>     $x ~~ m /^\s+  {@a.push($/.pos)}/;
>>     $x ~~ m :pos(@a[*-1]) /foo\s+  {@a.push($/.pos)}/;
>>     $x ~~ m :pos(@a[*-1]) /willnotmatch  {@a.push($/.pos)}/;
>>     $x ~~ m :pos(@a[*-1]) /bar   {@a.push($/.pos)}/;
>>   return @a[1 .. *];
>> }
>>
>>   #say foo($test); # returns (6 10 13)
>>   put foo($test); # returns 6 10 13
>>
>>
>> I'm actually pleasantly surprised that I can add a dozen or so
>> 'willnotmatch' lines, and it doesn't screw up the result. The next
>> step might be to 1). pull the individual regexes out into an object
>> (as suggested in the SO post below) to simplify each smartmatch line,
>> and/or 2). store the results in a hash (instead of an array), for
>> later substring extraction. But at this point it seems I'm getting
>> into 'Grammar' territory, so that might be the better approach.
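
The two next steps mentioned above (regexes pulled out of the smartmatch lines, results kept in a hash) could be sketched as follows. This is a minimal sketch under stated assumptions: the step names and the variables `@steps`, `%end` and `%text` are hypothetical, and `Str.match` with `:pos` replaces the `state` array of the snippet above.

```raku
use v6;

my $test = "      foo bar";

# Ordered name => regex pairs; the names are illustrative only.
my @steps =
    lead         => / \s+ /,
    foo          => / foo \s+ /,
    willnotmatch => / willnotmatch /,
    bar          => / bar /;

my $pos = 0;
my %end;    # step name => end position of its match
my %text;   # step name => matched substring

for @steps -> $step {
    # A failed step is skipped and leaves $pos where it was.
    my $m = $test.match($step.value, :pos($pos)) or next;
    %end{$step.key}  = $pos = $m.to;
    %text{$step.key} = ~$m;
}

say %end.sort(*.value).map({ .key ~ '=' ~ .value }).join(' ');  # lead=6 foo=10 bar=13
```

The steps are kept in an array rather than a hash because a hash does not preserve the order in which they must be tried.
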
>>
>> HTH, Bill.
>>
>> https://stackoverflow.com/questions/50829126/perl6-interpolate-array-in-match-for-and-or-not-functions/50838441#50838441
>>
>> On Mon, Aug 19, 2019 at 1:08 AM Patrick Spek via perl6-users
>> <perl6-us...@perl.org> wrote:
>> >
>> > On Sun, 18 Aug 2019 13:45:27 -0300
>> > Aureliano Guedes <guedes.aureli...@gmail.com> wrote:
>> >
>> > > Even though it is another language, Perl6 should be inheriting Perl5's
>> > > regexes, or even improving them, not making them uglier and harder.
>> > >
>> > > Or maybe I'm not seeing how to use them in an easy way. Also, I don't
>> > > know if there is some GOOD documentation for Perl6 regexes.
>> >
>> > Beauty is in the eye of the beholder. While I'm much more proficient
>> > with PCRE than P6CRE, I do find the Perl 6 variants to be much cleaner
>> > and easier to understand when reading regexes of others.
>> >
>> > If you find that there's a lack of documentation explaining things
>> > clearly to you, that'd be an issue to solve in the documentation. This
>> > takes a lot of effort, and if you would be so kind as to improve it
>> > where you think it's needed, it would be a great help to everyone (we
>> > can't really see how or where you're looking for what, after all).
>> >
>> > --
>> > With kind regards,
>> >
>> > Patrick Spek
>> >
>> >
>> > www:  https://www.tyil.nl/
>> > mail: p.s...@tyil.nl
>> > pgp:  1660 F6A2 DFA7 5347 322A  4DC0 7A6A C285 E2D9 8827
>> >
>> > social: https://soc.fglt.nl/tyil
>> > git:    https://gitlab.com/tyil/
