[perl #58952] Implemented second argument to comb
# New Ticket Created by "Chris Davaz"
# Please include the string: [perl #58952]
# in the subject line of all future correspondence about this issue.
# http://rt.perl.org/rt3/Ticket/Display.html?id=58952

Here is the implementation of the second argument to the comb method as described in S29. Here is a test I wrote (which I'll commit to the pugs test suite later):

my Str $hair = "Th3r3 4r3 s0m3 numb3rs 1n th1s str1ng";
say $hair.comb(/\d+/);
say $hair.comb(/\d+/, -10);
say $hair.comb(/\d+/, 0);
say $hair.comb(/\d+/, 1);
say $hair.comb(/\d+/, 3);
say $hair.comb(/\d+/, 9);
say $hair.comb(/\d+/, 1);

The output being:

3343033111
3
334
334303311
3343033111

Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir (revision 31199)
+++ src/builtins/any-str.pir (working copy)
@@ -45,6 +45,8 @@
 .sub comb :method :multi(_)
 .param pmc regex
+.param int count:optional
+.param int has_count:opt_flag
 .local pmc retv, match
 .local string s
@@ -54,6 +56,10 @@
 do_match:
 match = regex.'ACCEPTS'(s)
 unless match goto done
+unless has_count goto skip_count
+count -= 1
+if count < 0 goto done
+ skip_count:
 # shouldn't have to coerce to Str here, but see RT #55962
 $S0 = match
 retv.'push'($S0)
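For readers without a Parrot build, the counting logic added by the patch can be modelled roughly as follows. This is only a sketch in Python, not the PIR above: the function name comb and the use of re.finditer to stand in for repeated regex matching are assumptions made for illustration.

```python
import re

def comb(pattern, text, count=None):
    """Collect successive matches of `pattern`, stopping after `count` of them.

    Mirrors the patch's control flow: when a count was supplied, decrement it
    before each push and stop as soon as it drops below zero.
    """
    results = []
    for m in re.finditer(pattern, text):
        if count is not None:      # plays the role of has_count
            count -= 1
            if count < 0:
                break
        results.append(m.group(0))
    return results

hair = "Th3r3 4r3 s0m3 numb3rs 1n th1s str1ng"
print("".join(comb(r"\d+", hair)))       # no limit: every digit run
print("".join(comb(r"\d+", hair, 3)))    # at most three matches
print("".join(comb(r"\d+", hair, -10)))  # negative limit: nothing
```

With this model a zero or negative count yields an empty list, which is consistent with the 0 and -10 cases in the test above producing no digits.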
Questions about :multi in Method Signatures
I am confused about how we should set up method signatures.

Let's take a look at a few lines in any-str.pir:

46 .sub 'comb' :method :multi(_)
47 .param pmc regex
48 .param int count:optional
49 .param int has_count:opt_flag

As you can see, we have one parameter specified in :multi, which is _ (any type). However, we also have two .param lines, 47 and 48. So here are some questions:

I noticed 'self' is implicitly defined; however, does 'self' eat up a parameter? Should we always have one parameter in :multi specified for the object the method is running on?

Also, I played around with :multi by putting in different things. With the above method, here is what I tried and the result:

:multi(_) - works
:multi(_, _) works
:multi(_,_,_) doesn't work
:multi(_,Integer) doesn't work
:mult(Sub) doesn't work

Some clarification surrounding the use of :multi would help a lot.

Best Regards,
-Chris Davaz
Re: Questions about :multi in Method Signatures
That's a great response, thanks. Clears things up. One question: should we always be using _ for the invocant, or should we try to restrict it?

On Wed, Sep 17, 2008 at 10:52 PM, Patrick R. Michaud <[EMAIL PROTECTED]> wrote:
> On Wed, Sep 17, 2008 at 08:37:36PM +0800, Chris Davaz wrote:
> > I am confused about how we should setup method signatures:
> >
> > Let's take a look at a line in any-str.pir:
> >
> > 46 .sub 'comb' :method :multi(_)
> > 47 .param pmc regex
> > 48 .param int count:optional
> > 49 .param int has_count:opt_flag
> >
> > As you can see we have one parameter specified in :multi which is _ (any
> > type). However we also have two .param lines, 47 and 48. So here are some
> > questions:
> >
> > I noticed 'self' is implicitly defined, however does 'self' eat up a
> > parameter? Should we always have one parameter in :multi specified for the
> > object the method is running on?
>
> Yes, the first argument of the :multi refers to the invocant.
>
> > Also, I played around with :multi by putting in different things. With the
> > above method here is what I tried and the result
> >
> > :multi(_) - works
>
> Restricts the sub to being invoked by callers supplying
> at least one argument (in this case, the argument is the
> invocant, since the sub is declared as :method).
>
> > :multi(_, _) works
>
> This says that the sub requires at least two arguments (of any
> type). The first will go into 'self', the second into 'regex'.
>
> > :multi(_,_,_) doesn't work
>
> This says that the sub requires at least three arguments (of
> any type). This doesn't really match the sub definition though,
> which has an invocant, a required argument, and an optional argument.
>
> > :multi(_,Integer) doesn't work
>
> This says the invocant may be of any type, and the first argument
> must be an Integer (or a subclass of Integer). Probably not what
> we want given that the first parameter of the sub is 'regex'.
>
> > :mult(Sub) doesn't work
>
> This says that the sub can be invoked only on Sub invocants.
>
> > Some clarification surrounding the use of :multi would help a lot.
>
> Hope the above helps. I don't know where :multi is documented in
> Parrot itself; the pdd27 file doesn't provide much detail.
>
> Pm
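The applicability rules described in the reply above can be restated as a small model. This is a Python sketch of the explanation only, not Parrot's actual multiple-dispatch code (which also has to choose between several applicable candidates); ANY, Sub and applicable are hypothetical names introduced here.

```python
import re

ANY = object()   # stands in for the `_` wildcard in :multi(...)

class Sub:
    """Hypothetical placeholder for Parrot's Sub type."""

def applicable(signature, args):
    """Return True if `args` (invocant first) satisfy a :multi-style signature.

    Per the explanation above: the call must supply at least as many
    arguments as the signature lists, and each listed type must match
    the argument in the same position. Extra arguments are not checked.
    """
    if len(args) < len(signature):
        return False
    return all(want is ANY or isinstance(got, want)
               for want, got in zip(signature, args))

# A call like $hair.comb(/\d+/): the invocant is a string, then a regex.
call = ("some string", re.compile(r"\d+"))

print(applicable((ANY,), call))           # one argument required
print(applicable((ANY, ANY), call))       # two arguments required
print(applicable((ANY, ANY, ANY), call))  # three required, only two given
print(applicable((ANY, int), call))       # second argument is not an integer
print(applicable((Sub,), call))           # invocant is not a Sub
```

Under this reading, the first two signatures accept the call and the last three reject it, which lines up with the works / doesn't work list in the original question.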
[perl #58970] Initial implementation of Str.split(Regex)
# New Ticket Created by "Chris Davaz"
# Please include the string: [perl #58970]
# in the subject line of all future correspondence about this issue.
# http://rt.perl.org/rt3/Ticket/Display.html?id=58970

I say "initial" because it didn't pass one of my tests. This might be due to regular expressions not being fully implemented, so if this is the case please let me know. Once I know, I'll fix my code if need be, write an appropriate test case, and add it to the pugs repo.

my Str $test = "theXbiggestXbangXforXtheXbuck";
my List $list = $test.split(/X/);
print $list.join("\n");

$test = "tagstripping";
# Expect the print to give us "tag stripping" however it just yields some whitespace, not sure why
$list = $test.split(/\<\/?.*?\>/);
print $list.join(" ");

Index: src/classes/Str.pir
===
--- src/classes/Str.pir (revision 31205)
+++ src/classes/Str.pir (working copy)
@@ -46,7 +46,7 @@
 .return(retv)
 .end
-.sub 'split' :method :multi('String')
+.sub 'split' :method :multi(_, 'String')
 .param string delim
 .local string objst
 .local pmc pieces
@@ -76,6 +76,50 @@
 .return(retv)
 .end
+# split a string on a regex
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex
+.local pmc match
+.local pmc tmpstr
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+retv = new 'List'
+
+match = regex.'ACCEPTS'(self)
+unless match goto done
+
+start_pos = 0
+end_pos = match.'from'()
+
+ loop:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+start_pos = match.'to'()
+
+match.'next'()
+
+end_pos = match.'from'()
+end_pos -= start_pos
+
+$S1 = match.'text'()
+if $S1 == '' goto last
+goto loop
+
+ last:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+ done:
+.return(retv)
+.end
+
 .sub lc :method
 .local string tmps
 .local pmc retv
[perl #58970] Initial implementation of Str.split(Regex)
Ok, here it is without the change to "split on a string", and the test passes. Please apply this one, and in the meantime I will see how we can get the method signature right for split on a string and not break reverse.

On Thu, Sep 18, 2008 at 1:57 AM, Moritz Lenz <[EMAIL PROTECTED]> wrote:
> Chris Davaz (via RT) wrote:
> > # New Ticket Created by "Chris Davaz"
> > # Please include the string: [perl #58970]
> > # in the subject line of all future correspondence about this issue.
> > # http://rt.perl.org/rt3/Ticket/Display.html?id=58970
> >
> > I say "initial" because it didn't pass one of my tests.
>
> More importantly it introduces a regression in t/spec/S29-list/reverse.t
> (Str.reverse is implemented in terms of split-into-characters -> list
> reverse -> join; the split failed with your patch, thus causing a test
> failure in 'make spectest_regression')
>
> > This might be due to
> > regular expressions not being fully implemented, so if this is the case
> > please let me know.
>
> That's unlikely (but not impossible, of course), the regular expression
> engine is quite good and well-tested. I'll take a look into it later.
>
> Anyway, nice patch, and if Str.reverse is fixed it'll certainly be applied.
>
> Moritz
>
> --
> Moritz Lenz
> http://moritz.faui2k3.org/ | http://perl-6.de/

Index: src/classes/Str.pir
===
--- src/classes/Str.pir (revision 31220)
+++ src/classes/Str.pir (working copy)
@@ -76,6 +76,50 @@
 .return(retv)
 .end
+# split a string on a regex
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex
+.local pmc match
+.local pmc tmpstr
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+retv = new 'List'
+
+match = regex.'ACCEPTS'(self)
+unless match goto done
+
+start_pos = 0
+end_pos = match.'from'()
+
+ loop:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+start_pos = match.'to'()
+
+match.'next'()
+
+end_pos = match.'from'()
+end_pos -= start_pos
+
+$S1 = match.'text'()
+if $S1 == '' goto last
+goto loop
+
+ last:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+ done:
+.return(retv)
+.end
+
 .sub lc :method
 .local string tmps
 .local pmc retv
Re: [perl #58970] Initial implementation of Str.split(Regex)
Hmm, I see. I'll work it out ;-) Thanks.

On Thu, Sep 18, 2008 at 3:33 PM, Moritz Lenz <[EMAIL PROTECTED]> wrote:
> Chris Davaz wrote:
> > Ok, here it is without the change to "split on a string", and the test
> > passes.
>
> Yes, but on my machine now t/spec/S04-statements/gather.t produces a
> segmentation fault. I'll have to investigate if that is related to your
> patch or not.
>
> Also split behaves a bit strangely here:
>
> > say 'ab23d4f5'.split(/\d+/).perl
> ["ab", "", "", "d", "f", ""]
> > say 'ab23d4f5'.split(/\d/).perl
> ["ab", "", "d", "f", ""]
>
> (There are a few other oddities like the behaviour with a zero-width
> match, but that's only a minor issue).
>
> > Please apply this one and in the meantime I will see how we can
> > get the method signature right for split on a string + not break reverse.
>
> --
> Moritz Lenz
> http://moritz.faui2k3.org/ | http://perl-6.de/
Expected behavior of match?
The attached split.diff file is just for demonstration, not a patch submittal. I made a method on Str called "match" that returns a List of all matches: # returns all matches on a regex .sub 'match' :method :multi(_, 'Sub') .param pmc regex .local pmc match .local pmc tmpstr .local pmc retv retv = new 'List' match = regex.'ACCEPTS'(self) loop: $S0 = match.'text'() if $S0 == '' goto done retv.'push'($S0) match.'next'() goto loop done: .return(retv) .end However, running the following code: say "ab1cd12ef123gh".match(/\d+/).perl; We get: ["1", "12", "1", "2", "123", "12", "1", "23", "2", "3"] As you can see match is returning what seems to be both greedy *and* non-greedy matches (and everything in between). Is this correct? Index: src/classes/Str.pir === --- src/classes/Str.pir (revision 31220) +++ src/classes/Str.pir (working copy) @@ -76,6 +76,78 @@ .return(retv) .end +# returns all matches on a regex +.sub 'match' :method :multi(_, 'Sub') +.param pmc regex +.local pmc match +.local pmc tmpstr +.local pmc retv + +retv = new 'List' +match = regex.'ACCEPTS'(self) + +loop: + $S0 = match.'text'() + if $S0 == '' goto done + retv.'push'($S0) + match.'next'() + goto loop + +done: + .return(retv) +.end + +# split a string on a regex +.sub 'split' :method :multi(_, 'Sub') +.param pmc regex +.local pmc match +.local pmc tmpstr +.local pmc retv +.local int start_pos +.local int end_pos + +retv = new 'List' + +match = regex.'ACCEPTS'(self) +unless match goto done + +start_pos = 0 +end_pos = match.'from'() + + loop: +print "start_pos = " +print start_pos +print "; end_pos = " +say end_pos +$S7 = match.'text'() +say $S7 + +tmpstr = new 'Perl6Str' +$S0 = substr self, start_pos, end_pos +tmpstr = $S0 +retv.'push'(tmpstr) + +start_pos = match.'to'() + +match.'next'() + +end_pos = match.'from'() +end_pos -= start_pos + +$S1 = match.'text'() +if $S1 == '' goto last +goto loop + + last: +tmpstr = new 'Perl6Str' +$S0 = substr self, start_pos, end_pos +tmpstr = $S0 +retv.'push'(tmpstr) 
+ + done: +.return(retv) +.end + .sub lc :method .local string tmps .local pmc retv split.p6 Description: Binary data
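The thread does not say why the list comes out this way. One reading that reproduces the reported output exactly is that each call to match.'next'() resumes backtracking inside the regex (shorter alternatives first, then later start positions) instead of searching after the end of the previous match. The following Python sketch models that reading only; it is an assumption about the engine's behaviour, and the function name is made up for the example.

```python
import re

def all_backtracking_matches(pattern, text):
    """Enumerate every substring the pattern can match, longest first
    at each start position, moving the start position left to right."""
    compiled = re.compile(pattern)
    found = []
    for start in range(len(text)):
        for end in range(len(text), start, -1):
            # fullmatch with pos/endpos asks: does text[start:end] match entirely?
            if compiled.fullmatch(text, start, end):
                found.append(text[start:end])
    return found

s = "ab1cd12ef123gh"
print(all_backtracking_matches(r"\d+", s))
# Successive non-overlapping matches, which is what a "list of all matches"
# method would more likely want:
print([m.group(0) for m in re.finditer(r"\d+", s)])
```

If that reading is right, a method that wants one result per match would have to restart the regex after the end of the previous match rather than call next() on the same match object.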
S05 and S29 may conflict on behavior of $string.match(/pat/)
I'm trying to pin down what $string.match(/pat/) should be returning.

From S05:

Under "Return values from match objects"
"A match always returns a Match object..."

From S29:

Under the definition of Str.comb

Saying

$string.comb(/pat/, $n)

is equivalent to

$string.match(rx:global:x(0..$n):c/pat/)

[ ...and later... ]

"If there are captures in the pattern, a list of Match objects (one per match) is returned instead of strings."

Which implies that $string.match(/pat/) should indeed return a List of Str and $string.match(/pat_with_groups/) should return a List of Match.

I expected the S29 definition when first approaching $string.match; I feel it is more intuitive than what happens with S05. Could someone clarify what the behavior should be?

Best Regards,
-Chris Davaz
Re: [perl #58970] Initial implementation of Str.split(Regex)
Here is the new patch for split on a regex. Testing it with:

say "theXbiggestXbangXforXtheXbuck".split(/X/).perl;
say "ab1cd12ef123gh".split(/\d+/).perl;
say "charsoup".split(/\<\/?.*?\>/).perl;

We get the following output:

["the", "biggest", "bang", "for", "the", "buck"]
["ab", "cd", "ef", "gh"]
["", "char", "", "soup", ""]

I'll upload a test to pugs later.

On Thu, Sep 18, 2008 at 3:33 PM, Moritz Lenz <[EMAIL PROTECTED]> wrote:
> Chris Davaz wrote:
> > Ok, here it is without the change to "split on a string", and the test
> > passes.
>
> Yes, but on my machine now t/spec/S04-statements/gather.t produces a
> segmentation fault. I'll have to investigate if that is related to your
> patch or not.
>
> Also split behaves a bit strangely here:
>
> > say 'ab23d4f5'.split(/\d+/).perl
> ["ab", "", "", "d", "f", ""]
> > say 'ab23d4f5'.split(/\d/).perl
> ["ab", "", "d", "f", ""]
>
> (There are a few other oddities like the behaviour with a zero-width
> match, but that's only a minor issue).
>
> > Please apply this one and in the meantime I will see how we can
> > get the method signature right for split on a string + not break reverse.
>
> --
> Moritz Lenz
> http://moritz.faui2k3.org/ | http://perl-6.de/

Index: src/classes/Str.pir
===
--- src/classes/Str.pir (revision 31220)
+++ src/classes/Str.pir (working copy)
@@ -76,6 +76,35 @@
 .return(retv)
 .end
+# split a string on a regex
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex
+.local pmc match
+.local pmc tmpstr
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+$S0 = self
+retv = new 'List'
+start_pos = 0
+
+ loop:
+match = regex($S0, 'continue' => start_pos)
+end_pos = match.'from'()
+end_pos -= start_pos
+tmpstr = new 'Perl6Str'
+$S1 = substr $S0, start_pos, end_pos
+tmpstr = $S1
+retv.'push'(tmpstr)
+unless match goto done
+start_pos = match.'to'()
+goto loop
+
+ done:
+.return(retv)
+.end
+
 .sub lc :method
 .local string tmps
 .local pmc retv
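The loop in this version of the patch restarts the regex from the end of the previous match instead of calling next() on one match object. A rough Python model of that loop follows. It assumes that the 'continue' option makes the regex search forward from start_pos, which the thread does not spell out, and it rejects zero-width matches outright rather than guess at how they should behave; split_on_regex is a name chosen for this sketch.

```python
import re

def split_on_regex(pattern, text):
    """Split `text` on successive matches of `pattern`, keeping empty pieces."""
    compiled = re.compile(pattern)
    pieces = []
    start = 0
    while True:
        m = compiled.search(text, start)   # resume after the previous match
        if m is None:
            pieces.append(text[start:])    # remainder (whole string if no match at all)
            return pieces
        if m.end() == m.start():
            raise ValueError("zero-width matches are not handled in this sketch")
        pieces.append(text[start:m.start()])
        start = m.end()

print(split_on_regex("X", "theXbiggestXbangXforXtheXbuck"))
print(split_on_regex(r"\d+", "ab1cd12ef123gh"))
print(split_on_regex(r"\d+", "ab23d4f5"))
```

In this model each piece runs from the end of one match to the start of the next, so a multi-digit separator such as "23" yields no spurious empty strings, while a separator at the very end of the input still leaves one trailing empty piece.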
Made some fixes to split on a regex and moved from Str.pir to any-str.pir
Got rid of "tempstr" and now returns the entire string on a non-match. Index: src/builtins/any-str.pir === --- src/builtins/any-str.pir (revision 31220) +++ src/builtins/any-str.pir (working copy) @@ -71,7 +71,42 @@ .return(retv) .end +=item split() +Splits something on a regular expresion + +=cut + +.sub 'split' :method :multi(_, 'Sub') +.param pmc regex +.local pmc match +.local pmc retv +.local int start_pos +.local int end_pos + +$S0 = self +retv = new 'List' +start_pos = 0 + +match = regex($S0) +if match goto loop +retv.'push'($S0) +goto done + + loop: +match = regex($S0, 'continue' => start_pos) +end_pos = match.'from'() +end_pos -= start_pos +$S1 = substr $S0, start_pos, end_pos +retv.'push'($S1) +unless match goto done +start_pos = match.'to'() +goto loop + + done: +.return(retv) +.end + =item index() =cut split.p6 Description: Binary data
Re: Made some fixes to split on a regex and moved from Str.pir to any-str.pir
Sorry forgot to put the method in alphabetical order, here you go. On Fri, Sep 19, 2008 at 12:36 AM, Chris Davaz <[EMAIL PROTECTED]> wrote: > Got rid of "tempstr" and now returns the entire string on a non-match. > Index: src/builtins/any-str.pir === --- src/builtins/any-str.pir (revision 31220) +++ src/builtins/any-str.pir (working copy) @@ -71,7 +71,6 @@ .return(retv) .end - =item index() =cut @@ -173,6 +172,42 @@ .return ($P0) .end +=item split(/PATTERN/) + +Splits something on a regular expresion + +=cut + +.sub 'split' :method :multi(_, 'Sub') +.param pmc regex +.local pmc match +.local pmc retv +.local int start_pos +.local int end_pos + +$S0 = self +retv = new 'List' +start_pos = 0 + +match = regex($S0) +if match goto loop +retv.'push'($S0) +goto done + + loop: +match = regex($S0, 'continue' => start_pos) +end_pos = match.'from'() +end_pos -= start_pos +$S1 = substr $S0, start_pos, end_pos +retv.'push'($S1) +unless match goto done +start_pos = match.'to'() +goto loop + + done: +.return(retv) +.end + =item substr() =cut
Re: S05 and S29 may conflict on behavior of $string.match(/pat/)
Thanks for clarifying; however, I'm still unsure what a Perl 6 user should expect to get back from running $string.match(/pat/). This is the "one high-level call to the .match method", yes? So it should be returning a List of Str (or List of Match in case of capture groups), is this correct? I ask because in the current Rakudo implementation it returns the Match object (what I would expect from the "one low-level run of the regex engine").

Best Regards,
-Chris

On Thu, Sep 18, 2008 at 11:52 PM, Larry Wall <[EMAIL PROTECTED]> wrote:
> On Thu, Sep 18, 2008 at 06:11:45PM +0800, Chris Davaz wrote:
> : I'm trying to pin down what $string.match(/pat/) should be returning.
> :
> : From S05:
> :
> : Under "Return values from match objects"
> : "A match always returns a Match object..."
> :
> : From S29:
> :
> : Under the definition of Str.comb
> :
> : Saying
> :
> : $string.comb(/pat/, $n)
> :
> : is equivalent to
> :
> : $string.match(rx:global:x(0..$n):c/pat/)
> :
> : [ ...and later... ]
> :
> : "If there are captures in the pattern, a list of Match objects (one
> : per match) is returned instead of strings."
> :
> : Which implies that $string.match(/pat/) should indeed return a List of
> : Str and $string.match(/pat_with_groups/) should return a List of
> : Match.
> :
> : I expected the S29 definition when first approaching $string.match I
> : feel it is more intuitive than what happens with S05. Could someone
> : clarify what the behavior should be?
>
> S05 is using a different definition of "match". In S05 it means
> more like "one low-level run of the regex engine" rather than "one
> high-level call to the .match method". In other words, the .match
> method can do multiple matches.
>
> Larry
[perl #59014] Made some fixes to split on a regex and moved from Str.pir to any-str.pir
# New Ticket Created by "Chris Davaz" # Please include the string: [perl #59014] # in the subject line of all future correspondence about this issue. # http://rt.perl.org/rt3/Ticket/Display.html?id=59014 > Got rid of "tempstr" and now returns the entire string on a non-match. Index: src/builtins/any-str.pir === --- src/builtins/any-str.pir (revision 31220) +++ src/builtins/any-str.pir (working copy) @@ -71,7 +71,42 @@ .return(retv) .end +=item split() +Splits something on a regular expresion + +=cut + +.sub 'split' :method :multi(_, 'Sub') +.param pmc regex +.local pmc match +.local pmc retv +.local int start_pos +.local int end_pos + +$S0 = self +retv = new 'List' +start_pos = 0 + +match = regex($S0) +if match goto loop +retv.'push'($S0) +goto done + + loop: +match = regex($S0, 'continue' => start_pos) +end_pos = match.'from'() +end_pos -= start_pos +$S1 = substr $S0, start_pos, end_pos +retv.'push'($S1) +unless match goto done +start_pos = match.'to'() +goto loop + + done: +.return(retv) +.end + =item index() =cut split.p6 Description: Binary data
[perl #59016] Re: Made some fixes to split on a regex and moved from Str.pir to any-str.pir
# New Ticket Created by "Chris Davaz" # Please include the string: [perl #59016] # in the subject line of all future correspondence about this issue. # http://rt.perl.org/rt3/Ticket/Display.html?id=59016 > Sorry forgot to put the method in alphabetical order, here you go. On Fri, Sep 19, 2008 at 12:36 AM, Chris Davaz <[EMAIL PROTECTED]> wrote: > Got rid of "tempstr" and now returns the entire string on a non-match. > Index: src/builtins/any-str.pir === --- src/builtins/any-str.pir (revision 31220) +++ src/builtins/any-str.pir (working copy) @@ -71,7 +71,6 @@ .return(retv) .end - =item index() =cut @@ -173,6 +172,42 @@ .return ($P0) .end +=item split(/PATTERN/) + +Splits something on a regular expresion + +=cut + +.sub 'split' :method :multi(_, 'Sub') +.param pmc regex +.local pmc match +.local pmc retv +.local int start_pos +.local int end_pos + +$S0 = self +retv = new 'List' +start_pos = 0 + +match = regex($S0) +if match goto loop +retv.'push'($S0) +goto done + + loop: +match = regex($S0, 'continue' => start_pos) +end_pos = match.'from'() +end_pos -= start_pos +$S1 = substr $S0, start_pos, end_pos +retv.'push'($S1) +unless match goto done +start_pos = match.'to'() +goto loop + + done: +.return(retv) +.end + =item substr() =cut
Patch for split(thing, delimiter) function
Moved the split function from Str.pir to any-str.pir and removed the Perl6Str coercion.

Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir (revision 31254)
+++ src/builtins/any-str.pir (working copy)
@@ -172,6 +172,31 @@
 .return ($P0)
 .end
+=item split
+
+ our List multi Str::split ( Str $delimiter , Str $input = $+_, Int $limit = inf )
+ our List multi Str::split ( Rule $delimiter = /\s+/, Str $input = $+_, Int $limit = inf )
+ our List multi Str::split ( Str $input : Str $delimiter , Int $limit = inf )
+ our List multi Str::split ( Str $input : Rule $delimiter , Int $limit = inf )
+
+String delimiters must not be treated as rules but as constants. The
+default is no longer S<' '> since that would be interpreted as a constant.
+P5's C<< split('S< >') >> will translate to C<.words> or some such. Null trailing fields
+are no longer trimmed by default. We might add some kind of :trim flag or
+introduce a trimlist function of some sort.
+
+B partial implementation only
+
+=cut
+
+.namespace[]
+.sub 'split' :multi(_,_)
+.param pmc sep
+.param pmc target
+.return target.'split'(sep)
+.end
+
+.namespace['Any']
 .sub 'split' :method :multi('String')
 .param string delim
 .local string objst
@@ -202,12 +227,6 @@
 .return(retv)
 .end
-=item split(/PATTERN/)
-
-Splits something on a regular expresion
-
-=cut
-
 .sub 'split' :method :multi(_, 'Sub')
 .param pmc regex
 .local pmc match
Index: src/classes/Str.pir
===
--- src/classes/Str.pir (revision 31254)
+++ src/classes/Str.pir (working copy)
@@ -318,39 +318,6 @@
 .return s.'capitalize'()
 .end
-
-=item split
-
- our List multi Str::split ( Str $delimiter , Str $input = $+_, Int $limit = inf )
- our List multi Str::split ( Rule $delimiter = /\s+/, Str $input = $+_, Int $limit = inf )
- our List multi Str::split ( Str $input : Str $delimiter , Int $limit = inf )
- our List multi Str::split ( Str $input : Rule $delimiter , Int $limit = inf )
-
-String delimiters must not be treated as rules but as constants. The
-default is no longer S<' '> since that would be interpreted as a constant.
-P5's C<< split('S< >') >> will translate to C<.words> or some such. Null trailing fields
-are no longer trimmed by default. We might add some kind of :trim flag or
-introduce a trimlist function of some sort.
-
-B partial implementation only
-
-=cut
-
-.sub 'split'
-.param string sep
-.param string target
-.local pmc a, b
-
-a = new 'Perl6Str'
-b = new 'Perl6Str'
-
-a = target
-b = sep
-
-.return a.'split'(b)
-.end
-
-
 =item chop
 our Str method Str::chop ( Str $string: )
[perl #59074] Patch for split(thing, delimiter) function
# New Ticket Created by "Chris Davaz"
# Please include the string: [perl #59074]
# in the subject line of all future correspondence about this issue.
# http://rt.perl.org/rt3/Ticket/Display.html?id=59074

Moved the split function from Str.pir to any-str.pir and removed the Perl6Str coercion.

Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir (revision 31254)
+++ src/builtins/any-str.pir (working copy)
@@ -172,6 +172,31 @@
 .return ($P0)
 .end
+=item split
+
+ our List multi Str::split ( Str $delimiter , Str $input = $+_, Int $limit = inf )
+ our List multi Str::split ( Rule $delimiter = /\s+/, Str $input = $+_, Int $limit = inf )
+ our List multi Str::split ( Str $input : Str $delimiter , Int $limit = inf )
+ our List multi Str::split ( Str $input : Rule $delimiter , Int $limit = inf )
+
+String delimiters must not be treated as rules but as constants. The
+default is no longer S<' '> since that would be interpreted as a constant.
+P5's C<< split('S< >') >> will translate to C<.words> or some such. Null trailing fields
+are no longer trimmed by default. We might add some kind of :trim flag or
+introduce a trimlist function of some sort.
+
+B partial implementation only
+
+=cut
+
+.namespace[]
+.sub 'split' :multi(_,_)
+.param pmc sep
+.param pmc target
+.return target.'split'(sep)
+.end
+
+.namespace['Any']
 .sub 'split' :method :multi('String')
 .param string delim
 .local string objst
@@ -202,12 +227,6 @@
 .return(retv)
 .end
-=item split(/PATTERN/)
-
-Splits something on a regular expresion
-
-=cut
-
 .sub 'split' :method :multi(_, 'Sub')
 .param pmc regex
 .local pmc match
Index: src/classes/Str.pir
===
--- src/classes/Str.pir (revision 31254)
+++ src/classes/Str.pir (working copy)
@@ -318,39 +318,6 @@
 .return s.'capitalize'()
 .end
-
-=item split
-
- our List multi Str::split ( Str $delimiter , Str $input = $+_, Int $limit = inf )
- our List multi Str::split ( Rule $delimiter = /\s+/, Str $input = $+_, Int $limit = inf )
- our List multi Str::split ( Str $input : Str $delimiter , Int $limit = inf )
- our List multi Str::split ( Str $input : Rule $delimiter , Int $limit = inf )
-
-String delimiters must not be treated as rules but as constants. The
-default is no longer S<' '> since that would be interpreted as a constant.
-P5's C<< split('S< >') >> will translate to C<.words> or some such. Null trailing fields
-are no longer trimmed by default. We might add some kind of :trim flag or
-introduce a trimlist function of some sort.
-
-B partial implementation only
-
-=cut
-
-.sub 'split'
-.param string sep
-.param string target
-.local pmc a, b
-
-a = new 'Perl6Str'
-b = new 'Perl6Str'
-
-a = target
-b = sep
-
-.return a.'split'(b)
-.end
-
-
 =item chop
 our Str method Str::chop ( Str $string: )
method signature issues
In any-str.pir we need to figure out how to change

.sub 'split' :method :multi('String')

into

.sub 'split' :method :multi(_, 'String')

since the former method signature is causing problems for me as I'm trying to implement

.sub 'split' :method :multi(_, 'Sub')

with an additional optional argument (the limit of how many elements to return in the list).

While we still have the ".sub 'split' :method :multi('String')" method and I try to run some Perl 6 code that uses the newly added optional parameter to ".sub 'split' :method :multi(_, 'Sub')", I get the following error:

say "theXbiggestXbangXforXtheXbuck".split(/X/, 3).perl;
too many arguments passed (3) - 2 params expected
current instr.: 'parrot;Any;split' pc 10833 (src/gen_builtins.pir:6926)

Line 6926 of gen_builtins.pir is ".sub 'split' :method :multi('String')", not the expected method ".sub 'split' :method :multi(_, 'Sub')".

And when we change ".sub 'split' :method :multi('String')" to ".sub 'split' :method :multi(_, 'String')", I can't even compile Perl 6. I get the following error:

No applicable methods.
current instr.: 'parrot;Perl6;Grammar;Actions;dec_number' pc 129924 (src/gen_actions.pir:11299)

From this point I'm not sure what's going on; any help would be greatly appreciated.

Best Regards,
-Chris
method signature issues
If it is the case that :method and :multi are incompatible, I am a bit surprised to see this in the Rakudo src directory:

$ grep -rHI ':method :multi' . | grep -v '.svn' | wc -l
94

On Sun, Sep 21, 2008 at 11:30 AM, chromatic <[EMAIL PROTECTED]> wrote:
> On Friday 19 September 2008 21:46:53 Chris Davaz wrote:
>
> > In any-str.pir we need to figure out how to change
> >
> > .sub 'split' :method :multi('String')
> >
> > into
> >
> > .sub 'split' :method :multi(_, 'String')
> >
> > since the former method signature is causing problems for me as I'm
> > trying to implement
> >
> > .sub 'split' :method :multi(_, 'Sub')
> >
> > with an additional optional argument (the limit of how many elements
> > to return in the list).
>
> :method and :multi are fundamentally incompatible; the former implies single
> dispatch while the latter explicitly expresses multiple dispatch. The PIR
> compiler *could* unshift a type constraint onto the list of invocants
> when :multi and :method appear together, but I'm not sure that produces clear
> code.
>
> -- c
Re: method signature issues
Patrick, Any thoughts on why I am getting the "No applicable methods" error as described in the head of this thread? On Sun, Sep 21, 2008 at 11:38 PM, Patrick R. Michaud <[EMAIL PROTECTED]> wrote: > On Sat, Sep 20, 2008 at 11:05:34PM -0700, chromatic wrote: >> On Saturday 20 September 2008 22:24:52 Chris Davaz wrote: >> >> > If it is the case that :method and :multi are incompatible, I am a bit >> > surprised to see that in the Rakudo src directory: >> >> I said they're incompatible -- meant in terms of their semantics. I didn't >> say they don't work together in some cases with our current implementation. >> (They probably shouldn't.) > > :method and :multi work just fine together -- at least within a single > class. In fact, the various compilers in PCT (PAST::Compiler and > POST::Compiler) rely on multiple dispatch working properly for methods. > > Where the semantics get a little weird is when a subclass defines > a multimethod that overloads multimethods of the same name in the > parent class -- in this case the subclass multimethod completely > hides the parent class multimethod. > > Pm >
Re: method signature issues
Awesome Patrick, you totally nailed it ;-) I'll be submitting a patch soon. Do you know if there is a Parrot bug logged for the problem you described? On Mon, Sep 22, 2008 at 12:27 AM, Patrick R. Michaud <[EMAIL PROTECTED]> wrote: > On Sat, Sep 20, 2008 at 12:46:53PM +0800, Chris Davaz wrote: >> In any-str.pir we need to figure out how to change >> .sub 'split' :method :multi('String') >> into >> .sub 'split' :method :multi(_, 'String') >> [...] > > ... let's back up a bit and look at what is really happening. > >> [...] Any when we change ".sub 'split' :method :multi('String')" >> to ".sub 'split' :method :multi(_, 'String')" I can't even compile >> Perl 6. I get the following error: >> No applicable methods. >> current instr.: 'parrot;Perl6;Grammar;Actions;dec_number' pc 129924 >> (src/gen_actions.pir:11299) > > This isn't precise -- perl6.pbc compiles just fine. What isn't > compiling is Test.pm, because the dec_number method in actions.pm > is using the 'split' builtin to get rid of underscores (added in r31225): > >method dec_number($/) { >my $num := ~$/; >$num := $num.split('_').join(''); >make PAST::Val.new( :value( $num ), :returns('Num'), :node( $/ ) ); >} > > Here we're calling a Perl 6 builtin function from NQP, and NQP > doesn't convert string constants (such as '_') into String PMCs > prior to calling the function -- it just generates a PIR method > call directly. Unfortunately, Parrot doesn't recognize a string > constant as being the same as a 'String' for MMD purposes, and > so it's unable to match the split method declared :multi(_, 'String'). > (This is arguably a Parrot bug, either in design or implementation.) > > The solution to the problem is to define the method to split > on strings as :multi(_, _) -- i.e.: > >.namespace ['Any'] >.sub 'split' :method :multi(_, _) > > This means this method will be used whenever there's not a more > specific multimethod available. 
Furthermore, this is actually > the correct semantics -- i.e., we expect the following to work > even though $x is an Int and not a Str: > >my $x = 9; >say 439123912.split($x).perl; # [ "43", "123", "12" ] > > So, try changing :multi('String') to :multi(_,_) and everything > should work just fine (and we should get a few more passing tests > to boot). > > Pm >
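The idea of a catch-all candidate that stringifies whatever delimiter it is given, with a more specific candidate for regexes, can be sketched as follows. This is a Python illustration of the semantics described above, not Parrot's dispatch mechanism; the single function with an isinstance check stands in for the two multimethods, and the name split is reused only for readability.

```python
import re

def split(target, delim):
    """Split `target` on `delim`, choosing behaviour by the delimiter's type.

    - A compiled regex takes the specific branch (like :multi(_, 'Sub')).
    - Anything else falls through to the generic branch (like :multi(_, _)),
      where both the target and the delimiter are coerced to strings.
    """
    text = str(target)
    if isinstance(delim, re.Pattern):
        return delim.split(text)
    return text.split(str(delim))

print(split(439123912, 9))                 # integer target, integer delimiter
print(split("1_000_000", "_"))             # plain string delimiter
print(split("ab1cd2ef", re.compile(r"\d")))  # regex delimiter
```

Because the generic branch coerces its delimiter, a plain string constant, an Int, or any other value ends up on the string path without needing a matching type in the signature.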
Some fixes to split methods
I have implemented the limit parameter on both Str.split(String, Integer) and Str.split(Regex, Integer). In doing so I had to change the method signature of Str.split(String) to ".sub 'split' :method :multi(_, _)" from ".sub 'split' :method :multi('String')". The former method signature is the correct one anyway as the latter just restricts the invoker to being a 'String' and doesn't say anything about the argument type.

All the tests except for one in split.p6 "pass" (verify by just looking at the output). The one that doesn't pass is:

    say 102030405.split(0).perl;

which produces:

    ["1.", "2", "3e+", "8"]

Note that this bug is not introduced by the patch, it was only discovered because now we can do Int.split(Int).

Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir (revision 31332)
+++ src/builtins/any-str.pir (working copy)
@@ -197,8 +197,10 @@
 .end
 .namespace['Any']
-.sub 'split' :method :multi('String')
+.sub 'split' :method :multi(_, _)
 .param string delim
+.param int count:optional
+.param int has_count:opt_flag
 .local string objst
 .local pmc pieces
 .local pmc tmps
@@ -214,6 +216,10 @@
 len = pieces
 i = 0
 loop:
+unless has_count goto skip_count
+dec count
+if count < 0 goto done
+ skip_count:
 if i == len goto done
 tmps = new 'Perl6Str'
@@ -229,6 +235,8 @@
 .sub 'split' :method :multi(_, 'Sub')
 .param pmc regex
+.param int count:optional
+.param int has_count:opt_flag
 .local pmc match
 .local pmc retv
 .local int start_pos
@@ -244,6 +252,10 @@
 goto done
 loop:
+unless has_count goto skip_count
+dec count
+if count < 0 goto done
+ skip_count:
 match = regex($S0, 'continue' => start_pos)
 end_pos = match.'from'()
 end_pos -= start_pos

split.p6 Description: Binary data
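The odd output reported for `102030405.split(0)` is consistent with the integer being stringified in a `%g`-style floating-point format before the split. That cause is an assumption; the thread does not confirm where the conversion happens. The Python sketch below only reproduces the symptom, using a hypothetical helper `stringify_like_g`.

```python
def stringify_like_g(n):
    """Format a number the way C's printf("%g") would (6 significant digits)."""
    return "%g" % n


text = stringify_like_g(102030405)
print(text)             # 1.0203e+08
print(text.split("0"))  # ['1.', '2', '3e+', '8']
```

If this is what happens, the fix belongs in the number-to-string conversion rather than in split itself.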
[perl #59184] Some fixes to split methods
# New Ticket Created by "Chris Davaz"
# Please include the string: [perl #59184]
# in the subject line of all future correspondence about this issue.
# http://rt.perl.org/rt3/Ticket/Display.html?id=59184 >

I have implemented the limit parameter on both Str.split(String, Integer) and Str.split(Regex, Integer). In doing so I had to change the method signature of Str.split(String) to ".sub 'split' :method :multi(_, _)" from ".sub 'split' :method :multi('String')". The former method signature is the correct one anyway as the latter just restricts the invoker to being a 'String' and doesn't say anything about the argument type.

All the tests except for one in split.p6 "pass" (verify by just looking at the output). The one that doesn't pass is:

    say 102030405.split(0).perl;

which produces:

    ["1.", "2", "3e+", "8"]

Note that this bug is not introduced by the patch, it was only discovered because now we can do Int.split(Int).

Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir (revision 31332)
+++ src/builtins/any-str.pir (working copy)
@@ -197,8 +197,10 @@
 .end
 .namespace['Any']
-.sub 'split' :method :multi('String')
+.sub 'split' :method :multi(_, _)
 .param string delim
+.param int count:optional
+.param int has_count:opt_flag
 .local string objst
 .local pmc pieces
 .local pmc tmps
@@ -214,6 +216,10 @@
 len = pieces
 i = 0
 loop:
+unless has_count goto skip_count
+dec count
+if count < 0 goto done
+ skip_count:
 if i == len goto done
 tmps = new 'Perl6Str'
@@ -229,6 +235,8 @@
 .sub 'split' :method :multi(_, 'Sub')
 .param pmc regex
+.param int count:optional
+.param int has_count:opt_flag
 .local pmc match
 .local pmc retv
 .local int start_pos
@@ -244,6 +252,10 @@
 goto done
 loop:
+unless has_count goto skip_count
+dec count
+if count < 0 goto done
+ skip_count:
 match = regex($S0, 'continue' => start_pos)
 end_pos = match.'from'()
 end_pos -= start_pos

split.p6 Description: Binary data
more fixes to split
please see http://rt.perl.org/rt3/Ticket/Display.html?id=59184 for more info and for the patch
[perl #59240] Automate publishing of docs/*
# New Ticket Created by "Chris Davaz"
# Please include the string: [perl #59240]
# in the subject line of all future correspondence about this issue.
# http://rt.perl.org/rt3/Ticket/Display.html?id=59240 >

I suggest we automate the publishing of everything under docs/* and put it under parrotcode.org/docs in HTML format for easy access. This would probably help productivity and maybe even attract new developers.

Just my two cents...
Re: [perl #59240] Automate publishing of docs/*
Ahh, cool I didn't even know we had parrot.org. Publishing docs/book/* would be nice.

On Tue, Sep 23, 2008 at 10:27 PM, Will Coleda via RT <[EMAIL PROTECTED]> wrote:
> On Tue, Sep 23, 2008 at 9:04 AM, via RT Chris Davaz
> <[EMAIL PROTECTED]> wrote:
>> # New Ticket Created by "Chris Davaz"
>> # Please include the string: [perl #59240]
>> # in the subject line of all future correspondence about this issue.
>> # http://rt.perl.org/rt3/Ticket/Display.html?id=59240 >
>>
>>
>> I suggest we automate the publishing of everything under docs/* and
>> putting it under parrotcode.org/docs in HTML format for easy access.
>> This would probably help productivity and maybe even attract new
>> developers.
>>
>> Just my two cents...
>>
>
> we're moving to http://www.parrot.org/ ; I think Allison has said
> we're going to automate this publishing process.
>
> In the meantime, most of the docs on parrotcode.org update
> automatically once a link is put up; we can, in the meantime, add some
> more if there are some particular docs you'd like to see.
>
>
> --
> Will "Coke" Coleda
>
>
>
Re: Split with negative limits, and other weirdnesses
If someone wants to have the final word on what the behavior should be I can go ahead and implement it.

On Tue, Sep 23, 2008 at 11:41 PM, Jonathan Scott Duff <[EMAIL PROTECTED]> wrote:
> On Tue, Sep 23, 2008 at 9:38 AM, TSa <[EMAIL PROTECTED]> wrote:
>
>> HaloO,
>> Moritz Lenz wrote:
>>
>>> In Perl 5 a negative limit means "unlimited", which we don't have to do
>>> because we have the Whatever star.
>>>
>>
>> I like the notion of negative numbers as the other end of infinity.
>> Where infinity here is the length of the split list which can be
>> infinite if split is called on a file handle. So a negative number
>> could be the number of splits to skip from the front of the list.
>> And limits of the form '*-5' would deliver the five last splits.
>>
>
> As another data point, this is the first thing I thought of when I read the
> email regarding negative limits. But then I thought "we're trying to get
> away from so much implicit magic". And I'm not sure the failure mode is loud
> enough when the skip-from-the-front semantics /aren't/ what you want (e.g.,
> when the limit parameter is variable-ish)
>
>
> A limit of 0 is basically ignored.
>>
>> Here are a few solutions I could think of
>> 1) A limit of 0 returns the empty list (you want zero items, you get them)
>>
>
> I think this is a nice degenerate case.
>>
>
> Me too.
>
>
> 2) A limit of 0 fail()s
>>
>
> This is a bit too drastic.
>
>
> Indeed.
>
>
>
> 3) non-positive $limit arguments are rejected by the signature (Int
>> where { $_ > 0 })
>>
>
> I think that documents and enforces the common case best. But I would
>> include zero and use a name like UInt that has other uses as well. Are
>> there pragmas that turn signature failures into undef return values?
>>
>>
>> Regards, TSa.
>> --
>>
>> "The unavoidable price of reliability is simplicity" -- C.A.R. Hoare
>> "Simplicity does not precede complexity, but follows it." -- A.J. Perlis
>> 1 + 2 + 3 + 4 + ... = -1/12 -- Srinivasa Ramanujan
>>
>
>
> my two cents,
>
> -Scott
>
> --
> Jonathan Scott Duff
> [EMAIL PROTECTED]
>
Re: [perl #59184] Some fixes to split methods
Nope, that last one was it. Still waiting on a decision for how edge cases on limit are to be handled.

On Thu, Sep 25, 2008 at 10:49 PM, Moritz Lenz via RT <[EMAIL PROTECTED]> wrote:
> On Mon Sep 22 22:55:29 2008, cdavaz wrote:
>> Grr.. wrong again sorry!! Forgot to remove the handle_count label.
>> Please let me know if you see any problems.
>
> Sorry, I've lost track of your patches (probably due to our parallel
> mailing list/RT/IRC conversations); are there still split() patches open
> which should be applied?
>
[perl #59366] small fix to pod doc and interactive prompt
# New Ticket Created by "Chris Davaz"
# Please include the string: [perl #59366]
# in the subject line of all future correspondence about this issue.
# http://rt.perl.org/rt3/Ticket/Display.html?id=59366 >

Fixed a bug in the doc where the method name and doc were mismatched. Fixed a small bug where, even if the user sets the prompt, a default prompt '> ' is still printed. Changed it so that a default prompt is printed only if the user has not set a prompt (or has set an empty prompt).

Index: compilers/pct/src/PCT/HLLCompiler.pir
===
--- compilers/pct/src/PCT/HLLCompiler.pir (revision 31432)
+++ compilers/pct/src/PCT/HLLCompiler.pir (working copy)
@@ -135,16 +135,16 @@
 =item commandline_banner([string value])
+Set the string in $S0 as a commandline banner on the compiler in $P0. The
+banner is the first text that is shown when the compiler is started in
+interactive mode. This can be used for a copyright notice or other information.
+
+=item commandline_prompt([string value])
+
 Set the string in $S0 as a commandline prompt on the compiler in $P0. The
 prompt is the text that is shown on the commandline before a command is
 entered when the compiler is started in interactive mode.
-=item commandline_prompt([string value])
-
-Set the string in $S0 as a commandline banner on the compiler in $P0. The
-banner is the first text that is shown when the compiler is started in
-interactive mode. This can be used for a copyright notice or other information.
-
 =cut
 .sub 'stages' :method
@@ -511,6 +511,7 @@
 if encoding == 'fixed_8' goto interactive_loop
 unless encoding goto interactive_loop
 push stdin, encoding
+
 interactive_loop:
 .local pmc code
 unless stdin goto interactive_end
@@ -518,11 +519,14 @@
 ## libraries aren't present (RT #41103)
 # for each input line, print the prompt
-$P0 = self.'commandline_prompt'()
-printerr $P0
+$S3 = self.'commandline_prompt'()
+ne_str $S3, '', prompt_set
+$S3 = '> '
+ prompt_set:
+printerr $S3
 if has_readline < 0 goto no_readline
-code = stdin.'readline'('> ')
+code = stdin.'readline'('')
 if null code goto interactive_end
 concat code, "\n"
 goto have_code
Index: config/gen/languages.pm
===
--- config/gen/languages.pm (revision 31432)
+++ config/gen/languages.pm (working copy)
@@ -54,6 +54,7 @@
 unlambda urm WMLScript Zcode
+monkey
 };
 $data{languages_source} = q{config/gen/makefiles/languages.in};
 return \%data;
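The prompt change in the patch above boils down to a fallback rule: use the user's prompt when one is set, otherwise use '> '. A minimal Python model of that rule follows; it is not the HLLCompiler code, and the names `DEFAULT_PROMPT` and `effective_prompt` are hypothetical.

```python
DEFAULT_PROMPT = "> "  # the default shown in the patch


def effective_prompt(user_prompt):
    """Return the user's prompt, or the default when it is unset or empty."""
    # An empty string and None are both treated as "no prompt set".
    return user_prompt if user_prompt else DEFAULT_PROMPT


print(repr(effective_prompt("")))         # '> '
print(repr(effective_prompt("perl6> ")))  # 'perl6> '
```

Exactly one prompt is printed per input line, which is the behavior the patch description asks for.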
Re: Split with negative limits, and other weirdnesses
Ok, so 0 returns the empty list and -1 violates the signature? In PIR, can we have signatures that put a constraint on the range of values for a given parameter?

On Sun, Sep 28, 2008 at 7:25 PM, Carl Mäsak <[EMAIL PROTECTED]> wrote:
> Jason (>):
>> It makes sense to me to go with option 1; you get what you ask for. It also
>> makes sense to me to not use magical implied numbers, such as negatives,
>> to accomplish things that either ranges or whatever star can accomplish.
>
> Aye, agreement. There's a whole lot of consensus already... reading
> through the discussion once more, I don't find anyone saying anything
> contradicting the above summary.
>
> Chris, I'm not in a position to provide a final word, but it seems
> very possible already to use what has already been said here as a
> basis for an implementation.
>
> // Carl
>
[perl #59642] Return the empty list on non-positive LIMIT
# New Ticket Created by "Chris Davaz"
# Please include the string: [perl #59642]
# in the subject line of all future correspondence about this issue.
# http://rt.perl.org/rt3/Ticket/Display.html?id=59642 >

Here is a small patch to make split return the empty list on non-positive limit arguments instead of returning the string split entirely. Moritz, I think it was you who suggested that negative arguments be 'rejected by the signature' but I am not sure how to do this in PIR.

Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir (revision 31687)
+++ src/builtins/any-str.pir (working copy)
@@ -411,8 +411,7 @@
 # per Perl 5's negative LIMIT behavior
 unless has_count goto positive_count
-unless count < 1 goto positive_count
-has_count = 0
+if count < 1 goto done
 positive_count:
 match = regex($S0)
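The behavior this patch aims for can be modelled in a few lines of Python. This is a sketch, not the PIR implementation. The function name `split_with_limit` is hypothetical, and it assumes a positive limit caps the number of returned pieces with the unsplit remainder kept in the last piece, which the thread does not spell out.

```python
import re


def split_with_limit(pattern, text, limit=None):
    """Split text on a regex; a non-positive limit yields the empty list."""
    if limit is None:
        return re.split(pattern, text)   # no limit given: split everywhere
    if limit < 1:
        return []                        # "you want zero items, you get them"
    if limit == 1:
        return [text]                    # re.split treats maxsplit=0 as unlimited
    return re.split(pattern, text, maxsplit=limit - 1)


print(split_with_limit(r",", "a,b,c"))      # ['a', 'b', 'c']
print(split_with_limit(r",", "a,b,c", 2))   # ['a', 'b,c']
print(split_with_limit(r",", "a,b,c", 0))   # []
print(split_with_limit(r",", "a,b,c", -1))  # []
```

Rejecting non-positive limits in the signature instead would replace the `limit < 1` branch with an error, which is the alternative discussed earlier in the thread.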
[perl #60228] fix for split on zero-width
# New Ticket Created by "Chris Davaz"
# Please include the string: [perl #60228]
# in the subject line of all future correspondence about this issue.
# http://rt.perl.org/rt3/Ticket/Display.html?id=60228 >

This is a fix for splitting strings on regular expressions that contain zero-width matches. All the tests in split-simple.t pass.

Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir (revision 32228)
+++ src/builtins/any-str.pir (working copy)
@@ -428,6 +428,7 @@
 .local pmc retv
 .local int start_pos
 .local int end_pos
+.local int zwm_start
 $S0 = self
 retv = new 'List'
@@ -450,10 +451,23 @@
 $S1 = substr $S0, start_pos
 retv.'push'($S1)
 goto done
+ next_zwm:
+zwm_start = start_pos
+ inc_zwm:
+inc start_pos
+match = regex($S0, 'continue' => start_pos)
+end_pos = match.'from'()
+unless start_pos == end_pos goto inc_zwm
+start_pos = zwm_start
+end_pos -= start_pos
+goto add_str
 skip_count:
 match = regex($S0, 'continue' => start_pos)
 end_pos = match.'from'()
+$I99 = match.'to'()
+if $I99 == end_pos goto next_zwm
 end_pos -= start_pos
+ add_str:
 $S1 = substr $S0, start_pos, end_pos
 retv.'push'($S1)
 unless match goto done
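To show why zero-width matches need special handling, here is a small Python model of a regex split that ignores empty matches that would not advance through the string. It is not a transcription of the PIR patch. The function name is hypothetical, and the rule it applies (skip an empty match at the current start position or at the very end of the string) is an assumption modelled on Perl 5's split.

```python
import re


def split_skipping_empty_matches(pattern, text):
    """Split on a regex, ignoring zero-width matches that make no progress."""
    pieces = []
    start = 0  # index where the current piece begins
    for m in re.finditer(pattern, text):
        zero_width = m.start() == m.end()
        # An empty match at the piece start or at end-of-string would only
        # produce an empty field, so skip it.
        if zero_width and (m.start() == start or m.start() == len(text)):
            continue
        pieces.append(text[start:m.start()])
        start = m.end()
    pieces.append(text[start:])  # whatever follows the last separator
    return pieces


print(split_skipping_empty_matches(r"", "abc"))     # ['a', 'b', 'c']
print(split_skipping_empty_matches(r",", "a,b,c"))  # ['a', 'b', 'c']
```

Without the skip, an empty pattern would match at the same position over and over, or emit empty leading and trailing fields.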