[perl #58952] Implemented second argument to comb

2008-09-17 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #58952]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=58952


Here is the implementation of the second argument to the comb method as
described in S29. Here is a test I wrote (which I'll commit to the pugs test
suite later):

my Str $hair = "Th3r3 4r3 s0m3 numb3rs 1n th1s str1ng";
say $hair.comb(/\d+/);
say $hair.comb(/\d+/, -10);
say $hair.comb(/\d+/, 0);
say $hair.comb(/\d+/, 1);
say $hair.comb(/\d+/, 3);
say $hair.comb(/\d+/, 9);
say $hair.comb(/\d+/, 1);

The output being:

3343033111


3
334
334303311
3343033111
Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir	(revision 31199)
+++ src/builtins/any-str.pir	(working copy)
@@ -45,6 +45,8 @@
 
 .sub comb :method :multi(_)
 .param pmc regex
+.param int count:optional
+.param int has_count:opt_flag
 .local pmc retv, match
 .local string s
 
@@ -54,6 +56,10 @@
   do_match:
 match = regex.'ACCEPTS'(s)
 unless match goto done
+unless has_count goto skip_count
+count -= 1
+if count < 0 goto done
+  skip_count:
 # shouldn't have to coerce to Str here, but see RT #55962
 $S0 = match
 retv.'push'($S0)
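
For readers without a Parrot build, the counting logic in the patch above can be modelled roughly as follows. This is a Python sketch, not the PIR itself: the function name comb and the use of Python's re module are assumptions made for illustration, and an omitted count is modelled as None in place of the has_count flag.

```python
import re

def comb(text, pattern, count=None):
    """Return successive non-overlapping matches, at most `count` of them."""
    regex = re.compile(pattern)
    results = []
    for match in regex.finditer(text):
        if count is not None:
            # Mirrors the patch: decrement first, stop once the budget is spent.
            count -= 1
            if count < 0:
                break
        results.append(match.group(0))
    return results

hair = "Th3r3 4r3 s0m3 numb3rs 1n th1s str1ng"
print("".join(comb(hair, r"\d+")))       # 3343033111
print("".join(comb(hair, r"\d+", -10)))  # (empty line)
print("".join(comb(hair, r"\d+", 0)))    # (empty line)
print("".join(comb(hair, r"\d+", 3)))    # 334
print("".join(comb(hair, r"\d+", 9)))    # 334303311
```

Because the decrement happens before the push, a zero or negative count yields an empty list, which matches the blank lines in the output shown above.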


Questions about :multi in Method Signatures

2008-09-17 Thread Chris Davaz
I am confused about how we should set up method signatures:

Let's take a look at a line in any-str.pir:

 46 .sub 'comb' :method :multi(_)
 47 .param pmc regex
 48 .param int count:optional
 49 .param int has_count:opt_flag

As you can see we have one parameter specified in :multi which is _ (any
type). However we also have two .param lines, 47 and 48. So here are some
questions:

I noticed 'self' is implicitly defined, however does 'self' eat up a
parameter? Should we always have one parameter in :multi specified for the
object the method is running on?

Also, I played around with :multi by putting in different things. With the
above method here is what I tried and the result

:multi(_) - works
:multi(_, _) works
:multi(_,_,_) doesn't work
:multi(_,Integer) doesn't work
:mult(Sub) doesn't work

Some clarification surrounding the use of :multi would help a lot.

Best Regards,
-Chris Davaz


Re: Questions about :multi in Method Signatures

2008-09-17 Thread Chris Davaz
That's a great response, thanks. Clears things up. One question: should we
always be using _ for the invocant, or should we try to restrict it?

On Wed, Sep 17, 2008 at 10:52 PM, Patrick R. Michaud <[EMAIL PROTECTED]> wrote:

> On Wed, Sep 17, 2008 at 08:37:36PM +0800, Chris Davaz wrote:
> > I am confused about how we should setup method signatures:
> >
> > Let's take a look at a line in any-str.pir:
> >
> >  46 .sub 'comb' :method :multi(_)
> >  47 .param pmc regex
> >  48 .param int count:optional
> >  49 .param int has_count:opt_flag
> >
> > As you can see we have one parameter specified in :multi which is _ (any
> > type). However we also have two .param lines, 47 and 48. So here are some
> > questions:
> >
> > I noticed 'self' is implicitly defined, however does 'self' eat up a
> > parameter? Should we always have one parameter in :multi specified for
> the
> > object the method is running on?
>
> Yes, the first argument of the :multi refers to the invocant.
>
> > Also, I played around with :multi by putting in different things. With
> the
> > above method here is what I tried and the result
> >
> > :multi(_) - works
>
> Restricts the sub to being invoked by callers supplying
> at least one argument (in this case, the argument is the
> invocant, since the sub is declared as :method).
>
> > :multi(_, _) works
>
> This says that the sub requires at least two arguments (of any
> type).  The first will go into 'self', the second into 'regex'.
>
> > :multi(_,_,_) doesn't work
>
> This says that the sub requires at least three arguments (of
> any type).  This doesn't really match the sub definition though,
> which has an invocant, a required argument, and an optional argument.
>
> > :multi(_,Integer) doesn't work
>
> This says the invocant may be of any type, and the first argument
> must be an Integer (or a subclass of Integer).  Probably not what
> we want given that the first parameter of the sub is 'regex'.
>
>
> > :mult(Sub) doesn't work
>
> This says that the sub can be invoked only on Sub invocants.
>
> > Some clarification surrounding the use of :multi would help a lot.
>
> Hope the above helps.  I don't know where :multi is documented in
> Parrot itself; the pdd27 file doesn't provide much detail.
>
> Pm
>
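
The rules described in the reply above can be modelled with a small Python predicate. This is only a sketch of the explanation as given (first slot is the invocant, each slot is a minimum requirement); the names ANY, Sub and signature_accepts are hypothetical and do not correspond to any Parrot API.

```python
ANY = object()  # stands in for the `_` wildcard in :multi(...)

class Sub:
    """Hypothetical stand-in for a compiled regex, which is a Sub in Parrot."""

def signature_accepts(multi_sig, invocant, *args):
    """Return True if a method call satisfies the given :multi signature.

    The first slot of multi_sig constrains the invocant; the remaining
    slots constrain the leading positional arguments.
    """
    call_values = (invocant,) + args
    if len(call_values) < len(multi_sig):
        return False  # the signature demands more arguments than were passed
    return all(
        wanted is ANY or isinstance(value, wanted)
        for wanted, value in zip(multi_sig, call_values)
    )

# Model of "some string".comb(regex): one invocant plus one argument.
regex = Sub()
print(signature_accepts((ANY,), "abc", regex))           # True
print(signature_accepts((ANY, ANY), "abc", regex))       # True
print(signature_accepts((ANY, ANY, ANY), "abc", regex))  # False
print(signature_accepts((ANY, int), "abc", regex))       # False
print(signature_accepts((Sub,), "abc", regex))           # False
```

The five results line up with the "works" / "doesn't work" list in the original question when comb is called with a single regex argument.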


[perl #58970] Initial implementation of Str.split(Regex)

2008-09-17 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #58970]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=58970


I say "initial" because it didn't pass one of my tests. This might be due to
regular expressions not being fully implemented, so if this is the case
please let me know. Once I know, I'll fix my code if need be, write an
appropriate test case, and add it to the pugs repo.

my Str $test = "theXbiggestXbangXforXtheXbuck";
my List $list = $test.split(/X/);
print $list.join("\n");
$test = "tagstripping";
# Expect the print to give us "tag stripping"; however, it just yields some
# whitespace, not sure why
$list = $test.split(/\<\/?.*?\>/);
print $list.join(" ");
Index: src/classes/Str.pir
===
--- src/classes/Str.pir	(revision 31205)
+++ src/classes/Str.pir	(working copy)
@@ -46,7 +46,7 @@
 .return(retv)
 .end
 
-.sub 'split' :method :multi('String')
+.sub 'split' :method :multi(_, 'String')
 .param string delim
 .local string objst
 .local pmc pieces
@@ -76,6 +76,50 @@
 .return(retv)
 .end
 
+# split a string on a regex
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex 
+.local pmc match
+.local pmc tmpstr
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+retv = new 'List'
+
+match = regex.'ACCEPTS'(self)
+unless match goto done
+
+start_pos = 0
+end_pos = match.'from'()
+
+  loop:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+start_pos = match.'to'()
+
+match.'next'()
+
+end_pos = match.'from'()
+end_pos -= start_pos
+
+$S1 = match.'text'()
+if $S1 == '' goto last
+goto loop
+
+  last:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+  done:
+.return(retv)
+.end
+
 .sub lc :method
 .local string tmps
 .local pmc retv


[perl #58970] Initial implementation of Str.split(Regex)

2008-09-17 Thread Chris Davaz
Ok, here it is without the change to "split on a string", and the test
passes. Please apply this one, and in the meantime I will see how we can get
the method signature right for split on a string without breaking reverse.


On Thu, Sep 18, 2008 at 1:57 AM, Moritz Lenz <[EMAIL PROTECTED]> wrote:

> Chris Davaz (via RT) wrote:
> > # New Ticket Created by  "Chris Davaz"
> > # Please include the string:  [perl #58970]
> > # in the subject line of all future correspondence about this issue.
> > # http://rt.perl.org/rt3/Ticket/Display.html?id=58970
> >
> >
> > I say "initial" because it didn't pass one of my tests.
>
> More importantly it introduces a regression in t/spec/S29-list/reverse.t
> (Str.reverse is implemented in terms of split-into-characters -> list
> reverse -> join; the split failed with your patch, thus causing a test
> failure in 'make spectest_regression')
>
> > This might be due to
> > regular expressions not being fully implemented, so if this is the case
> > please let me know.
>
> That's unlikely (but not impossible, of course), the regular expression
> engine is quite good and well-tested. I'll take a look into it later.
>
> Anyway, nice patch, and if Str.reverse is fixed it'll certainly be applied.
>
> Moritz
>
> --
> Moritz Lenz
> http://moritz.faui2k3.org/ |  http://perl-6.de/
>
Index: src/classes/Str.pir
===
--- src/classes/Str.pir	(revision 31220)
+++ src/classes/Str.pir	(working copy)
@@ -76,6 +76,50 @@
 .return(retv)
 .end
 
+# split a string on a regex
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex 
+.local pmc match
+.local pmc tmpstr
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+retv = new 'List'
+
+match = regex.'ACCEPTS'(self)
+unless match goto done
+
+start_pos = 0
+end_pos = match.'from'()
+
+  loop:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+start_pos = match.'to'()
+
+match.'next'()
+
+end_pos = match.'from'()
+end_pos -= start_pos
+
+$S1 = match.'text'()
+if $S1 == '' goto last
+goto loop
+
+  last:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+  done:
+.return(retv)
+.end
+
 .sub lc :method
 .local string tmps
 .local pmc retv


Re: [perl #58970] Initial implementation of Str.split(Regex)

2008-09-18 Thread Chris Davaz
hmm I see I'll work it out ;-) Thanks

On Thu, Sep 18, 2008 at 3:33 PM, Moritz Lenz <[EMAIL PROTECTED]> wrote:
>
>> Chris Davaz wrote:
>> > Ok, here it is without the change to "split on a string", and the test
>> > passes.
>>
>> Yes, but on my machine now t/spec/S04-statements/gather.t produces a
>> segmentation fault. I'll have to investigate if that is related to your
>> patch or not.
>>
>> Also split behaves a bit strangely here:
>>
>>  > say 'ab23d4f5'.split(/\d+/).perl
>>  ["ab", "", "", "d", "f", ""]
>>  > say 'ab23d4f5'.split(/\d/).perl
>>  ["ab", "", "d", "f", ""]
>>
>> (There are a few other oddities like the behaviour with a zero-width
>> match, but that's only a minor issue).
>>
>> > Please apply this one and in the meantime I will see how we can
>> > get the method signature right for split on a string + not break
>> > reverse.
>>
>> --
>> Moritz Lenz
>> http://moritz.faui2k3.org/ |  http://perl-6.de/
>>
>
>


Expected behavior of match?

2008-09-18 Thread Chris Davaz
The attached split.diff file is just for demonstration, not a patch
submittal.

I made a method on Str called "match" that returns a List of all matches:

# returns all matches on a regex
.sub 'match' :method :multi(_, 'Sub')
.param pmc regex
.local pmc match
.local pmc tmpstr
.local pmc retv

retv = new 'List'
match = regex.'ACCEPTS'(self)

loop:
  $S0 = match.'text'()
  if $S0 == '' goto done
  retv.'push'($S0)
  match.'next'()
  goto loop

done:
  .return(retv)
.end

However, running the following code:
say "ab1cd12ef123gh".match(/\d+/).perl;

We get:
["1", "12", "1", "2", "123", "12", "1", "23", "2", "3"]

As you can see match is returning what seems to be both greedy *and*
non-greedy matches (and everything in between). Is this correct?
Index: src/classes/Str.pir
===
--- src/classes/Str.pir	(revision 31220)
+++ src/classes/Str.pir	(working copy)
@@ -76,6 +76,78 @@
 .return(retv)
 .end
 
+# returns all matches on a regex
+.sub 'match' :method :multi(_, 'Sub')
+.param pmc regex
+.local pmc match
+.local pmc tmpstr
+.local pmc retv
+
+retv = new 'List'
+match = regex.'ACCEPTS'(self)
+
+loop:
+  $S0 = match.'text'()
+  if $S0 == '' goto done
+  retv.'push'($S0)
+  match.'next'()
+  goto loop
+
+done:
+  .return(retv)
+.end
+
+# split a string on a regex
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex 
+.local pmc match
+.local pmc tmpstr
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+retv = new 'List'
+
+match = regex.'ACCEPTS'(self)
+unless match goto done
+
+start_pos = 0
+end_pos = match.'from'()
+
+  loop:
+print "start_pos = "
+print start_pos
+print "; end_pos = "
+say end_pos
+$S7 = match.'text'()
+say $S7
+
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+start_pos = match.'to'()
+
+match.'next'()
+
+end_pos = match.'from'()
+end_pos -= start_pos
+
+$S1 = match.'text'()
+if $S1 == '' goto last
+goto loop
+
+  last:
+tmpstr = new 'Perl6Str'
+$S0 = substr self, start_pos, end_pos
+tmpstr = $S0
+retv.'push'(tmpstr)
+
+  done:
+.return(retv)
+.end
+
 .sub lc :method
 .local string tmps
 .local pmc retv


split.p6
Description: Binary data
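
One reading of the output above, offered as an assumption rather than a confirmed diagnosis, is that each call to 'next' resumes backtracking inside the same match attempt instead of starting a fresh match after the previous one. The Python sketch below contrasts the two behaviours; both function names are hypothetical, and the second function only enumerates the substrings a backtracking engine could yield, it does not call any Parrot or PGE code.

```python
import re

def global_matches(text):
    """Non-overlapping digit runs: the usual meaning of a global match."""
    return re.findall(r"\d+", text)

def all_backtracking_matches(text):
    """Every substring a digit-run pattern could match.

    For each start position inside a run of digits, the longest match
    comes first and progressively shorter ones follow, which is the
    order a greedy quantifier gives up characters while backtracking.
    """
    results = []
    for run in re.finditer(r"\d+", text):
        digits = run.group(0)
        for start in range(len(digits)):
            for end in range(len(digits), start, -1):
                results.append(digits[start:end])
    return results

print(global_matches("ab1cd12ef123gh"))
# ['1', '12', '123']
print(all_backtracking_matches("ab1cd12ef123gh"))
# ['1', '12', '1', '2', '123', '12', '1', '23', '2', '3']
```

The second list reproduces the reported output exactly, which is consistent with the backtracking reading; the first list is what a caller would normally expect from a method that returns all matches.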


S05 and S29 may conflict on behavior of $string.match(/pat/)

2008-09-18 Thread Chris Davaz
I'm trying to pin down what $string.match(/pat/) should be returning.

From S05:

Under "Return values from match objects"
"A match always returns a Match object..."

From S29:

Under the definition of Str.comb

Saying

$string.comb(/pat/, $n)

is equivalent to

$string.match(rx:global:x(0..$n):c/pat/)

[ ...and later... ]

"If there are captures in the pattern, a list of Match objects (one
per match) is returned instead of strings."

Which implies that $string.match(/pat/) should indeed return a List of
Str and $string.match(/pat_with_groups/) should return a List of
Match.

I expected the S29 definition when first approaching $string.match; I
feel it is more intuitive than what happens with S05. Could someone
clarify what the behavior should be?

Best Regards,
-Chris Davaz
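
To make the S29 wording concrete, here is a small Python model of "strings when the pattern has no captures, match objects when it does". It is an illustration of the quoted sentence only: high_level_match is a hypothetical name, and Python's re.Match objects stand in for Perl 6 Match objects.

```python
import re

def high_level_match(text, pattern):
    """Return all matches: plain strings, or match objects if the pattern captures."""
    regex = re.compile(pattern)
    results = []
    for match in regex.finditer(text):
        # regex.groups is the number of capture groups in the pattern.
        results.append(match if regex.groups else match.group(0))
    return results

print(high_level_match("a1b22", r"\d+"))
# ['1', '22']
pairs = high_level_match("a12b34", r"(\d)(\d)")
print([m.groups() for m in pairs])
# [('1', '2'), ('3', '4')]
```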


Re: [perl #58970] Initial implementation of Str.split(Regex)

2008-09-18 Thread Chris Davaz
Here is the new patch for split on a regex. Testing it with:

say "theXbiggestXbangXforXtheXbuck".split(/X/).perl;
say "ab1cd12ef123gh".split(/\d+/).perl;
say "charsoup".split(/\<\/?.*?\>/).perl;

We get the following output:

["the", "biggest", "bang", "for", "the", "buck"]
["ab", "cd", "ef", "gh"]
["", "char", "", "soup", ""]

I'll upload a test to pugs later.

On Thu, Sep 18, 2008 at 3:33 PM, Moritz Lenz
<[EMAIL PROTECTED]> wrote:
> Chris Davaz wrote:
>> Ok, here it is without the change to "split on a string", and the test
>> passes.
>
> Yes, but on my machine now t/spec/S04-statements/gather.t produces a
> segmentation fault. I'll have to investigate if that is related to your
> patch or not.
>
> Also split behaves a bit strangely here:
>
>  > say 'ab23d4f5'.split(/\d+/).perl
>  ["ab", "", "", "d", "f", ""]
>  > say 'ab23d4f5'.split(/\d/).perl
>  ["ab", "", "d", "f", ""]
>
> (There are a few other oddities like the behaviour with a zero-width
> match, but that's only a minor issue).
>
>> Please apply this one and in the meantime I will see how we can
>> get the method signature right for split on a string + not break reverse.
>
> --
> Moritz Lenz
> http://moritz.faui2k3.org/ |  http://perl-6.de/
>
Index: src/classes/Str.pir
===
--- src/classes/Str.pir	(revision 31220)
+++ src/classes/Str.pir	(working copy)
@@ -76,6 +76,35 @@
 .return(retv)
 .end
 
+# split a string on a regex
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex 
+.local pmc match
+.local pmc tmpstr
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+$S0 = self
+retv = new 'List'
+start_pos = 0
+
+  loop:
+match = regex($S0, 'continue' => start_pos)
+end_pos = match.'from'()
+end_pos -= start_pos
+tmpstr = new 'Perl6Str'
+$S1 = substr $S0, start_pos, end_pos
+tmpstr = $S1
+retv.'push'(tmpstr)
+unless match goto done
+start_pos = match.'to'()
+goto loop
+
+  done:
+.return(retv)
+.end
+
 .sub lc :method
 .local string tmps
 .local pmc retv
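
The loop in this patch (search again from where the previous match ended, push the text in between, push the remainder when no further match is found) can be sketched in Python as below. This is a model under assumptions, not a translation: split_on_regex is a hypothetical name, Python's re module replaces the Parrot regex call with its 'continue' option, and zero-width matches are rejected outright because the thread leaves their handling open.

```python
import re

def split_on_regex(text, pattern):
    """Split text on every non-overlapping match of pattern."""
    regex = re.compile(pattern)
    pieces = []
    start = 0
    while True:
        match = regex.search(text, start)
        if match is None:
            # No further delimiter: the rest of the string is the last piece.
            pieces.append(text[start:])
            return pieces
        if match.end() == match.start():
            # Assumption: this sketch does not define zero-width behaviour.
            raise ValueError("zero-width match is not handled in this sketch")
        pieces.append(text[start:match.start()])
        start = match.end()

print(split_on_regex("theXbiggestXbangXforXtheXbuck", "X"))
# ['the', 'biggest', 'bang', 'for', 'the', 'buck']
print(split_on_regex("ab1cd12ef123gh", r"\d+"))
# ['ab', 'cd', 'ef', 'gh']
print(split_on_regex("ab23d4f5", r"\d+"))
# ['ab', 'd', 'f', '']
```

The first two results agree with the output quoted in this message. The third shows what the earlier problem case would give under this scheme, with a trailing empty field because the string ends in a delimiter.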


Made some fixes to split on a regex and moved from Str.pir to any-str.pir

2008-09-18 Thread Chris Davaz
Got rid of "tempstr" and now returns the entire string on a non-match.
Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir	(revision 31220)
+++ src/builtins/any-str.pir	(working copy)
@@ -71,7 +71,42 @@
 .return(retv)
 .end
 
+=item split()
 
+Splits something on a regular expresion
+
+=cut
+
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex 
+.local pmc match
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+$S0 = self
+retv = new 'List'
+start_pos = 0
+
+match = regex($S0)
+if match goto loop
+retv.'push'($S0)
+goto done
+
+  loop:
+match = regex($S0, 'continue' => start_pos)
+end_pos = match.'from'()
+end_pos -= start_pos
+$S1 = substr $S0, start_pos, end_pos
+retv.'push'($S1)
+unless match goto done
+start_pos = match.'to'()
+goto loop
+
+  done:
+.return(retv)
+.end
+
 =item index()
 
 =cut


split.p6
Description: Binary data


Re: Made some fixes to split on a regex and moved from Str.pir to any-str.pir

2008-09-18 Thread Chris Davaz
Sorry forgot to put the method in alphabetical order, here you go.

On Fri, Sep 19, 2008 at 12:36 AM, Chris Davaz <[EMAIL PROTECTED]> wrote:
> Got rid of "tempstr" and now returns the entire string on a non-match.
>
Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir	(revision 31220)
+++ src/builtins/any-str.pir	(working copy)
@@ -71,7 +71,6 @@
 .return(retv)
 .end
 
-
 =item index()
 
 =cut
@@ -173,6 +172,42 @@
 .return ($P0)
 .end
 
+=item split(/PATTERN/)
+
+Splits something on a regular expresion
+
+=cut
+
+.sub 'split' :method :multi(_, 'Sub')
+.param pmc regex 
+.local pmc match
+.local pmc retv
+.local int start_pos
+.local int end_pos
+
+$S0 = self
+retv = new 'List'
+start_pos = 0
+
+match = regex($S0)
+if match goto loop
+retv.'push'($S0)
+goto done
+
+  loop:
+match = regex($S0, 'continue' => start_pos)
+end_pos = match.'from'()
+end_pos -= start_pos
+$S1 = substr $S0, start_pos, end_pos
+retv.'push'($S1)
+unless match goto done
+start_pos = match.'to'()
+goto loop
+
+  done:
+.return(retv)
+.end
+
 =item substr()
 
 =cut


Re: S05 and S29 may conflict on behavior of $string.match(/pat/)

2008-09-18 Thread Chris Davaz
Thanks for clarifying; however, I'm still unsure what a Perl 6 user
should expect to get back from running $string.match(/pat/). This is
the "one high-level call to the .match method", yes? So it should be
returning a List of Str (or a List of Match in the case of capture
groups), is this correct? I ask because in the current Rakudo
implementation it returns the Match object (what I would expect from
the "one low-level run of the regex engine").

Best Regards,
-Chris

On Thu, Sep 18, 2008 at 11:52 PM, Larry Wall <[EMAIL PROTECTED]> wrote:
> On Thu, Sep 18, 2008 at 06:11:45PM +0800, Chris Davaz wrote:
> : I'm trying to pin down what $string.match(/pat/) should be returning.
> :
> : From S05:
> :
> : Under "Return values from match objects"
> : "A match always returns a Match object..."
> :
> : From S29:
> :
> : Under the definition of Str.comb
> :
> : Saying
> :
> : $string.comb(/pat/, $n)
> :
> : is equivalent to
> :
> : $string.match(rx:global:x(0..$n):c/pat/)
> :
> : [ ...and later... ]
> :
> : "If there are captures in the pattern, a list of Match objects (one
> : per match) is returned instead of strings."
> :
> : Which implies that $string.match(/pat/) should indeed return a List of
> : Str and $string.match(/pat_with_groups/) should return a List of
> : Match.
> :
> : I expected the S29 definition when first approaching $string.match I
> : feel it is more intuitive than what happens with S05. Could someone
> : clarify what the behavior should be?
>
> S05 is using a different definition of "match".  In S05 it means
> more like "one low-level run of the regex engine" rather than "one
> high-level call to the .match method".  In other words, the .match
> method can do multiple matches.
>
> Larry
>


[perl #59014] Made some fixes to split on a regex and moved from Str.pir to any-str.pir

2008-09-19 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #59014]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=59014


Got rid of "tempstr" and now returns the entire string on a non-match.

[perl #59016] Re: Made some fixes to split on a regex and moved from Str.pir to any-str.pir

2008-09-19 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #59016]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=59016


Sorry forgot to put the method in alphabetical order, here you go.

On Fri, Sep 19, 2008 at 12:36 AM, Chris Davaz <[EMAIL PROTECTED]> wrote:
> Got rid of "tempstr" and now returns the entire string on a non-match.
>


Patch for split(thing, delimiter) function

2008-09-19 Thread Chris Davaz
Moved the split function from Str.pir to any-str.pir and removed the
Perl6Str coercion.
Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir	(revision 31254)
+++ src/builtins/any-str.pir	(working copy)
@@ -172,6 +172,31 @@
 .return ($P0)
 .end
 
+=item split
+
+ our List multi Str::split ( Str $delimiter ,  Str $input = $+_, Int $limit = inf )
+ our List multi Str::split ( Rule $delimiter = /\s+/,  Str $input = $+_, Int $limit = inf )
+ our List multi Str::split ( Str $input :  Str $delimiter  , Int $limit = inf )
+ our List multi Str::split ( Str $input : Rule $delimiter  , Int $limit = inf )
+
+String delimiters must not be treated as rules but as constants.  The
+default is no longer S<' '> since that would be interpreted as a constant.
+P5's C<< split('S< >') >> will translate to C<.words> or some such.  Null trailing fields
+are no longer trimmed by default.  We might add some kind of :trim flag or
+introduce a trimlist function of some sort.
+
+B partial implementation only
+
+=cut
+
+.namespace[]
+.sub 'split' :multi(_,_)
+.param pmc sep
+.param pmc target
+.return target.'split'(sep)
+.end
+
+.namespace['Any']
 .sub 'split' :method :multi('String')
 .param string delim
 .local string objst
@@ -202,12 +227,6 @@
 .return(retv)
 .end
 
-=item split(/PATTERN/)
-
-Splits something on a regular expresion
-
-=cut
-
 .sub 'split' :method :multi(_, 'Sub')
 .param pmc regex
 .local pmc match
Index: src/classes/Str.pir
===
--- src/classes/Str.pir	(revision 31254)
+++ src/classes/Str.pir	(working copy)
@@ -318,39 +318,6 @@
 .return s.'capitalize'()
 .end
 
-
-=item split
-
- our List multi Str::split ( Str $delimiter ,  Str $input = $+_, Int $limit = inf )
- our List multi Str::split ( Rule $delimiter = /\s+/,  Str $input = $+_, Int $limit = inf )
- our List multi Str::split ( Str $input :  Str $delimiter  , Int $limit = inf )
- our List multi Str::split ( Str $input : Rule $delimiter  , Int $limit = inf )
-
-String delimiters must not be treated as rules but as constants.  The
-default is no longer S<' '> since that would be interpreted as a constant.
-P5's C<< split('S< >') >> will translate to C<.words> or some such.  Null trailing fields
-are no longer trimmed by default.  We might add some kind of :trim flag or
-introduce a trimlist function of some sort.
-
-B partial implementation only
-
-=cut
-
-.sub 'split'
-.param string sep
-.param string target
-.local pmc a, b
-
-a = new 'Perl6Str'
-b = new 'Perl6Str'
-
-a = target
-b = sep
-
-.return a.'split'(b)
-.end
-
-
 =item chop
 
  our Str method Str::chop ( Str  $string: )


[perl #59074] Patch for split(thing, delimiter) function

2008-09-19 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #59074]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=59074


Moved the split function from Str.pir to any-str.pir and removed the
Perl6Str coercion.


method signature issues

2008-09-19 Thread Chris Davaz
In any-str.pir we need to figure out how to change

.sub 'split' :method :multi('String')

into

.sub 'split' :method :multi(_, 'String')

since the former method signature is causing problems for me as I'm
trying to implement

.sub 'split' :method :multi(_, 'Sub')

with an additional optional argument (the limit of how many elements
to return in the list). While we still have the ".sub 'split' :method
:multi('String')" method and I try to run some Perl 6 code that uses
the newly added optional parameter to ".sub 'split' :method :multi(_,
'Sub')", I get the following error:

say "theXbiggestXbangXforXtheXbuck".split(/X/, 3).perl;
too many arguments passed (3) - 2 params expected
current instr.: 'parrot;Any;split' pc 10833 (src/gen_builtins.pir:6926)

Line 6926 of gen_builtins.pir is ".sub 'split' :method
:multi('String')", not the expected method ".sub 'split' :method
:multi(_, 'Sub')". And when we change ".sub 'split' :method
:multi('String')" to ".sub 'split' :method :multi(_, 'String')", I can't
even compile Perl 6. I get the following error:

No applicable methods.

current instr.: 'parrot;Perl6;Grammar;Actions;dec_number' pc 129924
(src/gen_actions.pir:11299)

From this point I'm not sure what's going on; any help would be
greatly appreciated.

Best Regards,
-Chris


method signature issues

2008-09-20 Thread Chris Davaz
If it is the case that :method and :multi are incompatible, I am a bit
surprised to see that in the Rakudo src directory:

$ grep -rHI ':method :multi' . | grep -v '.svn' | wc -l
94

On Sun, Sep 21, 2008 at 11:30 AM, chromatic <[EMAIL PROTECTED]> wrote:
> On Friday 19 September 2008 21:46:53 Chris Davaz wrote:
>
>> In any-str.pir we need to figure out how to change
>>
>> .sub 'split' :method :multi('String')
>>
>> into
>>
>> .sub 'split' :method :multi(_, 'String')
>>
>> since the former method signature is causing problems for me as I'm
>> trying to implement
>>
>> .sub 'split' :method :multi(_, 'Sub')
>>
>> with an additional optional argument (the limit of how many elements
>> to return in the list).
>
> :method and :multi are fundamentally incompatible; the former implies single
> dispatch while the latter explicitly expresses multiple dispatch.  The PIR
> compiler *could* unshift a type constraint onto the list of invocants
> when :multi and :method appear together, but I'm not sure that produces clear
> code.
>
> -- c
>


Re: method signature issues

2008-09-21 Thread Chris Davaz
Patrick,

Any thoughts on why I am getting the "No applicable methods" error as
described in the head of this thread?

On Sun, Sep 21, 2008 at 11:38 PM, Patrick R. Michaud <[EMAIL PROTECTED]> wrote:
> On Sat, Sep 20, 2008 at 11:05:34PM -0700, chromatic wrote:
>> On Saturday 20 September 2008 22:24:52 Chris Davaz wrote:
>>
>> > If it is the case that :method and :multi are incompatible, I am a bit
>> > surprised to see that in the Rakudo src directory:
>>
>> I said they're incompatible -- meant in terms of their semantics.  I didn't
>> say they don't work together in some cases with our current implementation.
>> (They probably shouldn't.)
>
> :method and :multi work just fine together -- at least within a single
> class.  In fact, the various compilers in PCT (PAST::Compiler and
> POST::Compiler) rely on multiple dispatch working properly for methods.
>
> Where the semantics get a little weird is when a subclass defines
> a multimethod that overloads multimethods of the same name in the
> parent class -- in this case the subclass multimethod completely
> hides the parent class multimethod.
>
> Pm
>


Re: method signature issues

2008-09-21 Thread Chris Davaz
Awesome Patrick, you totally nailed it ;-)

I'll be submitting a patch soon. Do you know if there is a Parrot bug
logged for the problem you described?

On Mon, Sep 22, 2008 at 12:27 AM, Patrick R. Michaud <[EMAIL PROTECTED]> wrote:
> On Sat, Sep 20, 2008 at 12:46:53PM +0800, Chris Davaz wrote:
>> In any-str.pir we need to figure out how to change
>> .sub 'split' :method :multi('String')
>> into
>> .sub 'split' :method :multi(_, 'String')
>> [...]
>
> ... let's back up a bit and look at what is really happening.
>
>> [...] And when we change ".sub 'split' :method :multi('String')"
>> to ".sub 'split' :method :multi(_, 'String')" I can't even compile
>> Perl 6. I get the following error:
>> No applicable methods.
>> current instr.: 'parrot;Perl6;Grammar;Actions;dec_number' pc 129924
>> (src/gen_actions.pir:11299)
>
> This isn't precise -- perl6.pbc compiles just fine.  What isn't
> compiling is Test.pm, because the dec_number method in actions.pm
> is using the 'split' builtin to get rid of underscores (added in r31225):
>
>method dec_number($/) {
>my $num := ~$/;
>$num := $num.split('_').join('');
>make PAST::Val.new( :value( $num ), :returns('Num'), :node( $/ ) );
>}
>
> Here we're calling a Perl 6 builtin function from NQP, and NQP
> doesn't convert string constants (such as '_') into String PMCs
> prior to calling the function -- it just generates a PIR method
> call directly.  Unfortunately, Parrot doesn't recognize a string
> constant as being the same as a 'String' for MMD purposes, and
> so it's unable to match the split method declared :multi(_, 'String').
> (This is arguably a Parrot bug, either in design or implementation.)
>
> The solution to the problem is to define the method to split
> on strings as :multi(_, _) -- i.e.:
>
>.namespace ['Any']
>.sub 'split' :method :multi(_, _)
>
> This means this method will be used whenever there's not a more
> specific multimethod available.  Furthermore, this is actually
> the correct semantics -- i.e., we expect the following to work
> even though $x is an Int and not a Str:
>
>my $x = 9;
>say 439123912.split($x).perl;   #  [ "43", "123", "12" ]
>
> So, try changing :multi('String')  to :multi(_,_)  and everything
> should work just fine (and we should get a few more passing tests
> to boot).
>
> Pm
>
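
A rough Python model of the dispatch problem and of the `:multi(_, _)` fix follows. It is a sketch under assumptions: `StringPMC` stands in for Parrot's 'String' PMC, a plain Python `str` plays the role of the native string constant that NQP passes, and `ANY` plays the role of `_`. The helper names (`matches`, `dispatch`, `split_any`) are invented for the example and do not exist in Parrot.

```python
class StringPMC:
    """Stand-in for a boxed 'String' PMC (hypothetical model)."""

    def __init__(self, value):
        self.value = value


ANY = object()  # plays the role of `_` in a :multi signature


def matches(signature, args):
    """True if every argument satisfies the corresponding signature slot."""
    if len(signature) != len(args):
        return False
    return all(want is ANY or isinstance(arg, want)
               for want, arg in zip(signature, args))


def dispatch(candidates, *args):
    """Pick the applicable candidate with the fewest wildcard slots."""
    applicable = [(sig, fn) for sig, fn in candidates if matches(sig, args)]
    if not applicable:
        raise LookupError("No applicable methods.")
    _, fn = min(applicable, key=lambda c: sum(slot is ANY for slot in c[0]))
    return fn(*args)


def split_any(invocant, delim):
    """Stringify both operands, then split."""
    text = delim.value if isinstance(delim, StringPMC) else str(delim)
    return str(invocant).split(text)


strict = [((ANY, StringPMC), split_any)]   # like :multi(_, 'String')
relaxed = [((ANY, ANY), split_any)]        # like :multi(_, _)

print(dispatch(relaxed, 439123912, 9))     # Int invocant, Int delimiter
print(dispatch(strict, "a_b", StringPMC("_")))
try:
    dispatch(strict, "a_b", "_")           # native string, not a StringPMC
except LookupError as err:
    print(err)
```

In this model the strict signature rejects the unboxed delimiter, while the wildcard signature accepts any argument and leaves the coercion to the method body.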


Some fixes to split methods

2008-09-21 Thread Chris Davaz
I have implemented the limit parameter on both Str.split(String,
Integer) and Str.split(Regex, Integer). In doing so I had to change
the method signature of Str.split(String) to ".sub 'split' :method
:multi(_, _)" from ".sub 'split' :method :multi('String')". The former
method signature is the correct one anyway as the latter just
restricts the invoker to being a 'String' and doesn't say anything
about the argument type.

All but one of the tests in split.p6 "pass" (verified only by looking
at the output). The one that doesn't pass is:

say 102030405.split(0).perl;

which produces:

["1.", "2", "3e+", "8"]

Note that this bug is not introduced by the patch, it was only
discovered because now we can do Int.split(Int).
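
One plausible explanation for that output, offered here as an assumption rather than something confirmed in this thread, is that the integer is stringified with a C-style `%g` format (six significant digits) before the split happens. The short Python sketch below reproduces the reported pieces that way; the variable names are illustrative only.

```python
number = 102030405

# '%g' keeps six significant digits and drops trailing zeros,
# so the digits "0405" are lost before any splitting takes place.
as_text = "%g" % number
print(as_text)

# Splitting the stringified form on "0" yields the pieces from the report.
pieces = as_text.split("0")
print(pieces)
```

If that guess is right, the fault lies in number-to-string conversion rather than in split itself.
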
Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir	(revision 31332)
+++ src/builtins/any-str.pir	(working copy)
@@ -197,8 +197,10 @@
 .end
 
 .namespace['Any']
-.sub 'split' :method :multi('String')
+.sub 'split' :method :multi(_, _)
 .param string delim
+.param int count:optional
+.param int has_count:opt_flag
 .local string objst
 .local pmc pieces
 .local pmc tmps
@@ -214,6 +216,10 @@
 len = pieces
 i = 0
   loop:
+unless has_count goto skip_count
+dec count
+if count < 0 goto done
+  skip_count:
 if i == len goto done
 
 tmps = new 'Perl6Str'
@@ -229,6 +235,8 @@
 
 .sub 'split' :method :multi(_, 'Sub')
 .param pmc regex
+.param int count:optional
+.param int has_count:opt_flag
 .local pmc match
 .local pmc retv
 .local int start_pos
@@ -244,6 +252,10 @@
 goto done
 
   loop:
+unless has_count goto skip_count
+dec count
+if count < 0 goto done
+  skip_count:
 match = regex($S0, 'continue' => start_pos)
 end_pos = match.'from'()
 end_pos -= start_pos


split.p6
Description: Binary data
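
Read on their own, the lines added to both loops implement the same counter: when a count was passed, decrement it at the top of each iteration and stop once it drops below zero. The Python sketch below models only the string variant and only what the visible hunks show; the rest of the PIR loop is not in the diff, so treat the function (`split_with_limit`, a made-up name) as an approximation.

```python
def split_with_limit(text, delim, count=None):
    """Model of the patched loop: emit at most `count` pieces."""
    pieces = text.split(delim)
    result = []
    for piece in pieces:
        if count is not None:      # 'unless has_count goto skip_count'
            count -= 1             # 'dec count'
            if count < 0:          # 'if count < 0 goto done'
                break
        result.append(piece)
    return result


print(split_with_limit("a,b,c,d", ","))       # no limit: every piece
print(split_with_limit("a,b,c,d", ",", 2))    # at most two pieces
print(split_with_limit("a,b,c,d", ",", 0))    # nothing
```

Modelled this way, the limit truncates the list. It does not keep the unsplit remainder in the last element the way Perl 5's split does.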


[perl #59184] Some fixes to split methods

2008-09-22 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #59184]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=59184 >




more fixes to split

2008-09-22 Thread Chris Davaz
please see http://rt.perl.org/rt3/Ticket/Display.html?id=59184 for
more info and for the patch


[perl #59240] Automate publishing of docs/*

2008-09-23 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #59240]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=59240 >


I suggest we automate the publishing of everything under docs/* and
putting it under parrotcode.org/docs in HTML format for easy access.
This would probably help productivity and maybe even attract new
developers.

Just my two cents...


Re: [perl #59240] Automate publishing of docs/*

2008-09-24 Thread Chris Davaz
Ahh, cool I didn't even know we had parrot.org. Publishing docs/book/*
would be nice.

On Tue, Sep 23, 2008 at 10:27 PM, Will Coleda via RT
<[EMAIL PROTECTED]> wrote:
> On Tue, Sep 23, 2008 at 9:04 AM, via RT Chris Davaz
> <[EMAIL PROTECTED]> wrote:
>> # New Ticket Created by  "Chris Davaz"
>> # Please include the string:  [perl #59240]
>> # in the subject line of all future correspondence about this issue.
>> # http://rt.perl.org/rt3/Ticket/Display.html?id=59240 >
>>
>>
>> I suggest we automate the publishing of everything under docs/* and
>> putting it under parrotcode.org/docs in HTML format for easy access.
>> This would probably help productivity and maybe even attract new
>> developers.
>>
>> Just my two cents...
>>
>
> we're moving to http://www.parrot.org/ ; I think Allison has said
> we're going to automate this publishing process.
>
> In the meantime, most of the docs on parrotcode.org update
> automatically once a link is put up; we can, in the meantime, add some
> more if there are some particular docs you'd like to see.
>
>
> --
> Will "Coke" Coleda
>
>
>


Re: Split with negative limits, and other weirdnesses

2008-09-25 Thread Chris Davaz
If someone wants to make the final word on what the behavior should be
I can go ahead and implement it.

On Tue, Sep 23, 2008 at 11:41 PM, Jonathan Scott Duff
<[EMAIL PROTECTED]> wrote:
> On Tue, Sep 23, 2008 at 9:38 AM, TSa <[EMAIL PROTECTED]> wrote:
>
>> HaloO,
>> Moritz Lenz wrote:
>>
>>> In Perl 5 a negative limit means "unlimited", which we don't have to do
>>> because we have the Whatever star.
>>>
>>
>> I like the notion of negative numbers as the other end of infinity.
>> Where infinity here is the length of the split list which can be
>> infinite if split is called on a file handle. So a negative number
>> could be the number of splits to skip from the front of the list.
>> And limits of the form '*-5' would deliver the five last splits.
>>
>
> As another data point, this is the first thing I thought of when I read the
> email regarding negative limits.  But then I thought "we're trying to get
> away from so much implicit magic". And I'm not sure the failure mode is loud
> enough when the skip-from-the-front semantics /aren't/ what you want (e.g.,
> when the limit parameter is variable-ish)
>
>
>  A limit of 0 is basically ignored.
>>
>> Here are a few solutions I could think of
>>  1) A limit of 0 returns the empty list (you want zero items, you get them)
>>
>
> I think this is a nice degenerate case.
>>
>
> Me too.
>
>
>  2) A limit of 0 fail()s
>>
>
> This is a bit too drastic.
>
>
> Indeed.
>
>
>
>  3) non-positive $limit arguments are rejected by the signature (Int
>> where { $_ > 0 })
>>
>
> I think that documents and enforces the common case best. But I would
>> include zero and use a name like UInt that has other uses as well. Are
>> there pragmas that turn signature failures into undef return values?
>>
>>
>> Regards, TSa.
>> --
>>
>> "The unavoidable price of reliability is simplicity" -- C.A.R. Hoare
>> "Simplicity does not precede complexity, but follows it." -- A.J. Perlis
>> 1 + 2 + 3 + 4 + ... = -1/12  -- Srinivasa Ramanujan
>>
>
>
> my two cents,
>
> -Scott
>
> --
> Jonathan Scott Duff
> [EMAIL PROTECTED]
>


Re: [perl #59184] Some fixes to split methods

2008-09-25 Thread Chris Davaz
Nope, that last one was it. Still waiting on a decision for how edge
cases on limit are to be handled.

On Thu, Sep 25, 2008 at 10:49 PM, Moritz Lenz via RT
<[EMAIL PROTECTED]> wrote:
> On Mon Sep 22 22:55:29 2008, cdavaz wrote:
>> Grr.. wrong again sorry!! Forgot to remove the handle_count label.
>> Please let me know if you see any problems.
>
> Sorry, I've lost track of your patches (probably due to our parallel
> mailing list/RT/IRC conversations); are there still split() patches open
> which should be applied?
>


[perl #59366] small fix to pod doc and interactive prompt

2008-09-26 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #59366]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=59366 >


Fixed a bug in the doc where the method name and doc were mismatched.

Fixed a small bug where, even if the user sets the prompt, a default
prompt '> ' is still printed. Changed it so that a default prompt is
printed only if the user has not set a prompt (or has set an empty
prompt).
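
The intended prompt selection can be stated in a few lines. This Python sketch is a model only; `effective_prompt` and `DEFAULT_PROMPT` are names invented for the illustration, and an unset prompt is represented here by `None`.

```python
DEFAULT_PROMPT = "> "


def effective_prompt(user_prompt):
    """Return the user's prompt, or the default if it is unset or empty."""
    if user_prompt:            # false for both None and ""
        return user_prompt
    return DEFAULT_PROMPT


print(repr(effective_prompt("perl6> ")))
print(repr(effective_prompt("")))
print(repr(effective_prompt(None)))
```

Exactly one prompt is chosen per input line, which is why the patch also passes an empty string to readline instead of a second '> '.
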
Index: compilers/pct/src/PCT/HLLCompiler.pir
===
--- compilers/pct/src/PCT/HLLCompiler.pir	(revision 31432)
+++ compilers/pct/src/PCT/HLLCompiler.pir	(working copy)
@@ -135,16 +135,16 @@
 
 =item commandline_banner([string value])
 
+Set the string in $S0 as a commandline banner on the compiler in $P0.  The
+banner is the first text that is shown when the com‐ piler is started in
+interactive mode. This can be used for a copyright notice or other information.
+
+=item commandline_prompt([string value])
+
 Set the string in $S0 as a commandline prompt on the compiler in $P0.  The
 prompt is the text that is shown on the commandline before a command is entered
 when the compiler is started in interactive mode.
 
-=item commandline_prompt([string value])
-
-Set the string in $S0 as a commandline banner on the compiler in $P0.  The
-banner is the first text that is shown when the com‐ piler is started in
-interactive mode. This can be used for a copyright notice or other information.
-
 =cut
 
 .sub 'stages' :method
@@ -511,6 +511,7 @@
 if encoding == 'fixed_8' goto interactive_loop
 unless encoding goto interactive_loop
 push stdin, encoding
+
   interactive_loop:
 .local pmc code
 unless stdin goto interactive_end
@@ -518,11 +519,14 @@
 ##  libraries aren't present (RT #41103)
 
 # for each input line, print the prompt
-$P0 = self.'commandline_prompt'()
-printerr $P0
+$S3 = self.'commandline_prompt'()
+ne_str $S3, '', prompt_set
+$S3 = '> '
+  prompt_set:
+printerr $S3
 
 if has_readline < 0 goto no_readline
-code = stdin.'readline'('> ')
+code = stdin.'readline'('')
 if null code goto interactive_end
 concat code, "\n"
 goto have_code
Index: config/gen/languages.pm
===
--- config/gen/languages.pm	(revision 31432)
+++ config/gen/languages.pm	(working copy)
@@ -54,6 +54,7 @@
 unlambda urm
 WMLScript
 Zcode
+monkey
 };
 $data{languages_source} = q{config/gen/makefiles/languages.in};
 return \%data;


Re: Split with negative limits, and other weirdnesses

2008-09-28 Thread Chris Davaz
Ok, so 0 returns the empty list and -1 violates the signature? In PIR
can we have such signatures that put a constraint on the range of
values for a given parameter?

On Sun, Sep 28, 2008 at 7:25 PM, Carl Mäsak <[EMAIL PROTECTED]> wrote:
> Jason (>):
>> It makes sense to me to go with option 1; you get what you ask for. It also
>> makes sense to me to not use magical implied numbers, such as negatives,
>> to accomplish things that either ranges or whatever star can accomplish.
>
> Aye, agreement. There's a whole lot of consensus already... reading
> through the discussion once more, I don't find anyone saying anything
> contradicting the above summary.
>
> Chris, I'm not in a position to provide a final word, but it seems
> very possible already to use what has already been said here as a
> basis for an implementation.
>
> // Carl
>


[perl #59642] Return the empty list on non-positive LIMIT

2008-10-06 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #59642]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=59642 >


Here is a small patch to make split return the empty list on
non-positive limit arguments instead of returning the string split
entirely. Moritz, I think it was you who suggested that negative
arguments be 'rejected by the signature' but I am not sure how to do
this in PIR.
Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir	(revision 31687)
+++ src/builtins/any-str.pir	(working copy)
@@ -411,8 +411,7 @@
 
 # per Perl 5's negative LIMIT behavior
 unless has_count goto positive_count
-unless count < 1 goto positive_count
-has_count = 0
+if count < 1 goto done
 
   positive_count:
 match = regex($S0)
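
The behavioural change in this hunk can be contrasted in a small Python model. Only the handling of a non-positive count is taken from the diff. How a positive count is applied is not visible here, so the slicing below is an assumption, and both function names are made up.

```python
def limit_before_patch(pieces, count=None):
    """Old behaviour: a count below 1 switched the limit off entirely."""
    if count is not None and count < 1:
        count = None                       # 'has_count = 0'
    return pieces if count is None else pieces[:count]


def limit_after_patch(pieces, count=None):
    """New behaviour: a count below 1 yields the empty list."""
    if count is not None and count < 1:
        return []                          # 'if count < 1 goto done'
    return pieces if count is None else pieces[:count]


parts = ["a", "b", "c"]
print(limit_before_patch(parts, 0), limit_after_patch(parts, 0))
print(limit_before_patch(parts, -1), limit_after_patch(parts, -1))
```

Rejecting negative values outright, as suggested on the list, would need an extra check that raises an error instead of returning the empty list.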


[perl #60228] fix for split on zero-width

2008-10-30 Thread Chris Davaz
# New Ticket Created by  "Chris Davaz" 
# Please include the string:  [perl #60228]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=60228 >


This is a fix for splitting strings on regular expressions that
contain zero-width matches. All the tests in split-simple.t pass.
Index: src/builtins/any-str.pir
===
--- src/builtins/any-str.pir	(revision 32228)
+++ src/builtins/any-str.pir	(working copy)
@@ -428,6 +428,7 @@
 .local pmc retv
 .local int start_pos
 .local int end_pos
+.local int zwm_start
 
 $S0 = self
 retv = new 'List'
@@ -450,10 +451,23 @@
 $S1 = substr $S0, start_pos
 retv.'push'($S1)
 goto done
+  next_zwm:
+zwm_start = start_pos
+  inc_zwm:
+inc start_pos
+match = regex($S0, 'continue' => start_pos)
+end_pos = match.'from'()
+unless start_pos == end_pos goto inc_zwm
+start_pos = zwm_start
+end_pos -= start_pos
+goto add_str
   skip_count:
 match = regex($S0, 'continue' => start_pos)
 end_pos = match.'from'()
+$I99 = match.'to'()
+if $I99 == end_pos goto next_zwm
 end_pos -= start_pos
+  add_str:
 $S1 = substr $S0, start_pos, end_pos
 retv.'push'($S1)
 unless match goto done
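
The general technique for splitting on a pattern that can match the empty string is to skip a zero-width match that sits exactly at the start of the current piece and to retry one character further on, so the loop always makes progress. The Python sketch below illustrates that idea with the standard `re` module. It is not a translation of the PIR above, `split_regex` is an invented name, and the results expected by split-simple.t are not shown in this message, so the outputs here reflect the sketch's own choices.

```python
import re


def split_regex(text, pattern):
    """Split `text` on `pattern`, tolerating zero-width matches."""
    rx = re.compile(pattern)
    pieces = []
    start = 0    # where the current piece begins
    search = 0   # where the next search begins
    while True:
        m = rx.search(text, search)
        if m is None:
            break
        if m.start() == m.end():           # zero-width match
            if m.start() >= len(text):
                break                      # nothing left to separate
            if m.start() == start:
                search = m.start() + 1     # skip it and look further on
                continue
        pieces.append(text[start:m.start()])
        start = search = m.end()
    pieces.append(text[start:])            # trailing piece
    return pieces


print(split_regex("abc", r""))
print(split_regex("abab", r"(?=b)"))
print(split_regex("a,b,,c", r","))
```

Ordinary delimiters behave as usual, including the empty field between two adjacent commas, while an empty pattern separates the string into single characters.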