[svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread larry
Author: larry
Date: Thu Apr 20 02:07:51 2006
New Revision: 8883

Modified:
   doc/trunk/design/syn/S05.pod

Log:
Various clarifications.
Documented that null first alternative is ignored.
Removed colon separator after last modifier, now just use space.
Deleted the :once modifier.  (A state variable suffices.)
A match object in boolean context isn't always forced to be eager.
Added :ratchet and :panic modifiers to limit backtracking in the parser.
Clarified when rules are allowed vs enforced in variable usage.
Added <%a|%b|%c> form for simple longest-token scoping.
Clarified that hash matches skip over key before value is matched.
Documented behavior of $.
Added *+ ++ ?+ and :+ to force greed on specific atom.
Added token and parse rule variants for grammar productions.
Added <<<...>>> syntax.


Modified: doc/trunk/design/syn/S05.pod
==
--- doc/trunk/design/syn/S05.pod (original)
+++ doc/trunk/design/syn/S05.pod Thu Apr 20 02:07:51 2006
@@ -11,11 +11,11 @@
 
 =head1 VERSION
 
-   Maintainer: Patrick Michaud <[EMAIL PROTECTED]>
+   Maintainer: Patrick Michaud <[EMAIL PROTECTED]> (& TimToady)
    Date: 24 Jun 2002
-   Last Modified: 6 Apr 2006
+   Last Modified: 20 Apr 2006
    Number: 5
-   Version: 15
+   Version: 16
 
 This document summarizes Apocalypse 5, which is about the new regex
 syntax.  We now try to call them "rules" because they haven't been
@@ -30,8 +30,8 @@
 it doesn't look like it.  The individual capture variables (such as C<$0>,
 C<$1>, etc.) are just elements of C<$/>.
 
-By the way, the numbered capture variables now start at C<$0>, C<$1>,
-C<$2>, etc. See below.
+By the way, the numbered capture variables now start at C<$0> rather than
+C<$1>.  See below.
 
 =head1 Unchanged syntactic features
 
@@ -68,6 +68,8 @@
 =item *
 
 The extended syntax (C</x>) is no longer required...it's the default.
+(In fact, it's pretty much mandatory--the only way to get back to
+the old syntax is with the C<:Perl5>/C<:P5> modifier.)
 
 =item *
 
@@ -78,7 +80,11 @@
 
 There is no C</e> evaluation modifier on substitutions; instead use:
 
- s/pattern/{ code() }/
+ s/pattern/{ doit() }/
+
+Instead of C</ee> say:
+
+ s/pattern/{ eval doit() }/
 
 =item *
 
@@ -87,8 +93,9 @@
  m:g:i/\s* (\w*) \s* ,?/;
 
 Every modifier must start with its own colon.  The delimiter must be
-separated from the final modifier by a colon or whitespace if it would
-be taken as an argument to the preceding modifier.
+separated from the final modifier by whitespace if it would be taken
+as an argument to the preceding modifier (which is true for any
+bracketing character).
 
 =item *
 
@@ -127,19 +134,13 @@
 
 is roughly equivalent to
 
- m:p/.*? pattern/
-
-=item *
-
-The new C<:once> modifier replaces the Perl 5 C<?...?> syntax:
+ m:p/.*? <( pattern )> /
 
- m:once/ pattern /    # only matches first time
+Also note that any rule called as a subrule is implicitly anchored to the
+current position anyway.
 
 =item *
 
-[Note: We're still not sure if :w is ultimately going to work exactly 
-as described below.  But this is how it works for now.]
-
 The new C<:w> (C<:words>) modifier causes whitespace sequences to be
 replaced by C<\s*> or C<\s+> subpattern as defined by the C<< <ws> >> rule.
 
@@ -164,6 +165,9 @@
 C<< <ws> >> can't decide what to do until it sees the data.  It still does
 the right thing.  If not, define your own C<< <ws> >> and C<:w> will use that.
 
+In general you don't need to use C<:w> within grammars because
+the parse rules automatically handle whitespace policy for you.
+
 =item *
 
 New modifiers specify Unicode level:
@@ -177,9 +181,9 @@
 
 =item *
 
-The new C<:perl5> modifier allows Perl 5 regex syntax to be used instead:
+The new C<:Perl5> modifier allows Perl 5 regex syntax to be used instead:
 
- m:perl5/(?mi)^[a-z]{1,2}(?=\s)/
+ m:Perl5/(?mi)^[a-z]{1,2}(?=\s)/
 
 (It does not go so far as to allow you to put your modifiers at
 the end.)
@@ -194,16 +198,16 @@
 If followed by an C<x>, it means repetition.  Use C<:x(4)> for the
 general form.  So
 
- s:4x { () = (\N+) $$}{$0 => $1};
+ s:4x [ () = (\N+) $$] [$0 => $1];
 
 is the same as:
 
- s:x(4) { () = (\N+) $$}{$0 => $1};
+ s:x(4) [ () = (\N+) $$] [$0 => $1];
 
 which is almost the same as:
 
  $_.pos = 0;
- s:c{ () = (\N+) $$}{$0 => $1} for 1..4;
+ s:c [ () = (\N+) $$] [$0 => $1] for 1..4;
 
 except that the string is unchanged unless all four matches are found.
 However, ranges are allowed, so you can say C<:x(1..4)> to change anywhere
@@ -250,10 +254,15 @@
  $str = "abracadabra";
 
  if $str ~~ m:exhaustive/ a (.*) a / {
- @substrings = $/.matches();    # br brac bracad bracadabr
-                                # c cad cadabr d dabr br
+ say "@()";    # br brac bracad bracadabr c cad cadabr d dabr br
  }
 
+Note that the C<~~> above can return as soon as the first match is found,
+and the rest of 

Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Daniel Hulme
> +but rather easier to read.  The bare C<*>, C<+> and C<?> quantifiers
> +never backtrack in a C<token> unless some outer rule has specified a
> +C<:panic> option that applies.  If you want to prevent even that, use
> +C<*:>, C<+:> or C<?:> to prevent any backtracking into the quantifier.
> +If you want to explicitly backtrack, append either a C<?> or a C<+>
> +to the quantifier.   The C<?> forces minimal matching as usual,
> +while the C<+> forces greedy matching.  The C<token> declarator is
> +really just short for
> +
> +rule :ratchet { ... }
> +
> +The other is the C<parse> declarator, for declaring non-terminal
> +productions in a grammar.  It also does not backtrack unless a
> +C<:panic> is in effect or you explicitly specify a backtracking
> +quantifier.  In addition, a C<parse> rule also assumes C<:words>.

I really don't like the second-to-last sentence above ("It also does not...").
It took me a few reads-through to parse it, and it sounds like it means, "Like
C<token>, it does not backtrack unless a C<:panic> is in effect. In addition, it
does not backtrack if you explicitly specify a backtracking quantifier."

Perhaps you could reword the end of that paragraph as:

>>>
Like C<token>, it only backtracks when a C<:panic> is in effect or when you
explicitly specify a backtracking quantifier. Unlike C<token>, it also assumes
C<:words>, making it equivalent to

rule :ratchet :words { ... }
<<<

-- 
You can't run away  forever,  but there's  nothing wrong with  getting a
good head start.  You want to shut out the night,  you want to shut down
the  sun,  you  want  to  shut  away  the  pieces  of  a  broken  heart.
`Rock and Roll Dreams Come True' (Steinman)http://surreal.istic.org/




Re: TODO tests and test::harness

2006-04-20 Thread Steve Peters
On Wed, Apr 19, 2006 at 07:22:33AM +0200, demerphq wrote:
> On 4/19/06, Andy Lester <[EMAIL PROTECTED]> wrote:
> > > BTW, the patch only shows TODO pass status when no failures occur.
> > >
> > > Oh and obviously all of Test::Harness'es tests pass. :-)
> >
> > This patch doesn't apply against my latest dev version of
> > Test::Harness.  I'm going to have to massage it manually.
> >
> > But I like the idea.  Thanks.
> 
> You're welcome. If it helps It was against Test-Harness-2.56.
> 

Maybe I'm thinking too hard, or maybe the results reported aren't
exactly as clear as they probably should be.  Here's an example test and
its results as reported by Test::Harness with the TODO changes.

#!perl -w

use strict;
use Test::More qw(no_plan);

TODO: {
local $TODO = "TODO testing";
is(1, 2, "A failing test");
is(1, 1, "A passing test");
}
[EMAIL PROTECTED]:~/smoke/perl-current/t$ ./perl harness th_test.t
th_test....ok
1/2 unexpectedly succeeded
TODO PASSED tests 1-2

All tests successful (1 subtest UNEXPECTEDLY SUCCEEDED).
Passed Test     Stat Wstat Total Pass  Passed  List of Passed
-------------------------------------------------------------
th_test.t                      2    1  50.00%  1-2
Files=1, Tests=2,  0 wallclock secs ( 0.11 cusr +  0.01 csys =  0.12 CPU)

The line starting TODO PASSED shows all TODO tests, not those that
unexpectedly succeeded, which confused me a bit.  Also, the final
results show that one test passed, but then the list of passed is "1-2"
instead of just "2" which is the unexpected success.  Is there a way to
have the list of passed just show the unexpected successes?
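
The behaviour asked for here can be sketched roughly as follows. This is a minimal illustration in Python, not Test::Harness code; the function name unexpected_successes and the sample TAP lines are assumptions made for the example (the lines follow the usual Test::More layout for TODO tests). It collects only those TODO tests that passed, rather than every TODO test.

```python
import re

# Matches a TAP test line such as "not ok 1 - name # TODO reason".
TEST_LINE = re.compile(r"^(not )?ok\s+(\d+)(.*)$")
TODO_DIRECTIVE = re.compile(r"#\s*TODO\b", re.IGNORECASE)


def unexpected_successes(tap_lines):
    """Return the numbers of TODO tests that passed (hypothetical helper)."""
    passed_todo = []
    for line in tap_lines:
        m = TEST_LINE.match(line)
        if not m:
            continue  # plan lines, comments, and anything else are skipped
        failed = m.group(1) is not None
        number = int(m.group(2))
        is_todo = TODO_DIRECTIVE.search(m.group(3)) is not None
        if is_todo and not failed:
            passed_todo.append(number)
    return passed_todo


# Assumed TAP output for the two-test file shown above.
tap = [
    "not ok 1 - A failing test # TODO TODO testing",
    "ok 2 - A passing test # TODO TODO testing",
    "1..2",
]
print(unexpected_successes(tap))
```

With a list like this, the "List of Passed" column could report just test 2 instead of the range 1-2.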

Steve Peters
[EMAIL PROTECTED] 




The "parse" composer

2006-04-20 Thread Audrey Tang
[EMAIL PROTECTED] wrote:
> +=item *
> +
> +Just as C<rx> has variants, so does the C<rule> declarator.
> +In particular, there are two special variants for use in grammars:
> +C<token> and C<parse>.

After a brief discussion on #perl6 with pmichaud and Juerd, it seems
that a verb "parse" at the same space as "sub"/"method"/"rule" feels
quite confusing:

grammar Foo {
parse moose {...}; # calling &parse?
}
my $elk = parse {...}; # calling &parse?

We feel that the token:w form is short enough and better reflects the
similarity:

grammar Foo {
token moose :w {...}
}
my $elk = token:w {...};

If further huffmanization is highly desired, how about allowing adverbs
at the beginning of token/rule forms?

grammar Foo {
token:w moose {...};
rule:P5 foo {...};
}

That would make it stand out, without further consuming the reserved
word space.

Thanks,
Audrey





Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Patrick R. Michaud
First, let me say I really like the changes to S05.  Good work
once again.

Here are my questions and comments.

On Thu, Apr 20, 2006 at 02:07:51AM -0700, [EMAIL PROTECTED] wrote:
> -(To get rule interpolation use an assertion - see below)
> +However, if C<$var> contains a rule object, rather attempting to
> +convert it to a string, it is called as if you said C<< <$var> >>.

Does this mean it's a capturing rule?  Or is it called as
if one had said  C<<  >>?   (I would prefer it default
to non-capturing.)

> +If it is a string, it is matched literally, starting after where the
> +key left off matching.
> ..
> +If it is a rule object, it is executed as a subrule, with an initial
> +position after the matched key.
> ..
> +If it has the value 1, nothing special happens except that the key match
> +succeeds.
> ..
> +Any other value causes the match to fail.  In particular, shorter keys
> +are not tried if a longer one matches and fails.

Is there a way to say to continue with the next shortest key?

> +Note: the effect of a forward-scanning lookbehind at the top level
> +can be achieved with:
> +
> +/ .*? prestuff <( mainpat >) /

That should probably be

/ .*? prestuff <( mainpat )> /
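
For readers more used to Perl 5 style regexes, the idea can be approximated in Python. This is only a sketch under assumptions: Python has no <( ... )> markers, so a capture group stands in for them, and the sample string is invented. It shows that scanning over the prefix and reporting only the bracketed part yields the same span as a lookbehind.

```python
import re

text = "xx prestuffmainpat yy"

# Lookbehind form: only "mainpat" is part of the match.
lookbehind = re.search(r"(?<=prestuff)mainpat", text)

# Scanning form: "prestuff" is matched too, but only the group is reported,
# which is roughly what the <( ... )> markers are described as doing.
scanning = re.search(r"prestuff(mainpat)", text)

print(lookbehind.span(), scanning.span(1))
```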


> +As with bare hash, the longest key matches according to the longest token
> +rule, but in addition, you may combine multiple hashes under the same
> +longest-token consideration like this:
> +
> +<%statement|%prefix|%term>

This will be interesting from an implementation perspective.  :-)

> +It is a syntax error to use an unbalanced C<< <( >> or C<< )> >>.

On #perl6 I think it was discussed that C<< <( >> and C<< )> >>
could be unbalanced -- that the first simply set the "from"
position and the second set the "to/pos" position.  I think I
would prefer this.

Assuming we require the balance, what do we do with things like...?

/ aaa <( bbb { return 0; } ccc )> ddd /

And are we excluding the possibility of:

/ aaa <( [ bbb )> ccc 
 | dd ee )> ff 
 ]
/

(The last example might be the anti-use case that shows that
<( and )> ought to be properly nested and balanced.)

> +Conjecture: Multiple opening angles are matched by a corresponding
> +number of closing angles, and otherwise function as single angles.
> +This can be used to visually isolate unmatched angles inside:
> +
> +<<> 1>>>

Does this eliminate the possibility of ever using french angles
as a possible rule syntax character?  (It's okay if it does, 
I simply wanted to make the observation.)

> +Just as C<rx> has variants, so does the C<rule> declarator.
> +In particular, there are two special variants for use in grammars:
> +C<token> and C<parse>.

I agree with Audrey that C is probably too useful in other
contexts.  C works fine for me.

> +With C<:global> or C<:overlap> or C<:exhaustive> the boolean is
> +allowed to return true on the first match.  

Nice, nice, nice!  Makes things *much* simpler for PGE.

Pm


Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Patrick R. Michaud
On Thu, Apr 20, 2006 at 09:24:09AM -0500, Patrick R. Michaud wrote:
> First, let me say I really like the changes to S05.  Good work
> once again.
> 
> Here are my questions and comments.
> 
> On Thu, Apr 20, 2006 at 02:07:51AM -0700, [EMAIL PROTECTED] wrote:
> > -(To get rule interpolation use an assertion - see below)
> > +However, if C<$var> contains a rule object, rather attempting to
> > +convert it to a string, it is called as if you said C<< <$var> >>.
> 
> Does this mean it's a capturing rule?  Or is it called as
> if one had said  C<<  >>?   (I would prefer it default
> to non-capturing.)

Sorry, I meant C<<  >> here, except we don't really 
have a  syntax, so my question is just if it's capturing
or non-capturing.  (I still prefer non-capturing.)

Pm



Re: TODO tests and test::harness

2006-04-20 Thread demerphq
On 4/20/06, Steve Peters <[EMAIL PROTECTED]> wrote:
> On Wed, Apr 19, 2006 at 07:22:33AM +0200, demerphq wrote:
> > On 4/19/06, Andy Lester <[EMAIL PROTECTED]> wrote:
> > > > BTW, the patch only shows TODO pass status when no failures occur.
> > > >
> > > > Oh and obviously all of Test::Harness'es tests pass. :-)
> > >
> > > This patch doesn't apply against my latest dev version of
> > > Test::Harness.  I'm going to have to massage it manually.
> > >
> > > But I like the idea.  Thanks.
> >
> > You're welcome. If it helps It was against Test-Harness-2.56.
> >
>
> Maybe I'm thinking too hard, or maybe the results reported aren't
> exactly as clear as they probably should be.

I think that's probably true.

>  Here's an example test and its results as reported by Test::Harness with the
> TODO changes.
>
> #!perl -w
>
> use strict;
> use Test::More qw(no_plan);
>
> TODO: {
> local $TODO = "TODO testing";
> is(1, 2, "A failing test");
> is(1, 1, "A passing test");
> }
> [EMAIL PROTECTED]:~/smoke/perl-current/t$ ./perl harness th_test.t
> th_test....ok
> 1/2 unexpectedly succeeded
> TODO PASSED tests 1-2
>
> All tests successful (1 subtest UNEXPECTEDLY SUCCEEDED).
> Passed Test     Stat Wstat Total Pass  Passed  List of Passed
> -------------------------------------------------------------
> th_test.t                      2    1  50.00%  1-2
> Files=1, Tests=2,  0 wallclock secs ( 0.11 cusr +  0.01 csys =  0.12 CPU)
>
> The line starting TODO PASSED shows all TODO tests, not those that
> unexpectedly succeeded, which confused me a bit.  Also, the final
> results show that one test passed, but then the list of passed is "1-2"
> instead of just "2" which is the unexpected success.  Is there a way to
> have the list of passed just show the unexpected successes?

I have to admit I'm flummoxed on this one. The first number '2' is the
number of tests in the file. The next number '1' is the number of
problematic results. The 50.00% shows the percentage that are
problematic. So everything checks out up till there. But then the list
of failures is wrong. Which I don't get at all.

I had a look at the results when I did the patch and I definitely didn't
see this result. So far I don't see the cause, but I do see a subtle
bug that I hadn't noticed before.

The line that says:

   failed  => $test{bonus},

should probably read

   failed => scalar @{$test{todo_pass}}

But I still don't see why the list is wrong. I'll have to investigate
further later.

Cheers,
Yves






--
perl -Mre=debug -e "/just|another|perl|hacker/"


Re: Non-Perl TAP implementations (and diag() problems)

2006-04-20 Thread Adrian Howard


On 19 Apr 2006, at 09:02, Ovid wrote:
[snip]

From a parser standpoint, there's no clean way of distinguishing that
from what the test functions are going to output.  As a result, I
really think that "diag" and normal test failure information should be
marked differently (instead of the /^# / that we see).

[snip]

I've thought in the past about using /^## / for non-test related
diagnostics


## Start the fribble tests
ok 1 - fribble foo
not ok 2 - fribble bar
#   Failed test 'fribble bar'
#   in untitled text 2 at line 5.
#  got: 'baz'
# expected: 'bar'
## Start the blart tests
# ok 1 - blart foo
... etc ...

Reads reasonably to me and has the advantage of being backward
compatible.
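
A test runner could use such a convention along these lines. This is a hedged sketch in Python, not an existing TAP parser: the "## " prefix is only the proposal made in this message, and the function name attach_diagnostics and the record layout are assumptions for illustration.

```python
def attach_diagnostics(lines):
    """Split TAP-like lines into test records and informational notes.

    Assumed convention (a proposal in this thread, not part of TAP):
    lines starting with "## " are informational; other "#" lines are
    diagnostics belonging to the most recent test line.
    """
    tests, notes = [], []
    for line in lines:
        if line.startswith("## "):
            notes.append(line[3:])
        elif line.startswith("#"):
            if tests:  # diagnostics before any test line are dropped
                tests[-1]["diag"].append(line[1:].strip())
        elif line.startswith(("ok", "not ok")):
            tests.append({"line": line, "diag": []})
    return tests, notes


sample = [
    "## Start the fribble tests",
    "ok 1 - fribble foo",
    "not ok 2 - fribble bar",
    "#   Failed test 'fribble bar'",
    "#  got: 'baz'",
    "# expected: 'bar'",
    "## Start the blart tests",
]
tests, notes = attach_diagnostics(sample)
print(notes)
print(tests[1]["diag"])
```

Because a parser that only knows /^#/ still sees every such line as a comment, older tools would keep working while newer ones could tell the two kinds apart.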


?

Adrian


Re: Non-Perl TAP implementations (and diag() problems)

2006-04-20 Thread Michael Peters


Ovid wrote:
> --- Adrian Howard <[EMAIL PROTECTED]> wrote:
>> I've thought in the past about using /^## / for non-test
>> related diagnostics
>>
>> ## Start the fribble tests
>> ok 1 - fribble foo
>> not ok 2 - fribble bar
>> #   Failed test 'fribble bar'
>> #   in untitled text 2 at line 5.
>> #  got: 'baz'
>> # expected: 'bar'
>> ## Start the blart tests
>> # ok 1 - blart foo
>> ... etc ...
>>
>> Reads reasonably to me and has the advantage of being backward  
>> compatible.
> 
> Hmm, I was thinking something more along the lines of ">" or "*" to
> make it more visually distinctive, but backward compatibility is a
> glorious thing.  I don't mind "##".

I'm not sure I agree that there is a difference between them. They are
both comments output by the tests. Just because one comes from the
testing routine used by the test and the other from the test itself
doesn't mean they aren't both just human readable comments on the test run.

-- 
Michael Peters
Developer
Plus Three, LP



Re: Use case testing of Web apps with Perl?

2006-04-20 Thread Adrian Howard

On 19 Apr 2006, at 17:12, Andrew Gianni wrote:
[snip]
We'd like to be a bit more programmatic about writing our mech tests to
test use-case driven test-cases. I'm wondering if there are any tools or
ideas out there to ease the process so we don't have to manually write the
numerous mech tests individually or develop our own framework for this.


Any recommendations are appreciated.

[snip]

I'll second Luke's recommendation of Selenium (and related Firefox
plugins.) Damn fine.

If you're willing to play with Ruby, WAITR is well worth a look.

If you want to stick with Perl, Samie may be worth playing with - I find
it painful compared to WAITR myself though.

And of course there is the venerable HTTP::Recorder (see
http://www.perl.com/pub/a/2004/06/04/recorder.html for a tutorial.)


Cheers,

Adrian


Re: Non-Perl TAP implementations (and diag() problems)

2006-04-20 Thread Ovid
--- Adrian Howard <[EMAIL PROTECTED]> wrote:
> I've thought in the past about using /^## / for non-test
> related diagnostics
> 
> ## Start the fribble tests
> ok 1 - fribble foo
> not ok 2 - fribble bar
> #   Failed test 'fribble bar'
> #   in untitled text 2 at line 5.
> #  got: 'baz'
> # expected: 'bar'
> ## Start the blart tests
> # ok 1 - blart foo
> ... etc ...
> 
> Reads reasonably to me and has the advantage of being backward  
> compatible.

Hmm, I was thinking something more along the lines of ">" or "*" to
make it more visually distinctive, but backward compatibility is a
glorious thing.  I don't mind "##".

Anyone else?

Cheers,
Ovid

-- 
If this message is a response to a question on a mailing list, please send 
follow up questions to the list.

Web Programming with Perl -- http://users.easystreet.com/ovid/cgi_course/


Re: Non-Perl TAP implementations (and diag() problems)

2006-04-20 Thread Adrian Howard


On 20 Apr 2006, at 16:55, Michael Peters wrote:
[snip]

I'm not sure I agree that there is a difference between them. They are
both comments output by the tests. Just because one comes from the
testing routine used by the test and the other from the test itself
doesn't mean they aren't both just human readable comments on the test
run.

[snip]

It's useful to distinguish between them for things like home-brew test
runners - so I can accurately determine which diagnostics are associated
with a particular test failure, and which ones are just informative.


Adrian


Re: Non-Perl TAP implementations (and diag() problems)

2006-04-20 Thread Ovid
(Oops.  Accidentally sent this to Michael directly rather than to the
list.)

--- Michael Peters <[EMAIL PROTECTED]> wrote:
> I'm not sure I agree that there is a difference between them. They
> are
> both comments output by the tests. Just because one comes from the
> testing routine used by the test and the other from the test itself
> doesn't mean they aren't both just human readable comments on the
> test run.

  is 3, 2, 'foo';
  diag 'bar';
  is 4, 5, 'baz';

If you run those tests, you know that (assuming we can get around the
buffering problem) the lines beginning with "#" after the "not ok" line
are the failure messages associated with a given test.  Unfortunately,
where does "diag 'bar'" belong?  Can we associate it with the first test or
the second?

There's no way of knowing what the programmer intended, and you *can't*
handle diag correctly if you can't disambiguate it.  This is another reason
why writing automated tools for TAP is problematic.  There's more stuff
folks would like to do and we need these issues resolved to make those
things happen.

Cheers,
Ovid

-- 
If this message is a response to a question on a mailing list, please send 
follow up questions to the list.

Web Programming with Perl -- http://users.easystreet.com/ovid/cgi_course/


Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Larry Wall
On Thu, Apr 20, 2006 at 09:24:09AM -0500, Patrick R. Michaud wrote:
: First, let me say I really like the changes to S05.  Good work
: once again.
: 
: Here are my questions and comments.
: 
: On Thu, Apr 20, 2006 at 02:07:51AM -0700, [EMAIL PROTECTED] wrote:
: > -(To get rule interpolation use an assertion - see below)
: > +However, if C<$var> contains a rule object, rather attempting to
: > +convert it to a string, it is called as if you said C<< <$var> >>.
: 
: Does this mean it's a capturing rule?  Or is it called as
: if one had said  C<<  >>?   (I would prefer it default
: to non-capturing.)

I'd say the intent is non-capturing.  In fact, it seems like a mechanism
for stealth rule injection.  It falls just a wee bit short of a security
hole, though, I think, since an interloper would have to be in the same
process to compile the rule.  We probably shouldn't try to run a tainted
rule, on the theory that the interloper tricked some other code into
compiling the stealth rule.

: > +If it is a string, it is matched literally, starting after where the
: > +key left off matching.
: > ..
: > +If it is a rule object, it is executed as a subrule, with an initial
: > +position after the matched key.
: > ..
: > +If it has the value 1, nothing special happens except that the key match
: > +succeeds.
: > ..
: > +Any other value causes the match to fail.  In particular, shorter keys
: > +are not tried if a longer one matches and fails.
: 
: Is there a way to say to continue with the next shortest key?

Yeah, use <@rules> rather than <%tokens>.  :)

Actually, how about we say that '' just succeeds, and a number says to
retry ignoring keys longer than the number?

: > +As with bare hash, the longest key matches according to the longest token
: > +rule, but in addition, you may combine multiple hashes under the same
: > +longest-token consideration like this:
: > +
: > +<%statement|%prefix|%term>
: 
: This will be interesting from an implementation perspective.  :-)

Has to be done somewhere anyway.  I'd rather the rule syntax grok the
notion than to sluff it off to some kind of magical hash constructor.
This way the rule knows exactly which hashes it has to track and cache.
It's also plain to the reader of the rule which syntactic categories
are being lumped together at this state in the parse.
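
As a rough illustration of the longest-token idea across several hashes, here is a sketch in Python. It is a simplification under assumptions: the function name longest_token and the sample tables are invented, values are plain strings rather than rules, and ties go to the earliest table, which the thread does not specify.

```python
def longest_token(text, pos, *tables):
    """Return (key, value) for the longest key, across all tables,
    that matches text at pos, or None if no key matches."""
    best = None
    for table in tables:
        for key, value in table.items():
            if text.startswith(key, pos):
                # Strictly longer keys win; on a tie the earlier table is kept.
                if best is None or len(key) > len(best[0]):
                    best = (key, value)
    return best


# Hypothetical syntactic categories, standing in for %statement, %prefix, %term.
statement = {"if": "statement", "for": "statement"}
prefix = {"-": "negate", "--": "predecrement"}
term = {"i": "variable"}

print(longest_token("--x", 0, statement, prefix, term))
print(longest_token("if x", 0, statement, prefix, term))
```

All tables are consulted together, so "if" beats the shorter term "i" even though they live in different hashes.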

: > +It is a syntax error to use an unbalanced C<< <( >> or C<< )> >>.
: 
: On #perl6 I think it was discussed that C<< <( >> and C<< )> >>
: could be unbalanced -- that the first simply set the "from"
: position and the second set the "to/pos" position.  I think I
: would prefer this.
: 
: Assuming we require the balance, what do we do with things like...?
: 
: / aaa <( bbb { return 0; } ccc )> ddd /
: 
: And are we excluding the possibility of:
: 
: / aaa <( [ bbb )> ccc 
:  | dd ee )> ff 
:  ]
: /
: 
: (The last example might be the anti-use case that shows that
: <( and )> ought to be properly nested and balanced.)

Lemme think about that some more.  I was worrying about accidental )>,
and not thinking about alternation.  Certainly your example could
be rewritten as

 / aaa [
   | <( bbb )> ccc 
   | <( dd ee )> ff 
   ]
 /

but there are obviously cases where it wouldn't work.  On the other
hand, there's perhaps some mental efficiency by lumping in <(...)>
with all the other <...> constructs, none of which can be unbalanced.
I'm inclined to say that the conservative thing is to require balance.
We could relax it later, I suppose.

: > +Conjecture: Multiple opening angles are matched by a corresponding
: > +number of closing angles, and otherwise function as single angles.
: > +This can be used to visually isolate unmatched angles inside:
: > +
: > +<<> 1>>>
: 
: Does this eliminate the possibility of ever using french angles
: as a possible rule syntax character?  (It's okay if it does, 
: I simply wanted to make the observation.)

Probably, unless we treat <<...>> as French angles specially, for which
there is something to be said.  I was just trying to make <<<...>>> consistent
with our other q<<<...>>> mechanisms, which recently switched to [[[...]]]
policy like POD has always had.

: > +Just as C<rx> has variants, so does the C<rule> declarator.
: > +In particular, there are two special variants for use in grammars:
: > +C<token> and C<parse>.
: 
: I agree with Audrey that C is probably too useful in other
: contexts.  C works fine for me.

Aesthetically, I hate :w, actually...and the whole point of naming "token"
is that it is *not* a normal parser rule, but a lexer rule.

But I agree that "parse" is probably the wrong word.  Earlier versions
had "prod" (short for "production") or "words".  Even earlier
versions made ordinary "rule" have these semantics, but then it was
too confusing to talk about rules in general.  I was very happy when
I thought of splitting the concepts yesterday.

I will think about that some more today.  Consider "parse" a placeholder
for the

Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Patrick R. Michaud
On Thu, Apr 20, 2006 at 09:19:48AM -0700, Larry Wall wrote:
> : > +Any other value causes the match to fail.  In particular, shorter keys
> : > +are not tried if a longer one matches and fails.
> : 
> : Is there a way to say to continue with the next shortest key?
> 
> Yeah, use <@rules> rather than <%tokens>.  :)
> 
> Actually, how about we say that '' just succeeds, and a number says to
> retry ignoring keys longer than the number?

s/retry/continue trying/, perhaps?

Using '' (instead of 1) as the success value sounds Good, since 
null string always matches following a key.  Taking "ignoring keys
longer than the number" literally, would we also read this then 
that returning 0 tries the (remaining) empty keys of each hash, 
and returning -1 fails the matching of <%tokens>?

> [ discussion of unbalanced <( ... )>
> I'm inclined to say that the conservative thing is to require balance.
> We could relax it later, I suppose.

Works for me.

> : > +Just as C<rx> has variants, so does the C<rule> declarator.
> : > +In particular, there are two special variants for use in grammars:
> : > +C<token> and C<parse>.
> : 
> : I agree with Audrey that C is probably too useful in other
> : contexts.  C works fine for me.
> 
> Aesthetically, I hate :w, actually...and the whole point of naming "token"
> is that it is *not* a normal parser rule, but a lexer rule.
> 
> But I agree that "parse" is probably the wrong word.  Earlier versions
> had "prod" (short for "production") or "words".  

Two other ideas (from a short walk)... how about something along
the lines of "phrase" or "sequence"?  

Pm


Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Audrey Tang
Patrick R. Michaud wrote:
> Two other ideas (from a short walk)... how about something along
> the lines of "phrase" or "sequence"?  

Parsec uses the word "lexeme" to mean exactly the same thing...

Audrey





[svn:perl6-synopsis] r8886 - doc/trunk/design/syn

2006-04-20 Thread pmichaud
Author: pmichaud
Date: Thu Apr 20 11:48:29 2006
New Revision: 8886

Modified:
   doc/trunk/design/syn/S12.pod

Log:
* Fixed "long dot" constructs to reflect new syntax.


Modified: doc/trunk/design/syn/S12.pod
==
--- doc/trunk/design/syn/S12.pod (original)
+++ doc/trunk/design/syn/S12.pod Thu Apr 20 11:48:29 2006
@@ -221,7 +221,7 @@
 .doit()    # okay, no arguments
 .doit ()   # ILLEGAL (two terms in a row)
 .doit.()   # okay, no arguments, same as .doit()
-.doit .()  # okay, no arguments, same as .doit()
+.doit. .() # okay, no arguments, same as .doit() (long dot form)
 
 However, you can turn any of the legal forms above into a list
 operator by appending a colon:
@@ -230,7 +230,7 @@
 .doit(1): 2,3      # okay, one argument plus list
 .doit (): 1,2,3    # ILLEGAL (two terms in a row)
 .doit.(1): 2,3     # okay, same as .doit(1,2,3)
-.doit .(1,2): 3    # okay, same as .doit(1,2,3)
+.doit. .(1,2): 3   # okay, same as .doit(1,2,3)
 
 In particular, this allows us to pass a closure in addition to the
 "normal" arguments:


Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Damian Conway

Larry wrote:

> : I agree with Audrey that C is probably too useful in other
> : contexts.  C works fine for me.
>
> Aesthetically, I hate :w, actually...and the whole point of naming "token"
> is that it is *not* a normal parser rule, but a lexer rule.
>
> But I agree that "parse" is probably the wrong word.  Earlier versions
> had "prod" (short for "production")

Just to point out to those playing along at home that a "production" is one 
branch of an alternation, so it was right to reject that as the keyword.


> or "words".

...which was not very informative. :-)


> Even earlier versions made ordinary "rule" have these semantics, but
> then it was too confusing to talk about rules in general. I was very
> happy when I thought of splitting the concepts yesterday.
>
> I will think about that some more today.  Consider "parse" a placeholder
> for the concept of a plain old ordinary BNF rule.

I agree they should be split, but perhaps it's "rules in general" that
should be renamed, since plain old ordinary BNF has laid claim to "rule"
for several decades now? Perhaps we need to bow to historical (rather
than etymological) usage on "regex" too, yielding:

  Keyword   Implicit adverbs   Behaviour

  regex     (none)             Ignores whitespace, backtracks
  token     :ratchet           Ignores whitespace, no backtracking
  rule      :ratchet :words    Skips whitespace, no backtracking

Using C<rule> and C<token> as the typical grammar components would make
Perl 6 grammars *much* more accessible to those already familiar with
grammar-based parsing. And using C<regex> for "plain old backtracking regular
expressions" would make them much more accessible to those already familiar
with Perl 5 regexes.


Damian


Re: TODO tests and test::harness

2006-04-20 Thread Steve Peters
On Thu, Apr 20, 2006 at 10:36:08PM +0200, demerphq wrote:
> On 4/20/06, Steve Peters <[EMAIL PROTECTED]> wrote:
> > Maybe I'm thinking too hard, or maybe the results reported aren't
> > exactly as clear as they probably should be.  Here's an example test and
> > its results as reported by Test::Harness with the TODO changes.
> >
> > #!perl -w
> >
> > use strict;
> > use Test::More qw(no_plan);
> >
> > TODO: {
> > local $TODO = "TODO testing";
> > is(1, 2, "A failing test");
> > is(1, 1, "A passing test");
> > }
> > [EMAIL PROTECTED]:~/smoke/perl-current/t$ ./perl harness th_test.t
> > th_testok
> > 1/2 unexpectedly succeeded
> > TODO PASSED tests 1-2
> >
> > All tests successful (1 subtest UNEXPECTEDLY SUCCEEDED).
> > Passed Test Stat Wstat Total Pass  Passed  List of Passed
> > ---
> > th_test.t  2  1  50.00%  1-2
> > Files=1, Tests=2,  0 wallclock secs ( 0.11 cusr +  0.01 csys =  0.12
> > CPU)
> >
> > The line starting TODO PASSED shows all TODO tests, not those that
> > unexpectedly succeeded, which confused me a bit.  Also, the final
> > results show that one test passed, but then the list of passed is "1-2"
> > instead of just "2" which is the unexpected success.  Is there a way to
> > have the list of passed just show the unexpected successes?
> 
> Attached patch should fix it up. Both in terms of making it clearer
> and of fixing the list. So your test file would look like:
> 
> All tests successful (1 subtest UNEXPECTEDLY SUCCEEDED), 37 subtests skipped.
> Passed Todo  Stat Wstat Todos Pass  Passed  List of Passed
> ---
> t/demerphq.t  2  1  50.00%  3
> Files=19, Tests=572,  8 wallclock secs ( 0.00 cusr +  0.00 csys =  0.00 CPU)
> 
> Hopefully that's clearer. The "Todos" column shows how many todos there
> are in the file.
> 

Excellent!  It seems to be working more as I was hoping to see.  This
patch was applied as change #27925.

Steve Peters
[EMAIL PROTECTED]



Re: TODO tests and test::harness

2006-04-20 Thread demerphq
On 4/20/06, Steve Peters <[EMAIL PROTECTED]> wrote:
> Maybe I'm thinking too hard, or maybe the results reported aren't
> exactly as clear as they probably should be.  Here's an example test and
> its results as reported by Test::Harness with the TODO changes.
>
> #!perl -w
>
> use strict;
> use Test::More qw(no_plan);
>
> TODO: {
> local $TODO = "TODO testing";
> is(1, 2, "A failing test");
> is(1, 1, "A passing test");
> }
> [EMAIL PROTECTED]:~/smoke/perl-current/t$ ./perl harness th_test.t
> th_test....ok
> 1/2 unexpectedly succeeded
> TODO PASSED tests 1-2
>
> All tests successful (1 subtest UNEXPECTEDLY SUCCEEDED).
> Passed Test Stat Wstat Total Pass  Passed  List of Passed
> ---
> th_test.t  2  1  50.00%  1-2
> Files=1, Tests=2,  0 wallclock secs ( 0.11 cusr +  0.01 csys =  0.12
> CPU)
>
> The line starting TODO PASSED shows all TODO tests, not those that
> unexpectedly succeeded, which confused me a bit.  Also, the final
> results show that one test passed, but then the list of passed is "1-2"
> instead of just "2" which is the unexpected success.  Is there a way to
> have the list of passed just show the unexpected successes?

Attached patch should fix it up. Both in terms of making it clearer
and of fixing the list. So your test file would look like:

All tests successful (1 subtest UNEXPECTEDLY SUCCEEDED), 37 subtests skipped.
Passed Todo  Stat Wstat Todos Pass  Passed  List of Passed
---
t/demerphq.t  2  1  50.00%  3
Files=19, Tests=572,  8 wallclock secs ( 0.00 cusr +  0.00 csys =  0.00 CPU)

Hopefully that's clearer. The "Todos" column shows how many todos there
are in the file.
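
A minimal sketch of the selection logic the patch changes, for readers who do not want to apply it. The @details records below are hypothetical, and the sketch assumes (as the patch suggests) that "ok" is set for every TODO test regardless of outcome while "actual_ok" carries the real result.

```perl
use strict;
use warnings;

# Hypothetical per-test records for the two-test example file.
my @details = (
    { ok => 1, actual_ok => 0, type => 'todo' },   # test 1: failing TODO
    { ok => 1, actual_ok => 1, type => 'todo' },   # test 2: passing TODO
);

# Old selection: keyed on "ok", so every TODO test is listed.
my @old = grep { $details[$_-1]{ok} && $details[$_-1]{type} eq 'todo' }
          1 .. scalar @details;

# New selection: keyed on "actual_ok", so only real passes are listed.
my @new = grep { $details[$_-1]{actual_ok} && $details[$_-1]{type} eq 'todo' }
          1 .. scalar @details;

# Percentage is taken against the number of TODO tests, not all tests.
my $todo    = grep { $_->{type} eq 'todo' } @details;
my $percent = 100 * @new / $todo;

print "old list: @old\n";
print "new list: @new\n";
printf "%.2f%%\n", $percent;
```

With these records the old selection lists tests 1 and 2, while the new one lists only test 2.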

Cheers,
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"
Only in Test-Harness: Makefile.old
diff -wurd Test-Harness-2.57_05/lib/Test/Harness.pm Test-Harness/lib/Test/Harness.pm
--- Test-Harness-2.57_05/lib/Test/Harness.pm	2006-04-19 07:25:51.0 +0200
+++ Test-Harness/lib/Test/Harness.pm	2006-04-20 22:30:18.102615400 +0200
@@ -39,6 +39,7 @@
 =cut
 
 $VERSION = "2.57_05";
+$VERSION = eval $VERSION;
 
 # Backwards compatibility for exportable variable names.
 *verbose  = *Verbose;
@@ -352,7 +353,7 @@
 # state of the current test.
 my @failed = grep { !$results{details}[$_-1]{ok} }
  [EMAIL PROTECTED];
-my @todo_pass = grep { $results{details}[$_-1]{ok} &&
+my @todo_pass = grep { $results{details}[$_-1]{actual_ok} &&
$results{details}[$_-1]{type} eq 'todo' }
 [EMAIL PROTECTED];
 
@@ -362,6 +363,7 @@
 max => $results{max},
 failed  => [EMAIL PROTECTED],
 todo_pass   => [EMAIL PROTECTED],
+todo=> $results{todo},
 bonus   => $results{bonus},
 skipped => $results{skip},
 skip_reason => $results{skip_reason},
@@ -384,14 +386,14 @@
 push(@msg, "$test{skipped}/$test{max} skipped: $test{skip_reason}")
 if $test{skipped};
 if ($test{bonus}) {
-my ($txt, $canon) = _canondetail($test{max},$test{skipped},'TODO passed',
+my ($txt, $canon) = _canondetail($test{todo},0,'TODO passed',
 @{$test{todo_pass}});
 $todo_passed{$tfile} = {
 canon   => $canon,
-max => $test{max},
+max => $test{todo},
 failed  => $test{bonus},
 name=> $tfile,
-percent => 100*$test{bonus}/$test{max},
+percent => 100*$test{bonus}/$test{todo},
 estat   => '',
 wstat   => '',
 };
@@ -568,7 +570,7 @@
 if (_all_ok($tot)) {
 $out .= "All tests successful$bonusmsg.\n";
 if ($tot->{bonus}) {
-my($fmt_top, $fmt) = _create_fmts("Passed",$todo_passed);
+my($fmt_top, $fmt) = _create_fmts("Passed Todo",$todo_passed);
 # Now write to formats
 for my $script (sort keys %{$todo_passed||{}}) {
 my $Curtest = $todo_passed->{$script};
@@ -593,7 +595,7 @@
   $tot->{max} - $tot->{ok}, $tot->{max}, 
   $percent_ok;
 
-my($fmt_top, $fmt1, $fmt2) = _create_fmts("Failed",$failedtests);
+my($fmt_top, $fmt1, $fmt2) = _create_fmts("Failed Test",$failedtests);
 
 # Now write to formats
 for my $script (sort keys %$failedtests) {
@@ -767,12 +769,13 @@
 
 
 sub _create_fmts {
-my $type = shift;
+my $failed_str = shift;
 my $

[svn:perl6-synopsis] r8891 - doc/trunk/design/syn

2006-04-20 Thread larry
Author: larry
Date: Thu Apr 20 17:01:01 2006
New Revision: 8891

Modified:
   doc/trunk/design/syn/S05.pod

Log:
As per Damian++'s suggestion, regex is now base form and rule is specialized.
(Note: subrules are still called subrules, not subregexes.)
The .matches method has been unified with multidimensional capture.
Clarified captures and hash key shortening as discussed with Patrick++
Clarified some ignorecase-ness of interpolations.
Reworded section Daniel++ misliked.
Worked over the spelling some.


Modified: doc/trunk/design/syn/S05.pod
==
--- doc/trunk/design/syn/S05.pod	(original)
+++ doc/trunk/design/syn/S05.pod	Thu Apr 20 17:01:01 2006
@@ -15,12 +15,12 @@
Date: 24 Jun 2002
Last Modified: 20 Apr 2006
Number: 5
-   Version: 16
+   Version: 17
 
 This document summarizes Apocalypse 5, which is about the new regex
-syntax.  We now try to call them "rules" because they haven't been
-regular expressions for a long time.  (The term "regex" is still
-acceptable.)
+syntax.  We now try to call them "regex" because they haven't been
+regular expressions for a long time.  When referring to their use in
+a grammar, the term "rule" is preferred.
 
 =head1 New match state and capture variables
 
@@ -136,7 +136,7 @@
 
  m:p/.*? <( pattern )> /
 
-Also note that any rule called as a subrule is implicitly anchored to the
+Also note that any regex called as a subrule is implicitly anchored to the
 current position anyway.
 
 =item *
@@ -166,7 +166,7 @@
 the right thing.  If not, define your own C<<  >> and C<:w> will use that.
 
 In general you don't need to use C<:w> within grammars because
-the parse rules automatically handle whitespace policy for you.
+the parser rules automatically handle whitespace policy for you.
 
 =item *
 
@@ -234,7 +234,7 @@
 
 =item *
 
-With the new C<:ov> (C<:overlap>) modifier, the current rule will
+With the new C<:ov> (C<:overlap>) modifier, the current regex will
 match at all possible character positions (including overlapping)
 and return all matches in a list context, or a disjunction of matches
 in a scalar context.  The first match at any position is returned.
@@ -242,12 +242,12 @@
  $str = "abracadabra";
 
  if $str ~~ m:overlap/ a (.*) a / {
- @substrings = $/.matches();# bracadabr cadabr dabr br
+ @substrings = @;();# bracadabr cadabr dabr br
  }
 
 =item *
 
-With the new C<:ex> (C<:exhaustive>) modifier, the current rule will match
+With the new C<:ex> (C<:exhaustive>) modifier, the current regex will match
 every possible way (including overlapping) and return all matches in a list
 context, or a disjunction of matches in a scalar context.
 
@@ -266,7 +266,7 @@
 
 =item *
 
-The new C<:rw> modifier causes this rule to "claim" the current
+The new C<:rw> modifier causes this regex to "claim" the current
 string for modification rather than assuming copy-on-write semantics.
 All the bindings in C<$/> become lvalues into the string, such
 that if you modify, say, C<$1>, the original string is modified in
@@ -277,22 +277,22 @@
 
 =item *
 
-The new C<:keepall> modifier causes this rule and all invoked subrules
+The new C<:keepall> modifier causes this regex and all invoked subrules
 to remember everything, even if the rules themselves don't ask for
 their subrules to be remembered.  This is for forcing a grammar that
 throws away whitespace and comments to keep them instead.
 
 =item *
 
-The new C<:ratchet> modifier causes this rule to not backtrack by default.
+The new C<:ratchet> modifier causes this regex to not backtrack by default.
 (Generally you do not use this modifier directly, since it's implied by
-C<token> and C<parse> declarations.)  The effect of this modifier is
+C<token> and C<rule> declarations.)  The effect of this modifier is
 to imply a C<:> after every construct that could backtrack, including
 bare C<*>, C<+>, and C<?> quantifiers, as well as alternations.
 
 =item *
 
-The new C<:panic> modifier causes this rule and all invoked subrules
+The new C<:panic> modifier causes this regex and all invoked subrules
 to try to backtrack on any rules that would otherwise default to
 not backtracking because they have C<:ratchet> set.  Never panic
 unless you're desperate and want the pattern matcher to do a lot of
@@ -302,7 +302,7 @@
 =item *
 
 The C<:i>, C<:w>, C<:Perl5>, and Unicode-level modifiers can be
-placed inside the rule (and are lexically scoped):
+placed inside the regex (and are lexically scoped):
 
  m/:w alignment = [:i left|right|cent[er|re]] /
 
@@ -428,7 +428,7 @@
 
 =item *
 
-You can call Perl code as part of a rule match by using a closure.
+You can call Perl code as part of a regex match by using a closure.
 Embedded code does not usually affect the match--it is only used
 for side-effects:
 
@@ -482,11 +482,11 @@
 
 =item *
 
-In Perl 6 rules, variables don't interpolate.
+In Perl 6 regexes, variables don't interpolate.
 
 =

Re: [svn:perl6-synopsis] r8891 - doc/trunk/design/syn

2006-04-20 Thread Ruud H.G. van Tol
[EMAIL PROTECTED] schreef:

> @@ -266,7 +266,7 @@
>
>  =item *
>
> -The new C<:rw> modifier causes this rule to "claim" the current
> +The new C<:rw> modifier causes this regex to "claim" the current
>  string for modification rather than assuming copy-on-write semantics.

There are about eight uses of "rather than", not all seem strong enough
to me: they leave room for "the other thing".
This says it better: http://www.bartleby.com/64/C002/006.html
I don't think that "instead of" should always rather be used, often "and
not" or "so it doesn't" will be better.

Among the eight is a "rather" with a missing "than". There is also a
"rather easier", that one must be good.

-- 
Groet, Ruud



[svn:perl6-synopsis] r8893 - doc/trunk/design/syn

2006-04-20 Thread autrijus
Author: autrijus
Date: Thu Apr 20 23:49:15 2006
New Revision: 8893

Modified:
   doc/trunk/design/syn/S05.pod

Log:
Stylistic cleanup of S05; no functional changes.

* s/TimToady/Larry Wall/

* Consistently change "foo" to C<foo> or I<foo> to be consistent
  with context.

* Fixed the "state $x ||= /.../" example, which will cause rematch
  on matchfail.  "state $x //= /.../" would be the correct form.

* Clarified that only Int or Range objects can sensibly be used
  as quantifier range; matching something "3.5+6i" times wouldn't
  quite make sense.
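
The difference between the two assignment forms can be sketched in Perl 5 (5.10 or later, for //=). This is an analogy only: try_match is a hypothetical stand-in for a match that fails, returning a value that is false but defined.

```perl
use strict;
use warnings;
use 5.010;   # enables say; // and //= need Perl 5.10 or later

my $calls = 0;

# Hypothetical match that always fails: false, but defined.
sub try_match { $calls++; return 0 }

my $x;
$x ||= try_match() for 1 .. 3;   # false result is recomputed every time
my $or_calls = $calls;

$calls = 0;
my $y;
$y //= try_match() for 1 .. 3;   # defined result is kept, even though false
my $dor_calls = $calls;

say "||= ran the match $or_calls times, //= ran it $dor_calls time(s)";
```

With ||= the failed result is thrown away and the match is retried on every pass, which is the rematch-on-matchfail behaviour the log entry describes.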


Modified: doc/trunk/design/syn/S05.pod
==
--- doc/trunk/design/syn/S05.pod	(original)
+++ doc/trunk/design/syn/S05.pod	Thu Apr 20 23:49:15 2006
@@ -11,16 +11,17 @@
 
 =head1 VERSION
 
-   Maintainer: Patrick Michaud <[EMAIL PROTECTED]> (& TimToady)
+   Maintainer: Patrick Michaud <[EMAIL PROTECTED]> and
+   Larry Wall <[EMAIL PROTECTED]>
Date: 24 Jun 2002
Last Modified: 20 Apr 2006
Number: 5
-   Version: 17
+   Version: 18
 
 This document summarizes Apocalypse 5, which is about the new regex
-syntax.  We now try to call them "regex" because they haven't been
+syntax.  We now try to call them I<regex> because they haven't been
 regular expressions for a long time.  When referring to their use in
-a grammar, the term "rule" is preferred.
+a grammar, the term I<rule> is preferred.
 
 =head1 New match state and capture variables
 
@@ -126,7 +127,7 @@
 
 Since this is implicitly anchored to the position, it's suitable for
 building parsers and lexers.  The pattern you supply to a Perl macro's
-"is parsed" trait has an implicit C<:p> modifier.
+C<is parsed> trait has an implicit C<:p> modifier.
 
 Note that
 
@@ -266,7 +267,7 @@
 
 =item *
 
-The new C<:rw> modifier causes this regex to "claim" the current
+The new C<:rw> modifier causes this regex to I<claim> the current
 string for modification rather than assuming copy-on-write semantics.
 All the bindings in C<$/> become lvalues into the string, such
 that if you modify, say, C<$1>, the original string is modified in
@@ -394,8 +395,8 @@
 
 =item *
 
-C<.> matches an "anything", while C<\N> matches an "anything except
-newline". (The C</s> modifier is gone.)  In particular, C<\N> matches
+C<.> matches an I<anything>, while C<\N> matches an I<anything except
+newline>. (The C</s> modifier is gone.)  In particular, C<\N> matches
 neither carriage return nor line feed.
 
 =item *
@@ -451,7 +452,7 @@
 The repetition specifier is now C<**{...}> for maximal matching,
 with a corresponding C<**{...}?> for minimal matching.  Space is
 allowed on either side of the asterisks.  The curlies are taken to
-be a closure returning a number or a range.
+be a closure returning an Int or a Range object.
 
  / value was (\d ** {1..6}?) with ([\w]**{$m..$n}) /
 
@@ -459,7 +460,7 @@
 
  / [foo]**{1,3} /
 
-(At least, it fails in the absence of "C",
+(At least, it fails in the absence of C,
 which is likely to be unimplemented in Perl 6.0.0 anyway).
 
 The optimizer will likely optimize away things like C<**{1...}>
@@ -471,7 +472,7 @@
 
 =item *
 
-C<< <...> >> are now extensible metasyntax delimiters or "assertions"
+C<< <...> >> are now extensible metasyntax delimiters or I<assertions>
 (i.e. they replace Perl 5's crufty C<(?...)> syntax).
 
 =back
@@ -486,7 +487,7 @@
 
 =item *
 
-Instead they're passed "raw" to the regex engine, which can then decide
+Instead they're passed I<raw> to the regex engine, which can then decide
 how to handle them (more on that below).
 
 =item *
@@ -520,7 +521,7 @@
 As with a scalar variable, each element is matched as a literal
 unless it happens to be a Regex object, in which case it is matched
 as a subrule.  As with scalar subrules, a tainted subrule always fails.
-All values pay attention to the current C<:ignorecase> setting
+All values pay attention to the current C<:ignorecase> setting.
 
 =item *
 
@@ -539,7 +540,7 @@
 
 If the value is a string, it is matched literally, starting after where
 the key left off matching.  As a natural consequence, if the value is
-"", nothing special happens except that the key match succeeds.
+C<"">, nothing special happens except that the key match succeeds.
 
 =item *
 
@@ -669,7 +670,7 @@
 internally that turns into a hash lookup.)
 
 As with bare hash, the longest key matches according to the venerable
-"longest token rule", but in addition, you may combine multiple hashes
+I<longest token rule>, but in addition, you may combine multiple hashes
 under the same longest-token consideration like this:
 
 <%statement|%prefix|%term>
@@ -761,10 +762,10 @@
 
 /  \d+  /
 
-except that the scan for "foo" can be done in the forward direction,
+except that the scan for "C<foo>" can be done in the forward direction,
 while a lookbehind assertion would presumably scan for C<\d+> and then
 match "C<foo>" backwards.  The use of C<< <(...)> >> affects only the
-meaning of the "result object" and the positions of the beginning and
+meaning of the I<result object> and the positions of th