Re: Hackathon notes

2005-07-08 Thread Patrick R. Michaud
On Thu, Jul 07, 2005 at 06:37:58PM +0800, Autrijus Tang wrote:
> During the Pugs Hackathon at YAPC::NA 2005, I managed to get various
> unspecced tests and features reviewed by Larry, and posted them in my
> journal.  The original notes are attached; I'd be very grateful if you or
> other p6l people can find tuits to work them back into the relevant
> Synopses. :-)

I'll be glad to work on it, yes, and thanks for sending it.

I would definitely appreciate any help that other p6l folks can provide
in putting these into an appropriate form for the Synopses.

Pm


Re: Proposal: split ternary ?? :: into binary ?? and //

2005-09-06 Thread Patrick R. Michaud
On Tue, Sep 06, 2005 at 07:26:37AM +1000, Damian Conway wrote:
> Thomas Sandlass wrote:
> 
> >I'm still contemplating how to get rid of the :: in the
> >ternary 
> >
> >Comments?
> I believe that the single most important feature of the ternary operator is 
> that it is ternary. That is, unlike an if-else sequence, it's impossible to 
> leave out the "else" in a ternary operation. Splitting the ternary destroys 
> that vital characteristic, which would be a very bad outcome.

At OSCON I was also thinking that it'd be really nice to get rid of 
the :: in the ternary and it occurred to me that perhaps we could use
something like '?:' as the 'else' token instead:

   (cond) ??  (if_true) ?: (if_false)

However, I'll freely admit that I hadn't investigated much further
to see if this might cause other syntax ambiguities.

Pm


Re: Proposal: split ternary ?? :: into binary ?? and //

2005-09-07 Thread Patrick R. Michaud
On Wed, Sep 07, 2005 at 08:32:39AM -0700, Larry Wall wrote:
> 
> I think that's a powerful argument even if we don't have an infix:<::>.
> Plus I hate all infix "nor" operators due to my English-speaking bias
> that requires a "neither" on the front.  So let's go ahead and make
> it ??!!.  (At least this week...)

Yay !!

Pm


Re: syntax for accessing multiple versions of a module

2005-10-21 Thread Patrick R. Michaud
On Thu, Oct 20, 2005 at 09:14:15PM -0400, John Adams wrote:
> From: Luke Palmer <[EMAIL PROTECTED]>
> 
> > But $1 in Perl 5 wasn't the same as $1 in a shell script.
> 
> I'm all for breaking things that need breaking, which is why I 
> keep my mouth shut most of the time--either I see the reason or 
> I suspect (that is, take on faith, which is okay by me) there's 
> a reason I don't see or fully understand. I'm just not seeing a 
> compelling reason for this one, and a pretty good reason not to do it: 

I can state the compelling reason for this one -- it's way too 
confusing when $1, $2, $3, etc. correspond to $/[0], $/[1], $/[2], etc.

In many discussions of capturing semantics earlier in the year, 
nearly everyone using $1, $2, $3 in examples, documentation, and 
discussion was having trouble with off-by-one errors.  This includes
the language designers, and even those who were advocating staying
with $1, $2, $3.  Once we switched to using $0, $1, $2, etc., 
nearly all of the confusion and mistakes disappeared.

> I'm not aware offhand of any other place where $0 is used in 
> regex matching, and several of the languages which you point out 
> are zero-based in other places are not zero-based in regex matching.

Yes, but none of those other regex matching languages do nested
captures either.  In particular, a rule like:

/:w ( (\w+) = (\d+) ; )+ /

no longer captures to $1, $2, $3, or even to $0, $1, $2.  It now
creates an array in $/[0] (aka $0), and each element of that array 
contains a [0] and [1] index representing the second and third set of 
parentheses in the rule.  That is

"a=4; b=2; c=8;" ~~ /:w ( (\w+) = (\d+) ; )+ /

results in

$/[0][0][0] == 'a'   $/[0][0][1] == '4'
$/[0][1][0] == 'b'   $/[0][1][1] == '2'
$/[0][2][0] == 'c'   $/[0][2][1] == '8'

Trying to make *all* of these indexes 1-based leads to 
chaos (especially wrt array assignment), and saying that top
level parens in a rule are named $1, $2, $3, ... while nested parens 
are named [0], [1], [2], ... just throws everything and
everyone off.  It's *much* easier when everything is zero-based,
even for those who are used to using $1, $2, $3 in regular
expressions.
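
For readers who want to experiment with the zero-based nesting in another
engine, here is a rough Python sketch. It is only an analogy: Python's re
module keeps just the last repetition of a quantified group, so the nesting
is rebuilt by hand, and the helper name nested_captures is invented for this
illustration (it is not part of PGE or Perl 6).

```python
import re

def nested_captures(text):
    # Emulates $/[0] for the rule ( (\w+) = (\d+) ; )+ with :w semantics:
    # one entry per repetition of the outer parens, each entry holding
    # the two inner captures at indexes 0 and 1.
    pair = re.compile(r"\s*(\w+)\s*=\s*(\d+)\s*;")
    outer = []
    pos = 0
    while True:
        m = pair.match(text, pos)
        if m is None:
            break
        outer.append([m.group(1), m.group(2)])
        pos = m.end()
    return [outer]  # index 0 plays the role of $/[0] (aka $0)

match = nested_captures("a=4; b=2; c=8;")
print(match[0][0][0], match[0][0][1])
print(match[0][2][0], match[0][2][1])
```

Every level is indexed from zero, so the same subscript arithmetic works at
the top level and inside each repetition.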

Pm


Re: \x{123a 123b 123c}

2005-11-20 Thread Patrick R. Michaud
On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote:
> On Sun, Nov 20, 2005 at 01:26:21AM +0100, Juerd wrote:
> : Ruud H.G. van Tol skribis 2005-11-20  1:19 (+0100):
> : > Maybe 
> : > "\x{123a 123b 123c}" 
> : > is a nice alternative of 
> : > "\x{123a} \x{123b} \x{123c}". 
> 
> We already have, from A5, \x[0a;0d], so you can supposedly say 
> "\x[123a;123b;123c]" 

Hmm, I hadn't caught that particular syntax in A05.  AFAIK it's not 
in S05, so I should probably add it, or whatever syntax we end up 
adopting.

(BTW, we haven't announced it on p6l yet, but there's a new version of
S05 available.)

> [...]
> But I see that the semicolon is rather cluttery, mainly because it's
> too tall.  I'm not sure going all the way to space is good, but we
> might have
> "\x[123a,123b,123c]" 
> just to get a little visual space along with the separator.  

Just to verify, with this syntax would we expect

\x[123a,123b,123c]+

to be the same as

[\x123a \x123b \x123c]+

and not "\x123a \x123b \x123c+" ?
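
The difference between the two readings can be tried out in Python. This is
a sketch under assumptions, not a statement of what S05 specifies: the helper
hex_list_to_pattern is hypothetical and simply treats the comma-separated
list as one grouped unit, which is the reading asked about above.

```python
import re

def hex_list_to_pattern(spec):
    # Hypothetical helper: "123a,123b,123c" becomes a non-capturing group
    # of the three literal characters, so a following quantifier applies
    # to the whole sequence rather than to the last character only.
    chars = "".join(chr(int(h, 16)) for h in spec.split(","))
    return "(?:" + re.escape(chars) + ")"

grouped = re.compile(hex_list_to_pattern("123a,123b,123c") + "+")
ungrouped = re.compile("\u123a\u123b\u123c+")  # "+" binds to the last char

twice = "\u123a\u123b\u123c" * 2
print(grouped.fullmatch(twice) is not None)
print(ungrouped.fullmatch(twice) is not None)
```

Under the grouped reading the sequence may repeat as a whole; under the
ungrouped one only the final character may repeat.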

> It occurs to me that we didn't spec whether character classes ignore
> whitespace.  They probably should, just so you can chunk things:
> 
> / <[ a..z A..Z 0..9 _ ]> /
> 
> Then the question arises about whether <[ \ ]> is an escaped space
> or a backslash, or illegal  

I vote that it's an escaped space.  A backslash is nearly always \\
(or should be imho).

> But if we make it match a backslash
> or illegal, then the minimal space matcher becomes \x20, I think,
> unless you graduate to \s.  On the other hand, if we make it match
> a space, people aren't going to read that way unless they're pretty
> sophisticated...

There's also <sp>, unless someone redefines the <sp> subrule.
And in the general case that's a slightly more expensive mechanism 
to get a space (it involves at least a subrule lookup).  Perhaps 
we could also create a visible meta sequence for it, in the same 
way that we have visible metas for \e, \f, \r, \t.  But I have 
no idea what letter we might use there.

I don't think I like this, but perhaps  C<< <> >> becomes  
and C<< < > >> becomes <' '>?  Seems like not enough visual distinction
there...

Pm


Re: \x{123a 123b 123c}

2005-11-21 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 03:23:35PM +0100, TSa wrote:
> Patrick R. Michaud wrote:
> >There's also <sp>, unless someone redefines the <sp> subrule.
> >And in the general case that's a slightly more expensive mechanism 
> >to get a space (it involves at least a subrule lookup).  Perhaps 
> >we could also create a visible meta sequence for it, in the same 
> >way that we have visible metas for \e, \f, \r, \t.  But I have 
> >no idea what letter we might use there.
> 
> How about \x and \X respectively? Note the *space* after it :)
> ...

If we're going to do that, I'd think it would be "\c " and "\C " 
instead of "\x " and "\X ".  I'm not really advocating this,
I'm just commenting that in this case \c seems more natural 
than \x.

Pm


Re: apo5

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 12:08:08PM -0800, Larry Wall wrote:
> On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote:
> : There is a "[[:alpha:][:digit:]" and a "[[:alpha:][:digit]]" on the
> : A5-page.
> 
> Hmm, well, thanks--I went to fix it and I see Patrick beat me to
> the fix.  But in one of the updates, it says:
> 
> +[Update: Actually, that's now written C<< <+alpha+digit> >>, avoiding
> +the mistaken impression entirely.]

I went ahead and added the update while fixing the typos.  :-)

> And it occurs to me that we could probably allow  there
> since there's no ambiguity what  the next character after the opening word to decide how to process the
> rest of the text inside angles.  Even if someone writes
> 
> 
> 
> that would fail under the current policy of treating "+ digit" as rule,
> since you can't start a rule with +.

Somehow I prefer the explicit leading + or -, so that we *know* this
is a rule composition of some sort.  It also fits in well with the
convention that the first character after the '<' lets you know
what kind of assertion is being created.

> Unfortunately, though,
> 
> 
> 
> would be ambiguous, and/or wrong.  Could allow whitespace there if we
> picked an explicit "this is rule" character.  Did we remove "this is
> string"?  

I didn't recall seeing anything that removed "this is string", so it's
currently implemented in PGE.  It's kind of a nice shortcut:



but it would be no real problem to eliminate it and go
strictly with:



"This is rule" is currently whitespace, whatever follows is taken to be
a pattern.

But let me know what you decide so I can make the appropriate
changes.  :-)

Pm


Re: \x{123a 123b 123c}

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote:
> : There's also <sp>, unless someone redefines the <sp> subrule.
> 
> But you can't use <sp> in a character class.  Well, that is, unless
> you write it:
> 
> <+[ a..z ]+<sp>>
> 
> or some such.  Maybe that's good enough.

Er, that's now <+[ a..z ]+sp>, unless you're now changing it back.

> : And in the general case that's a slightly more expensive mechanism 
> : to get a space (it involves at least a subrule lookup).  Perhaps 
> : we could also create a visible meta sequence for it, in the same 
> : way that we have visible metas for \e, \f, \r, \t.  But I have 
> : no idea what letter we might use there.
> 
> Something to be said for \_ in that regard.

Yes, I thought of \_ but mentally I still have trouble 
classifying "_" along with the alphabetics -- '_' looks more
like punctuation to me.  And in general we use backslashes
in front of metacharacters to remove their meta meaning
(or when we aren't sure if a character has a meta meaning),
so that \_ somehow seems like it ought to be a literal
underscore, guarding against the possibility that the unescaped
underscore has a meta meaning.  (And yes, I can shoot
holes in this line of thinking along with everyone else.)

Whatever shortcuts we introduce, I'll be happy if we can just
rule that backslash+space (i.e., "\ ") is a literal space
character -- i.e., keeping the principle that placing a backslash
in front of a metacharacter removes that character's "meta"
behavior.
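
Python's verbose mode follows the same principle and makes a convenient
place to try it. This is an analogy only (Python's re.VERBOSE, not Perl 6
rules): unescaped whitespace in the pattern is ignored, while a
backslash-escaped space matches a literal space.

```python
import re

# Unescaped spaces are layout only; "\ " is a literal space.
pattern = re.compile(r"foo \  bar", re.VERBOSE)

print(pattern.fullmatch("foo bar") is not None)
print(pattern.fullmatch("foobar") is not None)
```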

> I dunno.  If «...» in ordinary code does shell quoting, maybe «...» in
> rules does filename globbing or some such.  I can see some issues with
> anchoring semantics.  Makes more sense on a string as a whole, but maybe
> can anchor on element boundaries if used on a list of filenames.
> I suppose one could even go as far as
> 
> rule jpeg :i « *.jp{e,}g »
> 
> or whatever the right glob syntax is.

Since we already have :perl5, I'd think that we'd want globbing 
to be something like

rule jpeg :i :glob /*.jp{e,}g/

or, for something intra-rule-ish:

m :w / mv (:glob *.c)+  /

And perhaps we'd want a general form for specifying other 
pattern syntaxes; i.e., :perl5 and :glob are shortcuts for
:syntax('perl5') and :syntax('glob') or something like that.

Pm


Re: apo5

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 07:57:59PM +0100, Ruud H.G. van Tol wrote:
> 
> There is a "[[:alpha:][:digit:]" and a "[[:alpha:][:digit]]" on the
> A5-page.

Now fixed.

> > Besides, you have to be able to distinguish
> > s/^/foo/ from s/$/foo/.
> 
> 's/$/foo/' becomes 's/<after .*>/foo/'
> 

Uh, no, because <after .*> is still a zero width assertion.  :-)

Pm


Re: apo5

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 11:19:48PM +0100, Ruud H.G. van Tol wrote:
> Patrick R. Michaud:
> 
> >> 's/$/foo/' becomes 's/<after .*>/foo/'
> >> 
> > 
> > Uh, no, because <after .*> is still a zero width assertion.  :-)
> 
> That's why I chose it. It is not at the end-of-string?

Because ".*" matches "", /<after .*>/ would be true at 
every position in the string, including the beginning,
and this is where "foo" would be substituted.  

Pm


Re: apo5

2005-11-22 Thread Patrick R. Michaud
On Tue, Nov 22, 2005 at 01:09:40AM +0100, Ruud H.G. van Tol wrote:
>  's/$/foo/' becomes 's/<after .*>/foo/'
> >>>
> >>> Uh, no, because <after .*> is still a zero width assertion.  :-)
> >>
> >> That's why I chose it. It is not at the end-of-string?
> >
> > Because ".*" matches "", /<after .*>/ would be true at
> > every position in the string, including the beginning,
> > and this is where "foo" would be substituted.
> 
> I expected greediness, also because  could behave non-greedy.
> ...
> But why does  behave non-greedy?

I think you may be misreading what <after .*> does -- it's a lookbehind
assertion.  An assertion such as <after pattern> attempts to match
pattern to the sequence immediately preceding the current match position.
It does not mean "skip over pattern and then match whatever comes
afterwards".

The greediness of the .* subpattern in <after .*> doesn't affect
things at all -- <after .*> is still a zero-width assertion.
Since ".*" can match at every position, <after .*> will be
a successful zero-width match (i.e., a null string) at every
position in the target string, including the beginning.

So, s/<after .*>/foo/ matches the first null string it finds 
-- the one at the beginning of the string -- and replaces it 
with "foo".  It's the same as if you had written s//foo/,
since  and  will both end up matching exactly
the same (i.e., a zero-width string at any position).
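
The argument can be checked with a small Python simulation. It is a sketch
under assumptions: Python's own lookbehind only accepts fixed-width
patterns, so the assertion is emulated by hand, and the function names
lookbehind_positions and substitute_first are invented for this example.

```python
import re

def lookbehind_positions(pattern, text):
    # A lookbehind holds at position p if the pattern matches some stretch
    # of text ending exactly at p. It consumes nothing, so the "match" at
    # p is always the empty string.
    compiled = re.compile(pattern, re.DOTALL)
    hits = []
    for p in range(len(text) + 1):
        if any(compiled.fullmatch(text, start, p) for start in range(p + 1)):
            hits.append(p)
    return hits

def substitute_first(pattern, replacement, text):
    # Insert the replacement at the first position where the assertion holds.
    hits = lookbehind_positions(pattern, text)
    if not hits:
        return text
    p = hits[0]
    return text[:p] + replacement + text[p:]

print(lookbehind_positions(".*", "abc"))
print(substitute_first(".*", "foo", "abc"))
```

Because ".*" can match the empty stretch ending at position 0, the
assertion already holds at the start of the string, and greediness never
enters into it.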

If this still doesn't make any sense, contact me off-list and
I'll try and explain it there.

Pm


Re: \x{123a 123b 123c}

2005-11-22 Thread Patrick R. Michaud
On Tue, Nov 22, 2005 at 07:52:24AM -0800, Larry Wall wrote:
> 
> I think we'll leave both _ and \_ meaning the same thing, just to avoid
> that confusion path [...]

Yay!

> : Whatever shortcuts we introduce, I'll be happy if we can just
> : rule that backslash+space (i.e., "\ ") is a literal space
> : character -- i.e., keeping the principle that placing a backslash
> : in front of a metacharacter removes that character's "meta"
> : behavior.
> 
> Yes, that will be a space.

Yay!

> : Since we already have :perl5, I'd think that we'd want globbing 
> : to be something like
> : rule jpeg :i :glob /*.jp{e,}g/
> : or, for something intra-rule-ish:
> : m :w / mv (:glob *.c)+  /
> 
> Yep, that's what I decided in my other message that was thinking about
> using < ... > for word boundaries and << ... >> for capturing $<>.

Yay! (Our messages on this crossed in the mail; mine was moderated for
some reason but that's been corrected.)

> : And perhaps we'd want a general form for specifying other 
> : pattern syntaxes; i.e., :perl5 and :glob are shortcuts for
> : :syntax('perl5') and :syntax('glob') or something like that.
> 
> Maybe.  Or maybe it's enough that there are syntactic categories for
> adding rule modifiers.  Doesn't seem like you'd want to parameterize
> the current language very often.

At least within PGE, I'm starting to come across the situation
where each application and host language wants its own slight variations
of the regular expression syntax (for compatibility reasons).
And I figured that since we (conjecturally) have C<:lang('PIR')>, 
C<:lang('Python')> and C<:lang('TCL')> to indicate the language 
to be used for the closures within a rule, it might be nice to 
have a similar parameterized modifier for the pattern syntax
itself.

I was also thinking that one of the tricky parts to custom rule
modifiers such as :perl and :glob is that they actually change
the parsing for whatever follows, so it might be nice to have
a parameterized form to hook into rather than defining a custom
modifier for each syntax variant.  But on thinking about it 
further from an implementation perspective I guess it all comes 
out the same anyway...

Pm


Re: \x{123a 123b 123c}

2005-11-22 Thread Patrick R. Michaud
On Tue, Nov 22, 2005 at 10:30:20AM -0800, Larry Wall wrote:
> On Tue, Nov 22, 2005 at 09:46:59AM -0800, Dave Whipp wrote:
> : Larry Wall wrote:
> : 
> : >And there aren't that many regexish languages anyway.  So I think :syntax
> : >is relatively useless except for documentation, and in practice people
> : >will almost always omit it, which makes it even less useful, and pretty
> : >nearly kicks it over into the category of multiplied entities for me.
> : 
> : It's surprising how many are out there.
> 
> We can certainly add a :syntax() modifier as easily as a :foolang modifier,
> if we decide at some point we really need one, or if PGE could make good
> use of it even if Perl 6 doesn't want it.

I'm agreeing with Larry on this one -- let's wait to decide this 
until we actually feel like we need it.

Pm


Re: \x{123a 123b 123c}

2005-11-22 Thread Patrick R. Michaud
On Mon, Nov 21, 2005 at 09:02:57AM -0800, Larry Wall wrote:
> On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote:
> : On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote:
> : > We already have, from A5, \x[0a;0d], so you can supposedly say 
> : > "\x[123a;123b;123c]" 
> : 
> : Hmm, I hadn't caught that particular syntax in A05.  AFAIK it's not 
> : in S05, so I should probably add it, or whatever syntax we end up 
> : adopting.
> 
> Yes.

Out of curiosity (and so I can update S05 and PGE), what syntax 
are we adopting?  Is it semicolon, comma, space, any combination of the 
three, or ...?

Pm


Re: Matching a literal # in a rule

2005-12-02 Thread Patrick R. Michaud
On Fri, Dec 02, 2005 at 09:26:12PM +0100, Brad Bowman wrote:
> How can you match a literal "#" in a rule?
>  \# or only \x{23}?
> 
> S05 seems clear "# now always introduces a comment",
> and \# is not listed in the escapes.
> 
> But then Perl 5 has \# so I assume it's just an omission...

Short answer:  \# matches a literal '#'.  (So does <'#'>.)

Longer answer:  I think "always" may be too strongly worded
in S05.  It's not meant as an absolute; rather, it's contrasting
perl 6 expressions with perl 5 ones (as part of the "because /x
is default" above).  

For example, a few lines earlier S05 says that "^ and $ now 
always match the start/end of a string", but the "always" here 
is meant to distinguish perl 6 from perl 5, where ^ and $ could 
have different meanings depending on the /m option.  Similarly,
in perl 5 a '#' could have different meanings depending on the /x
option, but in perl 6 it is always a metacharacter and introduces
a comment.  To get a literal # you can escape it with a backslash.
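
The same behavior can be seen in Python's verbose mode, which resembles
perl 5's /x. This is an analogy, not Perl 6 itself: with re.VERBOSE an
unescaped '#' starts a comment, and a backslash turns it back into a
literal character.

```python
import re

# "\#" is a literal hash; the second "#" starts a comment.
literal_hash = re.compile(r"a \# b   # this trailing text is a comment", re.VERBOSE)

# Here the "#" is unescaped, so everything after it is ignored.
comment_only = re.compile(r"a # b is never part of the pattern", re.VERBOSE)

print(literal_hash.fullmatch("a#b") is not None)
print(comment_only.fullmatch("a") is not None)
```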

Pm


Re: Match objects

2005-12-26 Thread Patrick R. Michaud
On Fri, Dec 23, 2005 at 02:09:19PM +, Luke Palmer wrote:
> What sort of match object should this return, supposing that it didn't
> infinite loop:
> 
> "x" ~~ / [ [ (x) ]* ]* /
> 
> Should $/[0][0] be "x", or should $/[0][0][0] be "x"?  If it's the
> latter, then when do new top-level elements get added?

As I understand things, $/[0][0] would be "x".

FWIW, PGE currently creates top-level elements as they are 
encountered in the pattern match.

Pm


Re: Match objects

2005-12-26 Thread Patrick R. Michaud
On Mon, Dec 26, 2005 at 07:34:06PM +, Luke Palmer wrote:
> On 12/26/05, Patrick R. Michaud <[EMAIL PROTECTED]> wrote:
> > On Fri, Dec 23, 2005 at 02:09:19PM +, Luke Palmer wrote:
> > > "x" ~~ / [ [ (x) ]* ]* /
> >
> > As I understand things, $/[0][0] would be "x".
> 
> Hmm, that seems wrong.  Consider:
> 
> "xxxyxxyxy" ~~ / [ [ (x) ]* (y) ]* /
> 
> I argue that by the structure of that rule, you should be able to tell
> which xs go with which y.  
> ...
> Is there a counterargument that I'm not seeing?

I'd say that if you want a structured rule, it should be written
that way, as in

( (x)* y )*

or 

( (x)* (y) )*

Then it's easy to know which xs go with which y.  As I play more
with rules, it seems that parens are the way to build/preserve
structure in captures, while square brackets are the way to flatten
or ignore it.
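
A rough Python sketch of the difference, using Luke's example string. It is
an illustration only: Python's re cannot return nested repeated captures,
so the structured result is assembled by hand, and the flattened lists
reflect my reading of how the bracketed form behaves.

```python
import re

text = "xxxyxxyxy"

# Structured, in the spirit of ( (x)* (y) )* :
# one entry per outer repetition, so each y keeps its own xs.
structured = [(list(m.group(1)), m.group(2))
              for m in re.finditer(r"(x*)(y)", text)]

# Flattened, in the spirit of [ [ (x) ]* (y) ]* :
# all xs end up in one list and all ys in another.
flat_x = re.findall(r"x", text)
flat_y = re.findall(r"y", text)

print(structured)
print(len(flat_x), len(flat_y))
```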

Pm


Re: ff and fff [Was: till (the flipflop operator, formerly ..)]

2006-01-25 Thread Patrick R. Michaud
On Wed, Jan 25, 2006 at 11:37:42AM -0800, Larry Wall wrote:
> I've changed the flipflop operator/macro to "ff", short for "flipflop".
> This has several benefits.  ...

...another of which is that we can use "ff" and "fff" to mean "loud" 
and "really loud" in our perl poetr^H^H^H^H^Hmusic.  :-)

Pm



Re: Is S05 correct?

2006-02-06 Thread Patrick R. Michaud
On Mon, Feb 06, 2006 at 08:29:54PM -0500, Joe Gottman wrote:
>This may be a stupid question, but where can I view the fixed Synopsis?
> When I go to http://dev.perl.org/perl6/doc/design/syn/S05.html, I see that
> the modification date is November 16, 2005. Is this the most up-to-date
> version?

Essentially, yes.  There have been a few corrections since Nov 16
to some typographical errors (none of which the committers felt
was worth updating the modification date for), but nothing substantial
has changed in S05 since then.

Pm


Re: Implementation of :w in regexes and other regex questions

2006-02-14 Thread Patrick R. Michaud
On Tue, Feb 14, 2006 at 11:35:18AM -0800, David Romano wrote:
> On 2/14/06, Luke Palmer <[EMAIL PROTECTED]> wrote:
> > On 2/14/06, David Romano <[EMAIL PROTECTED]> wrote:
> > > I don't want to just skip  tags wholly, because they do 
> > > serve a purpose, but only in a particular context. (Can  
> > > be changed back to a "default" if
> > > changed to include html tags?)
> >
> > Brackets serve as a kind of scoping for modifiers.  We're also
> > considering that :ws take an argument telling it what to consider to
> > be whitespace.  So you could do:
> >
> > rule Month :w {
> > [ :w(&my_ws) J a n ] # not sure about the &
> > # out here we still have the default :w
> > }
> Ahh, okay. So am I to understand that my_ws would just return a set of
> individual characters or character sequences that would be considered
> whitespace? Or would my_ws do something else?

I would think that my_ws would be a rule of some sort:

rule my_ws { [ \s+ | \< /? b \> ]* }

Also, it wasn't noted in the previous post, but one can 
explicitly call the "default" ws rule by referring to it explicitly,
as in .  (Currently PGE has it as .)
So, presumably one could do

   rule Month :w(&my_ws) { J a n   # my_ws rule here
      [:w(&Rule::ws) . . . ]       # "default" ws rule here
      [:w(0) . . . ]               # no :w here
   }

PGE doesn't yet implement rule arguments to the :w modifier, but
I bet we can add it without too much trouble.  :-)

Pm


Re: some newbie questions about synopsis 5

2006-02-15 Thread Patrick R. Michaud
On Wed, Feb 15, 2006 at 10:09:05AM +0100, H. Stelling wrote:
> - Capture numbering:
> 
> /(a) [ (b) (c) (d) | (e) & (f) ] (g)/
>  $0    $1  $2  $3    $1    $2    $4
>
> capture.t suggests something like the above, but I'm only guessing
> about the "&" bit.

Yes.


> In the following,
> 
> / (a) [ (b) (c) | $5 := (d) $0 := (e) ] (f) /
> 
> does the first alias have any effect on where the f's will go
> (probably not)?

I'll defer to @Larry on this one, but my initial impression is
that the (f) capture would go into $6.

> - Which rules do apply to repeated captures with the same alias? For
> example,
> the second array aliasing example
> 
> m:w/ Mr?s? @ :=  W\. @ := 
>| Mr?s? @ := 
>/;
> 
> seems to suggest that by using $, the lower branch would have
> resulted in a single Match object instead of an array (like the array we
> would have gotten if we hadn't used the aliases in the first place). Is
> this right? 

Yes, that's correct.

> And could the same effect have been achieved by something
> like
> 
> / $ := **{1} / ?

Yes, a quantified capturing subrule or subpattern results in an
array of Match objects (even if the quantification is "1").

> - More array aliasing:
> 
> is  / mv  @ := [...]*  /
> just (slightly) shorter for / mv [$ := [...]]* / ?

I think so.

> Likewise, could/   @ := ( (\w+) \: (\N+) )+ /
> have also been written / [ $ :=   (\w+) \: $ := (\N+) ]+ / ?

Seems like it would work.

> - Array and hash aliasing of quantified subpatterns or subrules: what
> happens
> to the named captures?
> 
> / @ := ( ... $bar := (...) ... )* /

Presuming you meant $ there instead of $bar, I have no idea
what would happen.  (With $bar it's an external alias and would
capture an array of matches into the scope in which the rule was
declared.)

> And if the subpattern or subrule ends with an alternation, can the
> number of
> array elements to be appended (or hashed) vary depending on which
> branch is
> taken?

Again I have to refer this to @Larry, but my initial impression is
"yes, it would vary".

> - Which of the following constructs could possibly be ok (I hope, none)?
> 
> / $ := ... & $ := ... /

I think this one is okay.  $ is an array of Match objects, and
each Match is likely repeated within the array.

> / $ := ...   % := ... /

I hope this is not okay.  It's certainly not going to be okay anytime
soon in the PGE implementation of Perl 6 rules.  :-)

> / $ := ... | % := ... /

Since the two aliases are in separate alternation branches, I think
this is okay.  The argument would be similar to

/ $ := ... | @ := .../

in which $ is either a single Match object or an array of
Match objects depending on the branch matched.

> / $ := $ := ... /

While my instinctual reaction is to say that this ought to be okay,
upon thinking about it a bit more I think I'd prefer to say that
it's not.  At least initially, if nothing else.  In particular, I 
wonder about something like

/ @ := $ := [...]+ /

If we say that an alias always requires a subpattern or subrule
(and not another alias), then we avoid a lot of ambiguity, and the
above could be written as

/ @ := [ $ := [...]+ ] /
/ @ := [ $ := [...] ]+ /

depending on what is desired.

> - Do aliases bind right-to-left, as do assignments?
> / $2 := $5 := ... /   # next should be $3, not $6

Assuming we allow chained aliases such as this (see above note),
I'd still argue for $6 instead of $3.

> - Which kind of escape sequences are allowed (or required) in enumerated
> character classes?

AFAIK, this hasn't been completely decided or specified yet.  

Pm


Re: some newbie questions about synopsis 5

2006-02-17 Thread Patrick R. Michaud
On Fri, Feb 17, 2006 at 02:33:12PM +0100, H. Stelling wrote:
> Patrick R. Michaud wrote:
> >>In the following,
> >>
> >>/ (a) [ (b) (c) | $5 := (d) $0 := (e) ] (f) /
> >>
> >>does the first alias have any effect on where the f's will go
> >>(probably not)?
> >
> >I'll defer to @Larry on this one, but my initial impression is
> >that the (f) capture would go into $6.
> 
> I think that sequences should behave exactly as single branch
> alternations (only that there is no such thing, although we
> can write "[foo|]"). So I would rather opt for $1.

The current implementation is that a capturing subpattern
is indexed based on the largest index in all of the alternation
branches.  I'm not sure it makes sense to base it on aliases of 
the last alternation branch.  

Here are some examples we can chew on:

/ (a) [ (b) (c) | (d) ] (f) / # (f) is $3 or $2?  (currently $3)

/ (a) [ (b) (c) | $1 := (d) ] (f) /   # (f) is $3 or $2?

Since the second example is essentially saying the same as the first,
the (f) capture ought to go to the same place in each case.  If we
say that the existence of the $1 causes the (f) to go into $2, it
also becomes the case that $2 is an array of match objects, which
isn't technically problematic but it might be a bit surprising for
many.

Some other examples to consider:

/ (a) [ (b) (c) | $0 := (d) ] (f) /   # (f) is $3 or $1?  

/ (a) [ (b) (c) | $0 := (d) (3) ] (f) /   # (f) is $3 or $2? 

At any rate, I find that having a subpattern capture base its
index on the highest index of all of the previous alternation
branches is easy to understand and works well in practice.  It can
also be easily changed with another alias if needed.
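
The rule described above (continue numbering from the largest index reached
in any alternation branch) can be written down as a tiny Python function.
This models only the behavior described in this message, for branches
without explicit aliases; the function name and its parameters are invented
for the illustration and are not PGE's API.

```python
def next_capture_index(used_before, branch_capture_counts):
    # used_before: how many positional captures precede the alternation.
    # branch_capture_counts: number of positional captures in each branch.
    # Each branch restarts at used_before, so the next capture after the
    # alternation follows the longest branch.
    return used_before + max(branch_capture_counts, default=0)

# / (a) [ (b) (c) | (d) ] (f) /
#   (a) is $0; the branches hold 2 and 1 captures; so (f) lands in $3.
print(next_capture_index(1, [2, 1]))
```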

> But wouldn't it be nice if the same rules applied to aliases and
> subrule invocations, that is, recursion put aside, to think of
> 
> /  /
> 
> simply as a shorter way to say
> 
> / $ := ([definition of foo]:) /?

First, is that colon following "[definition of foo]" intentional or
a typo?  Currently we can backtrack into subrules -- there's no "cut"
assumed after them.

But secondly, I'm not sure we can casually toss recursion
aside when thinking about this, since it's really a driving force 
behind having named subrules.  :-)  There's also a difference in
that subrules can take arguments, as in , or can come
from another grammar, as in , which seems to argue that 
 is really something other than an alias shorthand.

> The synopsis says:
> 
> * If a subrule appears two (or more) times in the same lexical scope
>   (i.e. twice within the same subpattern and alternation), or if the
>   subrule is quantified anywhere within the entire rule, then its
>   corresponding hash entry is always assigned a reference to an array
>   of Match objects, rather than a single Match object.
> 
> Maybe you're not the right person to ask, but is there a particular
> reason for the "entire rule" bit?
> 
> / (|None)  () /
> 
> Here we get three Matches $0 (possibly undefined), $, and
> $1. At least, I think so.
> 
> / (?)  () /
> 
> Now, we suddenly get three more or less unrelated arrays with lengths
> 1..1, 1, and 1. Of course, I admit this example is a bit artificial.

Oh, I hadn't caught that particular clause (or hadn't read it as
you just did).  PGE certainly doesn't implement things that way.
I think the "entire rule" clause was intended to cover cases like

/ [  ]* /

where  is indirectly quantified and therefore is an array of
match objects.  We should probably reword it, or get a clarification
of what is intended.  (Damian, @Larry:  can you confirm or clarify
this for us?)

> Furthermore, I think "within the same subpattern and alternation" is
> not quite correct, at least it wouldn't apply to something like
> 
> / ( [  | ... ]) /
>
> unless we consider the (...) sequence as a kind of single branch
> alternation. And why are alternation branches considered to be
> lexical scopes, anyway? 

In the example you give, $0 is indeed an array of match objects.
The "same alternation" in this case is the subpattern... compare to

   / ( [  | ... ]) |  /

$0 is an array, $ is a single match object.

Alternation branches don't create new lexical scopes, they just
affect quantification and subpattern numbering.  In both of the 
following examples

/ abc  def  /

/ ghi  | jkl  /

each  has the same lexical scope ($), but in the "abc"
example $ is an array of match objects, while in the "ghi"
example $ is a single match object.

> My second question is why adding a "?" or "??" to an unquantified
> subrule which would otherwise result in a single Match object shoul

Re: $a.foo() moved?

2006-04-06 Thread Patrick R. Michaud
On Thu, Apr 06, 2006 at 03:38:59PM -0400, John Macdonald wrote:
> On Thu, Apr 06, 2006 at 12:10:18PM -0700, Larry Wall wrote:
> > The current consensus on #perl6 is that, in postfix position only (that
> > is, with no leading whitespace), m:p/\.+ \s / lets you embed
> > arbitrary whitespace, comments, pod, etc, within the postfix operator.
> > 
> 
> The one quibble I see with this is that postfix  dots, including 3> might be a touch confusing with infix 3 dots> (i.e. the yada operator).  

There isn't an infix:<...> operator.  There's 
term:<...> ("yada yada yada"), and there's 
postfix:<...> ("$x..Inf").

Pm


Re: Another dotty idea

2006-04-07 Thread Patrick R. Michaud
On Fri, Apr 07, 2006 at 06:31:44PM -0700, Jonathan Lang wrote:
> Delimiter-terminated quotes.  Really nice idea.
> 
> I'd put the dot inside the comment: "#.x", with x being an optional
> quote delimiter (excluding dots).  If a delimiter is included, the
> comment is terminated by the matching quote delimiter; if absent, the
> comment is terminated by the next dot.

But if one is going to go this route (and I'm not sure that we should),
then when the delimiter is absent have the comment terminate at
the first non-whitespace character.  

In other words, "#" terminates at a newline, but "#.\s" terminates
at the next non-whitespace character.  

This gives us:

$x#..foo()
$x#..()
$x#.()

which still allows us to balance dots if we wish (but is not
required).  Nicely, it still looks like we're inserting a comment,
since '#' already means 'comment'.

We can still have the delimited forms of this comment if we want,
but maybe this is a reasonable approach to handling dots.

Pm



Re: Another dotty idea

2006-04-07 Thread Patrick R. Michaud
On Fri, Apr 07, 2006 at 07:00:29PM -0700, Jonathan Lang wrote:
> Patrick R. Michaud wrote:
> > Jonathan wrote:
> > > If a delimiter is included, the
> > > comment is terminated by the matching quote delimiter; if absent, the
> > > comment is terminated by the next dot.
> >
> > But if one is going to go this route (and I'm not sure that we should),
> > then when the delimiter is absent have the comment terminate at
> > the first non-whitespace character.
> 
> ...which makes "#.\s" good only for inserting whitespace where it
> normally wouldn't belong.  

Well, even if we say terminate at the next dot, that's saying
that our inserted comments cannot contain dots in them.  

But if we want to terminate at another dot, perhaps we should
explicitly specify an extra dot as the delimiter.  We still get
some nice symmetries this way:

$x#.. ..foo()
$x#.. ..()

$x#.. this is a delimited comment ..()
$x#.[ this is also a delimited comment ].()

Thus "#.\s" just happens to mean that we're using whitespace
(or actually, the end of whitespace) as a delimiter.

Somehow I have the nagging feeling that Larry will once again
take these ideas and come up with something totally unexpected
and simultaneously awesome, as he often does.  (At least, I hope
that's the case.)

Pm


Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Patrick R. Michaud
First, let me say I really like the changes to S05.  Good work
once again.

Here are my questions and comments.

On Thu, Apr 20, 2006 at 02:07:51AM -0700, [EMAIL PROTECTED] wrote:
> -(To get rule interpolation use an assertion - see below)
> +However, if C<$var> contains a rule object, rather attempting to
> +convert it to a string, it is called as if you said C<< <$var> >>.

Does this mean it's a capturing rule?  Or is it called as
if one had said  C<<  >>?   (I would prefer it default
to non-capturing.)

> +If it is a string, it is matched literally, starting after where the
> +key left off matching.
> ..
> +If it is a rule object, it is executed as a subrule, with an initial
> +position after the matched key.
> ..
> +If it has the value 1, nothing special happens except that the key match
> +succeeds.
> ..
> +Any other value causes the match to fail.  In particular, shorter keys
> +are not tried if a longer one matches and fails.

Is there a way to say to continue with the next shortest key?
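
The hash-matching rules quoted above can be sketched in a few lines. This is Python rather than Perl 6, and it is only an illustration of the quoted r8883 text, not how PGE or any other engine implements it. The name match_hash and the use of a Python callable to stand in for a rule object are assumptions made for the example.

```python
def match_hash(text, pos, table):
    """Try to match one key of `table` at `pos`; return the end position or None.

    Hypothetical helper: only the longest matching key is considered, so a
    shorter key is never retried when the longest one fails.
    """
    keys = [k for k in table if text.startswith(k, pos)]
    if not keys:
        return None
    key = max(keys, key=len)          # longest key wins
    after = pos + len(key)            # values are applied after the key
    value = table[key]
    if isinstance(value, str):        # string: matched literally after the key
        return after + len(value) if text.startswith(value, after) else None
    if callable(value):               # stand-in for a rule object (assumption)
        return value(text, after)
    if value == 1:                    # 1: the key match simply succeeds
        return after
    return None                       # any other value: the match fails


TABLE = {"if": 1, "ifdef": "X", "el": 0}
print(match_hash("if x", 0, TABLE))
print(match_hash("ifdefY", 0, TABLE))
```

In the second call the key "ifdef" is chosen and then fails on its value, and "if" is not tried afterwards, which is the behaviour the question above is asking about.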

> +Note: the effect of a forward-scanning lookbehind at the top level
> +can be achieved with:
> +
> +/ .*? prestuff <( mainpat >) /

That should probably be

/ .*? prestuff <( mainpat )> /
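
For readers who know other regex engines better, the effect of the corrected pattern (require prestuff, but report only the span of mainpat) can be approximated in Python. This is a loose analogy, not Perl 6 semantics, and both function names are made up for the example.

```python
import re


def inner_span(text):
    """Span of 'mainpat' where it directly follows 'prestuff', via a capture group."""
    m = re.search(r"prestuff(mainpat)", text)
    return m.span(1) if m else None


def inner_span_lookbehind(text):
    """Same span, using a lookbehind so the overall match is only 'mainpat'."""
    m = re.search(r"(?<=prestuff)mainpat", text)
    return m.span() if m else None


print(inner_span("xx prestuffmainpat yy"))
print(inner_span_lookbehind("xx prestuffmainpat yy"))
```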


> +As with bare hash, the longest key matches according to the longest token
> +rule, but in addition, you may combine multiple hashes under the same
> +longest-token consideration like this:
> +
> +<%statement|%prefix|%term>

This will be interesting from an implementation perspective.  :-)

> +It is a syntax error to use an unbalanced C<< <( >> or C<< )> >>.

On #perl6 I think it was discussed that C<< <( >> and C<< )> >>
could be unbalanced -- that the first simply set the "from"
position and the second set the "to/pos" position.  I think I
would prefer this.

Assuming we require the balance, what do we do with things like...?

/ aaa <( bbb { return 0; } ccc )> ddd /

And are we excluding the possibility of:

/ aaa <( [ bbb )> ccc 
 | dd ee )> ff 
 ]
/

(The last example might be the anti-use case that shows that
<( and )> ought to be properly nested and balanced.)

> +Conjecture: Multiple opening angles are matched by a corresponding
> +number of closing angles, and otherwise function as single angles.
> +This can be used to visually isolate unmatched angles inside:
> +
> +<<> 1>>>

Does this eliminate the possibility of ever using french angles
as a possible rule syntax character?  (It's okay if it does, 
I simply wanted to make the observation.)

> +Just as C has variants, so does the C declarator.
> +In particular, there are two special variants for use in grammars:
> +C and C.

I agree with Audrey that C is probably too useful in other
contexts.  C works fine for me.

> +With C<:global> or C<:overlap> or C<:exhaustive> the boolean is
> +allowed to return true on the first match.  

Nice, nice, nice!  Makes things *much* simpler for PGE.

Pm


Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Patrick R. Michaud
On Thu, Apr 20, 2006 at 09:24:09AM -0500, Patrick R. Michaud wrote:
> First, let me say I really like the changes to S05.  Good work
> once again.
> 
> Here are my questions and comments.
> 
> On Thu, Apr 20, 2006 at 02:07:51AM -0700, [EMAIL PROTECTED] wrote:
> > -(To get rule interpolation use an assertion - see below)
> > +However, if C<$var> contains a rule object, rather attempting to
> > +convert it to a string, it is called as if you said C<< <$var> >>.
> 
> Does this mean it's a capturing rule?  Or is it called as
> if one had said  C<<  >>?   (I would prefer it default
> to non-capturing.)

Sorry, I meant C<<  >> here, except we don't really 
have a  syntax, so my question is just if it's capturing
or non-capturing.  (I still prefer non-capturing.)

Pm



Re: [svn:perl6-synopsis] r8883 - doc/trunk/design/syn

2006-04-20 Thread Patrick R. Michaud
On Thu, Apr 20, 2006 at 09:19:48AM -0700, Larry Wall wrote:
> : > +Any other value causes the match to fail.  In particular, shorter keys
> : > +are not tried if a longer one matches and fails.
> : 
> : Is there a way to say to continue with the next shortest key?
> 
> Yeah, use <@rules> rather than <%tokens>.  :)
> 
> Actually, how about we say that '' just succeeds, and a number says to
> retry ignoring keys longer than the number?

s/retry/continue trying/, perhaps?

Using '' (instead of 1) as the success value sounds Good, since 
null string always matches following a key.  Taking "ignoring keys
longer than the number" literally, would we also read this then 
that returning 0 tries the (remaining) empty keys of each hash, 
and returning -1 fails the matching of <%tokens>?

> [ discussion of unbalanced <( ... )>
> I'm inclined to say that the conservative thing is to require balance.
> We could relax it later, I suppose.

Works for me.

> : > +Just as C has variants, so does the C declarator.
> : > +In particular, there are two special variants for use in grammars:
> : > +C and C.
> : 
> : I agree with Audrey that C is probably too useful in other
> : contexts.  C works fine for me.
> 
> Aesthetically, I hate :w, actually...and the whole point of naming "token"
> is that it is *not* a normal parser rule, but a lexer rule.
> 
> But I agree that "parse" is probably the wrong word.  Earlier versions
> had "prod" (short for "production") or "words".  

Two other ideas (from a short walk)... how about something along
the lines of "phrase" or "sequence"?  

Pm


Re: A rule by any other name...

2006-05-10 Thread Patrick R. Michaud
On Wed, May 10, 2006 at 06:07:54PM +1000, Damian Conway wrote:
> 
> >Including :skip(//). Yes, agreed, it's a huge 
> >improvement. I'd be more comfortable if the default rule to 
> >use for skipping was named  instead of . 
> >(On IRC  was also proposed, but the connection between
> >:skip and  is more immediately obvious.)
> 
> Yes, I like  too. I too keep mistakenly reading  as "WhiteSpace".

FWIW, I recently noticed in another language
definition the phrase "intertoken space" as being something
that can occur on either side of any token, but not within
a token.  Perhaps some abbreviation or variation of that could 
work in place of either "ws" or "skip".

(Somehow "skip" seems too verbish to me, when the other
subrules we tend to see in a rule tend to be nounish.  Yes, I 
know that "skip" can be a noun as well, it just feels wrong.)

> I'm still utterly convinced my original three-keyword list is the right one 
> (and that the three keywords in it are the right ones too). 

Having played with regex/token/rule in the perl6 grammar a bit
further, as well as looking at a couple of others, I'm finding 
regex/token/rule to be fairly natural.  It only becomes unnatural
if I'm trying hard to optimize things -- e.g., by using "token" instead
of "rule" to avoid unnecessary calls to .  (And it may well turn
out that trying to avoid these calls is a premature or incorrect
optimization anyway -- I won't know until I'm a little farther along
in the grammars I'm working with.)

Pm


Re: A rule by any other name...

2006-05-10 Thread Patrick R. Michaud
On Wed, May 10, 2006 at 05:58:57PM -0700, Allison Randal wrote:
> To summarize a phone call today, the more intelligent defaults we add to 
> differently named rule keywords the more comfortable I am with having 
> different names. So, here's what we have so far (posted both as an FYI 
> and to confirm that we have the coherent solution I think we have):
> [...]
> skip:
> - We keep :words as shorthand for :skip(//)
> - And :skip is shorthand for :skip(//)
> [...]

Please, describe these with  and  to make clear their
non-capturing semantic.  :-)

But Allison's message helps me to crystallize what has been
bugging me about the term ":skip" (and to a lesser extent ":words")
in describing what they do.  So, I'll offer my thoughts here
in case anyone wants to pick it up before we go a-changing S05
yet again.  (If no-one picks it up, I'll just wait for S05 to
be updated to whatever is decided and implement that. :-)

Whitespace in regexes and rules is metasyntactic, in that it is 
not matched literally.  Effectively what the :w (or :words or 
:skip) option does is to change the metasyntactic meaning of 
any whitespace found in the regex.  Or, another way of thinking
of it -- as S05 currently stands, 'regex' and 'token' cause
the pattern whitespace to be treated as , while 'rule'
causes the pattern whitespace to become .

So what we're really doing with this option--whatever we 
call it--is to specify what the whitespace _in the pattern_
should match.  Somehow ":skip" and  don't carry that
meaning for me.

In some sense it seems to me that the correct adverb is
more along the lines of :ws, :white, or :whitespace, in that
it says what to do with the whitespace in the pattern.  It
doesn't have to say anything about whether the pattern's
whitespace is actually matching \s* (although the default
rule for :ws/:white/:whitespace could certainly provide that
semantic).

I can fully see the argument that people will still
confuse :ws and  with "whitespace in the target", 
when in reality they specify the meaning of whitespace
in the regex pattern, so :ws might not be the right choice
for the adverb.  But I think that something more closely 
meaning "whitespace in the pattern means /this/" would be a 
better adverb than :skip.

If someone *really* wants to use "skip", there's always
:ws(//) (or whatever we choose) which means 
"whitespace in the regex matches ".

> -  is a single character of obligatory whitespace

This one has bugged me since the day I first saw it implemented
in PGE.  We _already_ have \s, , and  to represent 
the notion of "a whitespace character" -- do we really need a 
separate  form also?  (An idle thought: perhaps "sp" is
better used as an :sp adverb and a corresponding  regex?)

Pm


Re: A rule by any other name...

2006-05-11 Thread Patrick R. Michaud
On Thu, May 11, 2006 at 08:57:53PM +0800, Audrey Tang wrote:
> Patrick R. Michaud wrote:
> >> -  is a single character of obligatory whitespace
> 
> Hmm, it's literal ' ' (that is, \x20), not "whitespace" in general,
> right?  For "obligatory whitespace" we have \s.

Oops, you're correct, I forgot that  is already \x20.

Allison's proposed definition of  above seems to want to
change that to "obligatory whitespace".  That's more of what
I was reacting against.

For summary, here's how I currently read S05's space/whitespace
rules (and what PGE implements, or is expected to implement):

  space character:  \x20  \o40  <' '><[ ]>  <+[ ]>  backslash+space
  whitespace:   \s

> > We _already_ have \s, , and  to represent 
> > the notion of "a whitespace character" -- do we really need a 
> > separate  form also?  (An idle thought: perhaps "sp" is
> > better used as an :sp adverb and a corresponding  regex?)
> 
> Well, without // to stand for /\x20/, it'd have to be written as
> /<' '>/, which is a bit suboptimal.  [...]

I agree,  makes more sense as \x20, so I retract my idle thought.

Thanks,

Pm


Re: About default options ':ratchet' and ':sigspace' on rules

2006-06-02 Thread Patrick R. Michaud
On Fri, Jun 02, 2006 at 02:17:25PM +0800, Shu-chun Weng wrote:
>  1. Spaces at beginning and end of rule blocks should be ignored
> since space before and after current rule are most likely be
> defined in rules using current one.
>  1a. I'm not sure if it's "clear" to define as this, but the spaces
>  around the rule-level alternative could also be ignored.  

At one point I had been exploring along similar lines, but at the
moment I'd say we don't want to do this.  See below for an example...

>  For instance, look at the rule FunctionAppExpr defined in
>  MiniPerl6 grammar file.
> 
>rule FunctionAppExpr
> {|||[?<'('>?<')'>]?}

FWIW, I'd go ahead and write this as a token statement instead of
a rule:

token FunctionAppExpr {
| 
| 
| 
|  [  \(   \) ]?
}

In fact, now that I've written the above I'm more inclined to say 
it's not a good idea to ignore some whitespace in rule definitions
but not others.  Consider:

rule FunctionAppExpr {
| 
| 
| 
| [ \(  \) ]?
}

Can we quickly determine where the  are being generated? 
What if the [...] portion had an alternation in it?

(And, if we ignore leading/trailing whitespace in rule blocks, do 
we also ignore leading/trailing whitespace in subpatterns?)

In a couple of grammars I've developed already (especially the
one used for pgc.pir), having whitespace at the beginning of rules
and around alternations become  is useful and important.
In these cases, ignoring such whitespace would mean adding explicit
 in the rule to get things to work.  At that point it feels like
waterbed theory -- by "improving" things for the FunctionAppExpr
rule above we're pushing the complexity somewhere else.

In general I'd say that in a production such as FunctionAppExpr
where there are just a few places that need , then it's
better to use 'token' and explicitly indicate the allowed
whitespace.

(Side observation: in  ...|[?<'('>?<')'>]?}
above, there's no whitespace between  and the closing paren.
Why not?)

>  2. I am not sure the default rule of , I couldn't found it in
> S05.  Currently the engine use :P5/\s+/ but I would like it to
> be :P/\s*/ when it's before or after non-words and remains
> the same (\s+) otherwise.

PGE does the "\s* when before or after non-words and \s+ otherwise"
explicitly in its  rule, which is written in PIR.  (Being able
to write subrules procedurally is nice.)  
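
The "\s* when before or after non-words and \s+ otherwise" behaviour can be written procedurally in a few lines. The following is a Python sketch of that description only. It is not PGE's PIR code, and the names ws and is_word are chosen for the example.

```python
def is_word(ch):
    """Treat letters, digits and underscore as word characters (assumption)."""
    return ch.isalnum() or ch == "_"


def ws(text, pos):
    """Return the position after intertoken whitespace at `pos`, or None on failure."""
    end = pos
    while end < len(text) and text[end].isspace():
        end += 1
    if end > pos:
        return end                    # some whitespace was consumed: always fine
    # No whitespace here: acceptable unless we sit between two word characters.
    before = pos > 0 and is_word(text[pos - 1])
    after = pos < len(text) and is_word(text[pos])
    return None if (before and after) else pos


print(ws("a  +b", 1))
print(ws("a+b", 1))
print(ws("ab", 1))
```

The last call fails because the position falls between two word characters with no whitespace separating them.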

In P5 it'd probably be something like 

(?:(?

Re: grammar: difference between rule, token and regex

2006-06-02 Thread Patrick R. Michaud
On Fri, Jun 02, 2006 at 01:56:55PM -0700, jerry gay wrote:
> On 6/2/06, Rene Hangstrup Møller <[EMAIL PROTECTED]> wrote:
> >I am toying around with Parrot and the compiler tools. The documenation
> >of Perl 6 grammars that I have been able to find only describe rule. But
> >the grammars in Parrot 0.4.4 for punie and APL use rule, token and regex
> >elements.
> >
> >Can someone please clarify the difference between these three types, and
> >when you should use one or the other?
>
> i'm forwarding this to p6l, as it's a language question and probably
> best asked there. that said, the regex/token/rule change is a recent
> one, and is documented in S05
> (http://dev.perl.org/perl6/doc/design/syn/S05.html)

Jerry is correct that S05 is the place to look for information
on this.  But to summarize an answer to your question:

   - a C<regex> is a "normal" regular expression

   - a C<token> is a regex with the :ratchet modifier set.  The
 :ratchet modifier disables backtracking by default, so that
 a plain quantifier such as '*' or '+' will greedily match whatever
 it can but won't backtrack if the remainder of the match fails.

   - a C<rule> is a regex with both the :ratchet and :sigspace
 modifiers set.  The :sigspace modifier indicates that whitespace
 in the rule should be replaced by an intertoken separator rule
 such as  (a whitespace matching rule).

So,

rule { a* c b+ }

is the same as

token {  a*  c  b+  }

is the same as

regex { : a*: : c : b+:  }


To answer your other question, about when to use each, here are
some rules of thumb (sorry for the pun):

  - If the quantifiers in the rule need to do backtracking, use 'regex'

  - If backtracking isn't needed, use 'token'

  - If the components of the regex can have intertoken separators
between them, use rule (and perhaps define a custom  rule
that matches the language's idea of "intertoken separator").

Here's a quick contrived example to illustrate the difference:

token identifier {  \w* }

token integer { \d+ }

token value {  |  }

token operator { \+ | - | \* | / }

rule expression {  [   ]* }

rule assignment {  \:=  }

The "token" declarations all define regexes that do not match
any whitespace.  Thus,  "abc" is a valid identifier but "   abc "
is not.

The rule declarations, however, allow for whitespace to occur
between each of the elements.  Thus, each of the following
are valid assignments in the above language, as the use of
"rule" tells us where whitespace is allowed in the match:

 b:=3+a*4
 b := 3 + a * 4
 b   :=3   +a*   4

I can come up with more examples if desired, but that's the basics
behind each.
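
The two modifiers can also be imitated in a conventional regex engine. The sketch below uses Python rather than Perl 6 and makes two simplifying assumptions: :ratchet is emulated with the lookahead-plus-backreference idiom for an atomic group, and :sigspace is reduced to "whitespace in the pattern means \s*" (PGE's actual whitespace rule is stricter between word characters). The helper name sigspace is made up for the example.

```python
import re

# 'regex'-like: an ordinary backtracking quantifier.
backtracking = re.compile(r"a*ab")

# 'token'-like: the lookahead captures the greedy run of a's and the
# backreference consumes it as one unit, so a* never gives anything back.
ratcheted = re.compile(r"(?=(a*))\1ab")
ratcheted_ok = re.compile(r"(?=(a*))\1b")


def sigspace(pattern):
    """'rule'-like: whitespace between pattern pieces becomes optional whitespace."""
    return r"\s*".join(pattern.split())


assignment = re.compile(sigspace(r"[a-z]\w* := \d+"))

print(bool(backtracking.fullmatch("aaab")))   # a* backs off one 'a'
print(bool(ratcheted.fullmatch("aaab")))      # a* keeps all three a's
print(bool(assignment.fullmatch("b   :=3")))
```

The ratcheted pattern can never match "aaab" because the a's it has consumed are not released for the following "ab", which is the trade-off between 'regex' and 'token' described above.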

Hope this helps,

Pm


Re: lexical lookup and OUTER::

2006-06-24 Thread Patrick R. Michaud
On Sat, Jun 24, 2006 at 04:52:26PM -0700, Audrey Tang wrote:
> $x = 1 if my $x;
> 
> The compiler is "allowed" to complain, but does that means it's also  
> okay to not die fatally, and recover by pretending as if the user has  
> said this?
> 
> # Current Pugs behaviour
> $OUTER::x = 1 if my $x;

I think that a statement like  C<< $x = 1 if my $x; >> ought to
complain.  

Put slightly differently, if it's an error in any of the compilers,
it probably should be an error in all of them.

> If it's required to complain, then the parser need to remember all  
> such uses and check it against declaration later, and it'd be better  
> to say that in the spec instead.

I think that S04's phrase "then it's an error to declare it" 
indicates that this should always be treated as an error.  How/when
the compiler chooses to report the error is up to the compiler.  :-)
That said, I wouldn't have any objection to removing or altering
"the compiler is allowed to complain at that point" phrase so
as to remove this particular ambiguity.

Pm


Re: Motivation for /+/ set Array not Match?

2006-09-22 Thread Patrick R. Michaud
On Fri, Sep 22, 2006 at 10:22:52PM +0800, Audrey Tang wrote:
> Moreover:
> 
>/ bar bar +/
> 
> should set $ to an Array with two Match elements, the first being a
> simple match, and the second has multiple positional submatches.
> 
> The thinking behind the separate treatment is that in a contiguous  
> quantified
> match, it does make sense to ask the .from and .to for the entire  
> range, which
> is very hard to do if it's an Array (which can have 0 elements,  
> rendering $[-1].to
> dangerous).  


Out of curiosity, why not:

/ bar bar $:=(+)/

and then one can easily look at $.from and $.to, as well
as get to the arrayed elements?  (There are other possibilities as
well.)

I'm not arguing in favor of or against the proposal, just pointing
out that there are ways in the existing scheme to get at what is
wanted.

Pm


Re: special named assertions

2006-09-27 Thread Patrick R. Michaud
On Wed, Sep 27, 2006 at 11:59:32AM -0700, David Brunton wrote:
> A quick scan of S05 reveals definitions for these seven special named 
> assertions:
>   [...]

I don't think that <'...'> or <"..."> are really "named assertions".

I think that  (as well as <+xyz> and <-xyz>) are simply special forms
of the named assertion .

I should probably compare your list to what PGE has implemented and see if
there are any differences -- will do that later tonight.

Pm



Re: special named assertions

2006-09-27 Thread Patrick R. Michaud
On Wed, Sep 27, 2006 at 09:12:02PM +, [EMAIL PROTECTED] wrote:
> The documentation should distinguish between those that are just 
> pre-defined characters classes (E.G.,  and ) and 
> those that are special builtins (E.G.,  and .  
> The former are things that you should be freely allowed to redefine 
> in a derived grammar, while the other second type may want to be 
> treated as reserved, or at least mention that redefining them may 
> break things in surprising ways.

FWIW, thus far in development PGE doesn't treat 
and  as "special built-ins" -- they're subrules, same
as  and , that can indeed be redefined by 
derived grammars.

And I think that one could argue that redefining  or
 could equally break things in surprising ways.  

I'm not arguing against the idea of special builtins or saying it's
a bad idea -- designating some named assertions as "special/non-derivable" 
could enable some really nice optimizations and implementation shortcuts  
that until now I've avoided.  I'm just indicating that I haven't
come across anything yet in the regex implementation that absolutely
requires that certain named assertions receive special treatment
in the engine.

Thanks,

Pm



Re: Major bullet biting on | vs || within regex

2007-01-16 Thread Patrick R. Michaud
On Tue, Jan 16, 2007 at 10:41:03AM -0800, Larry Wall wrote:
> Note, in case you don't read synopsis checkins: the previous checkin
> majorly changes the semantics of | within regex to support required
> longest-token matching semantics rather than left-to-right matching.
> This is nearly on the same philosophical level as requiring the
> tail-recursion optimization.  It will enable us to write parsers
> more consistently, and it also opens up normal regexes to better
> optimization via tries and such.  You can now use || for the old |
> semantics, which is majorly consistent with how | and || work outside
> of regexen.

Do we leave C<&> alone (as opposed to introducing a corresponding C<&&>
operator)?  I can see arguments both ways.
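
The difference between the old left-to-right behaviour (now spelled ||) and the new longest-match behaviour of | can be illustrated outside Perl 6. The Python sketch below is a simplification: it compares whole-branch match lengths, whereas the longest-token rule Larry describes is not spelled out in this thread. Both function names are invented for the example.

```python
import re


def ordered_alt(branches, text):
    """'||'-like: the first branch that matches at the start of text wins."""
    for branch in branches:
        m = re.match(branch, text)
        if m:
            return m.group()
    return None


def longest_alt(branches, text):
    """'|'-like: every branch is tried and the longest match wins."""
    found = []
    for branch in branches:
        m = re.match(branch, text)
        if m:
            found.append(m.group())
    return max(found, key=len, default=None)


print(ordered_alt(["foo", "foobar"], "foobar"))
print(longest_alt(["foo", "foobar"], "foobar"))
```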

Pm


Parrot 0.4.9 released!

2007-02-22 Thread Patrick R. Michaud
On behalf of the Parrot team, I'm proud to announce Parrot 0.4.9,
"Socorro." Parrot (http://parrotcode.org) is a virtual machine aimed
at running all dynamic languages.

Parrot 0.4.9 can be obtained via CPAN (soon), or follow the
download instructions at http://www.parrotcode.org/source.html .
For those who would like to develop on Parrot, or help develop 
Parrot itself, we recommend using Subversion or SVK on the
source code repository to get the latest and best Parrot code.

Parrot 0.4.9 News:
- Compilers:
   + IMCC: Parrot calling conventions now available in C PMCs, allowing
 named, optional, slurpy, and flat parameter passing
   + PGE: extended support for Perl 5 Regexes
   + smop: prototype object model implementation
   + hllcompiler: refactored to run a configurable set of compilation stages
- PAST:
   + redesigned assign/binding to support Perl 6 binding semantics
- Languages:
   + Updated Lua, PHP ("Plumhead"), Tcl ("ParTcl"), perl6, perl5
   + New language: PIR - a PGE-based implementation of Parrot PIR
   + perl6 now supports binding (':=') and 'join'
   + lua generates tail calls, and supports its own regex flavor (PGE-based)
   + Pheme still works, huzzah!
- Design:
   + PDD21 "Objects" - rewritten
   + PDD22 "I/O" - updated and 'TODO' tests added
- Documentation:
   + Interface stability classification standards approved
   + Roles and Responsibilities document approved
   + Official 'drafts' directory created (was 'clip')
- Implementation:
   + More NameSpace and OS PMC methods implemented
   + Parrot executable fullname and basename now available in PIR/PASM code
   + new 'chomp' library function
- Build:
   + Major improvements in test coverage for 'ops2pm.pl'
- Misc:
   + many bugfixes, enhancements, and coding standard updates
   + extended support for Sun Workshop Compilers
   + Parrot now builds on PocketPC platform

Thanks to all our contributors for making this possible, and our
sponsors for supporting this project.

Enjoy!

Pm


Re: perl6-synopsis svn

2007-02-22 Thread Patrick R. Michaud
On Fri, Feb 23, 2007 at 12:35:15AM +, Blair Sutton wrote:
> Hi Larry
> 
> Sorry if this is a silly question but I haven't been able to find the 
> answer. Is the perl6-synopsis SVN repository publicly available or is it 
> in the same repository as that of Parrot or Pugs?

The versions that appear on dev.perl.org are available from
http://svn.perl.org/perl6/doc/trunk/design/syn/ .

There are also some draft synopses available from 
http://svn.pugscode.org/pugs/docs/Perl6/Spec/ .

Hope this helps!

Pm



Re: What criteria mark the closure of perl6 specification

2007-02-25 Thread Patrick R. Michaud
On Sun, Feb 25, 2007 at 09:42:22AM +0300, Richard Hainsworth wrote:
> 
> While perl6 remains unstable in its specification (or is perceived to be 
> that way) and is looking (from outside a select group?) like a unending 
> road, wont this act as a deterrent to those who want to help hack it 
> into existence, usefulness and stability?

Just to add another perspective... I should note that ongoing
changes to the perl6 specification have _not_ been obstacles 
or deterrents to getting the Perl 6 on Parrot implementation 
in place -- in fact, they've been uniformly helpful.

The delays in the Perl 6 compiler for Parrot have been largely
due to other items, including time and the design work needed
for some of Parrot's compiler subsystems, libraries, and tools.

Pm


Re: Packed array status?

2007-02-26 Thread Patrick R. Michaud
On Sun, Feb 25, 2007 at 03:48:47PM -0800, chromatic wrote:
> On Sunday 25 February 2007 12:40, Geoffrey Broadwell wrote:
> 
> > What backends support packed native arrays at this point?  And what's
> > the performance like?
> 
> I don't know if Patrick has using PIR libraries working in Perl 6 
> yet, but the last time we talked about it, he said it would take 
> just a bit of work.

No, I don't have them working yet, but implementing them shouldn't
be too difficult.  I just need to have perl6 recognize imported
classnames.  (The syntax for making method calls is already in
place and working.)

Pm


Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn

2007-08-04 Thread Patrick R. Michaud
On Thu, Aug 02, 2007 at 04:19:18PM -0700, [EMAIL PROTECTED] wrote:
>  Increment of a C (in a suitable container) works similarly to
>  Perl 5, but is generalized slightly.  First, the string is examined
>  to see if it could be the string representation of a number in
>  any common representation, including floating point and radix
>  notation. (Surrounding whitespace is also allowed around such a
>  number.)  If it appears to be a number, it is converted to a number
>  and incremented as a number.  

Just for verification:  an increment of "0xff" will therefore
result in 256 and not "0xfg".  Correct?

>  final alphanumeric sequence in the string.  Unlike in Perl 5, this
>  alphanumeric sequence need not be anchored to the beginning of the
>  string, nor does it need to begin with an alphabetic character; the
>  final sequence in the string matching C<\w+> is incremented regardless
>  of what comes before it.  

...does the \w+ include non-ASCII alphanumerics and underscore?  
Or should the spec limit itself to [A-Za-z0-9]+ here?  If we
include non-ASCII alphanumerics, then incrementing something like
"résumé" produces "résumf" ?

Pm


Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn

2007-08-04 Thread Patrick R. Michaud
On Sat, Aug 04, 2007 at 12:56:06PM -0700, Larry Wall wrote:
> for '❶' .. '❿' { .say }
> 
> But it's not clear what to do if you try to increment ❿ though.
> Probably just return a failure.

Assuming that '❶' .. '❿' is a range similar to '0'..'9', then
consistency with the other ranges would seem to indicate that
incrementing 'a❿'  produces 'b❶', and incrementing '❿' on its
own would produce '❶❶'.  (Unless, of course, '❿' is treated as
a "number in any common representation", in which case incrementing
it produces 11.)

I'm not saying that anything involving the dingbats makes good
sense -- just that this is what I would tend to expect to happen
based on how the other ranges autoincrement.  Feel free to insert
pithy quotes about consistency and hobgoblins here.  :-)

And so we don't get bogged down in (relatively unimportant) details, 
I'll refrain from shouting "look at the ugly corner cases!" for now 
and leave it to others to decide how/when to push this.  The changes to 
S03 and clarifications give me enough to proceed for now -- namely:

Strings that look like numbers or that don't end in \w+ are numified 
and then incremented, whereas strings ending with \w+ are incremented 
according to individual character ranges.  The exact set of ranges are 
still under discussion, but the ranges A-Z, a-z, and 0-9 have the 
"expected" semantics.  

Others can continue on the discussion if wanted, but as an implementor
I'm happy with this outcome for now.  :-)
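
The working summary above can be turned into a small sketch. This is Python, not Perl 6, and it rests on several assumptions that the thread leaves open: only the ranges a-z, A-Z and 0-9 are handled, underscore is not part of the incremented tail, a string that neither looks numeric nor ends in an alphanumeric run numifies to 0 (as in Perl 5), and a carry off the left edge extends the string the way Perl 5 does. The names as_number and increment are made up for the example.

```python
import re

LOWER = "abcdefghijklmnopqrstuvwxyz"
UPPER = LOWER.upper()
DIGITS = "0123456789"
RANGES = (LOWER, UPPER, DIGITS)


def as_number(s):
    """Return a number if s looks like one (decimal, float, 0x/0o/0b), else None."""
    t = s.strip()                     # surrounding whitespace is allowed
    try:
        return int(t, 0)              # handles 255, 0xff, 0o17, 0b101
    except ValueError:
        pass
    if re.fullmatch(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?", t):
        return float(t)
    return None


def increment(s):
    n = as_number(s)
    if n is not None:
        return n + 1                  # numeric-looking strings increment as numbers
    m = re.search(r"[A-Za-z0-9]+$", s)
    if m is None:
        return 1                      # assumption: numifies to 0, then increments
    head, tail = s[:m.start()], list(m.group())
    for i in range(len(tail) - 1, -1, -1):
        chars = next(r for r in RANGES if tail[i] in r)
        pos = chars.index(tail[i])
        if pos + 1 < len(chars):
            tail[i] = chars[pos + 1]
            return head + "".join(tail)
        tail[i] = chars[0]            # wrap around and carry to the left
    # Carried past the leftmost character: extend the run by one position.
    extra = "1" if chars is DIGITS else chars[0]
    return head + extra + "".join(tail)


print(increment("0xff"))
print(increment("az"))
print(increment("Zz"))
```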

Thanks!

Pm


Parrot 0.4.15 "Augean Stable" released!

2007-08-22 Thread Patrick R. Michaud
On behalf of the Parrot team, I'm proud to announce Parrot 0.4.15
"Augean Stable." Parrot (http://parrotcode.org/) is a virtual 
machine aimed at running all dynamic languages.

Parrot 0.4.15 can be obtained via CPAN (soon), or follow the
download instructions at http://parrotcode.org/source.html.
For those who would like to develop on Parrot, or help develop
Parrot itself, we recommend using Subversion or SVK on the
source code repository to get the latest and best Parrot code.

Parrot 0.4.15 News:
- Implementation:
 + Lots of code review, many bugs fixed
 + Many more code cleanups and compiler warning levels
 + Started a new jit engine for 64-bit processors
 + Refactored configure process, with tests and new diagnostic options
 + Added new CodeString PMC for dynamic generation of PIR code
 + More pdd15 support for object metamodel.
- Languages:
 + Added NQP ("Not Quite Perl"), a very lightweight Perl 6-like language
 + Significant improvements and refactors to PCT (Parrot Compiler Toolkit)
 + perl6 passes more spec tests
 + Lua works now with a PGE/TGE/PAST-pm based compiler, lives in one pbc,
   and the interpreter has same behavior as original.
- Documentation
 + Added a committers' HOWTO
 + More PIR tutorial examples
 + Added PAUSE guide


Thanks to all our contributors for making this possible, and our
sponsors for supporting this project.

Enjoy!



Re: &, &&, and backtracking.

2007-09-06 Thread Patrick R. Michaud
On Wed, Sep 05, 2007 at 09:36:24PM -0500, Jonathan Scott Duff wrote:
> How do C<&> and C<&&> differ with respect to backtracking?  For instance,
> 
> "foobar" ~~ / <[a..z]>+ & [ ... ] /;
> 
> Both sides of the C<&> happen in parallel, so I would guess that they
> both match "foo" then stop. Please correct me if that's wrong.

I think the phrase "happen in parallel" overstates things a bit
here.  From S05:

The & form is considered declarative rather than procedural;
it allows the compiler and/or the run-time system to decide 
which parts to evaluate first, and it is erroneous to assume 
either order happens consistently.  The && form guarantees 
left-to-right order, and backtracking makes the right argument 
vary faster than the left.

So, to answer your original question, I think the most we can
say is that C<&&> guarantees a specific order of evaluation,
while C<&> allows the pattern matcher to choose an ordering.

> Were we using the procedural conjunction:
> 
> "foobar" ~~ / <[a..z]>+ && [ ... ] /;
> 
> I would guess that the LHS matches as much as it can ("foobar"), then
> the RHS matches "foo" [...and then backtracks the LHS until a 
> conjunctional match is found...]
>
> Or it's much simpler than that and both of the regexes above just fail
> because of the greediness of C<+> and there is no intra-conjunction
> backtracking.

I think we definitely allow intra-conjunction backtracking.
PGE implements it that way.
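
For anyone who wants to try this, here is a minimal sketch, assuming a
current Rakudo that accepts both conjunction forms inside a regex.  It
runs the two patterns from the question and prints whatever each one
matches.  The variable names are mine, and the output depends on the
implementation, so treat it as an experiment rather than a statement
of what the spec requires.

```raku
# Both sides of a conjunction have to agree on the same span of text.
my $declarative = "foobar" ~~ / <[a..z]>+ &  [ ... ] /;   # evaluation order left to the matcher
my $procedural  = "foobar" ~~ / <[a..z]>+ && [ ... ] /;   # left side is tried first

# Report the matched text, or note that the pattern failed.
say $declarative ?? '&  matched: ' ~ $declarative !! '&  failed';
say $procedural  ?? '&& matched: ' ~ $procedural  !! '&& failed';
```

If intra-conjunction backtracking is allowed, the greedy left side
should be able to give characters back until the right side can match
the same span.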


On a somewhat similar question, what happens with a pattern
such as

"foobar" ~~ / foo.+? | fooba /

The LHS initially matches "foob", but with backtracking could
eventually match "foobar".  Do the longest-token semantics
in this case cause the RHS to be dispatched first, even
though the token declaration of the LHS _could_ match a 
longer token prefix?  

Thanks,

Pm


Re: &, &&, and backtracking.

2007-09-06 Thread Patrick R. Michaud
On Thu, Sep 06, 2007 at 12:37:37PM -0700, Larry Wall wrote:
> On Thu, Sep 06, 2007 at 01:25:12PM -0500, Patrick R. Michaud wrote:
> : On a somewhat similar question, what happens with a pattern
> : such as
> : 
> : "foobar" ~~ / foo.+? | fooba /
> : 
> : The LHS initially matches "foob", but with backtracking could
> : eventually match "foobar".  Do the longest-token semantics
> : in this case cause the RHS to be dispatched first, even
> : though the token declaration of the LHS _could_ match a 
> : longer token prefix?  
> 
> Yow.  ICATBW.  Non-greedy matching is somewhat antithetical to
> longest-token matching.  

I agree.  One thought I had was that perhaps non-greedy matching
could also terminate the token prefix.

> [...]
> I think longest-token semantics have to trump minimal matching here,
> and my argument is this.  Most uses of *? have additional information
> on what terminates it, either implicitly in what it is matching, or
> explicitly in the next bit of regex.  That is, you'd typically see
> either
> foo\w+? | fooba
> or
> foo.+?  | fooba
> 
> In either case, the clear intent is to match foobar over fooba.
> Therefore I think the DFA matcher just strips ? and does its ordinary
> character by character match, relying on that extra info to match
> the real extent of the quantifier.

Does this still hold true for a non-greedy quantifier in the
middle of an expression... ?  I.e.,

"foobazbar deborah" ~~ /foo .+? b.r | fooba | foobazb /

matches "foobazbar debor" ?

(I completely grant that the examples I'm coming up with here
may be completely nonsensical in real application, but I'm
just exploring the space a bit.)

Pm


Re: [svn:perl6-synopsis] r14449 - doc/trunk/design/syn

2007-09-07 Thread Patrick R. Michaud
On Thu, Sep 06, 2007 at 05:12:03PM -0700, [EMAIL PROTECTED] wrote:
> Log:
> old  is now <+foo> to suppress capture
> new  now is zero-width like 

I really like the change from  to <+foo>, but I think there's
a conflict (or at least some confusion) in the way the new spec is
worded, especially as it relates to character class sets.

Both old and new versions of S05 say:

If the first character after the identifier is whitespace, the
subsequent text (following any whitespace) is passed as a regex, 
so  is more or less equivalent to .

In the previous version of S05, the non-capturing form of 
would be .  Here, the whitespace after "foo" indicated
that "bar" was to be parsed and passed to foo as a regex.

In the new version of S05, the non-capturing form of 
would seem to be <+foo bar>.  Okay, I can handle that.  However, 
S05 also says that "  can be written as <+ foo + bar - baz> ".
Presumably this second form would also allow "<+foo + bar - baz>",
which seems to conflict slightly with the notion that <+foo bar>
is the non-capturing form of .  In other words, the
whitespace character following "<+foo" doesn't seem to be
sufficient to indicate how the remainder is to be processed --
we have to look beyond the whitespace for a leading plus or minus.

Perhaps S05 is addressing this when it says 

An initial identifier is taken as a character class, so the 
first character after the identifier doesn't matter in this 
case, and you can use whitespace however you like.

Here I find this wording very unclear -- it doesn't tell me 
what is distinguishing the "doesn't matter in this case" part
between <+foo + bar> and <+foo bar>.

Since the S05 spec has changed so that all punctuation is meta, 
I'm thinking we may be able to simplify the spec altogether.
Previously the "whitespace following the identifier" was
used to distinguish  from , or 
from .  Since it's now effectively impossible for 
a regex to begin with a bare plus or minus character, we may be
able to alter the "whitespace following identifier" wording such
that  and  are identical.  Perhaps
something like:

  - if the character following the identifier is a left paren,
it's a call


<+foo('bar')>


  - if the character following the identifier is a colon, the rest
of the text (following any whitespace) is passed as a string

 # same as 
<+foo: bar>


  - if the identifier is followed by a plus or minus (with optional
intervening whitespace), it's a set of character classes


  # same thing
<+foo + baz - bar> # also the same

  - anything else following whitespace is a regex to be passed

  # same as 
<+foo bar> # same as <+foo(/bar/)>
 # same as 

Pm


Re: [svn:perl6-synopsis] r14449 - doc/trunk/design/syn

2007-09-07 Thread Patrick R. Michaud
Some other minor notes about the S05.pod update:

> +In particular,  also matches the null string, and  always fails.

Perhaps these should be quoted with "C<< ... >>" so that it's
clear that "" and "" are the tokens?  When looking at the
.pod file I had to think about it a couple of times to make sure
that it wasn't intending C and C.

> +Any atom that is quantified with a minimally match (using the C modifier).

s/minimally/minimal/

> +Greedy quantifiers and characters classes do not terminate a token pattern.

s/characters/character/

Thanks,

Pm



Re: [svn:perl6-synopsis] r14449 - doc/trunk/design/syn

2007-09-07 Thread Patrick R. Michaud
On Fri, Sep 07, 2007, Larry Wall writes:
> If we stick with +, one approach might be to simply disallow whitespace
> in composite character classes.

Of the choices presented thus far, I like this one the best.
Although I did like being able to stick whitespace in the
character classes for readability, such that losing the whitespace
in <+foo - [Jj] > would be a disappointment -- I still like <+foo>
as much as the other alternatives.

Even if we decide that <+foo> isn't the official non-capturing syntax,
we still have the case that <+foo> is effectively a non-capturing
form of .  I sorta liked that we were reducing two syntaxes
for the same thing (  and <+foo> ) down to one, so adding
one back in feels funny.

I do agree that we may be getting a few too many +'s in our
patterns.  However, having just converted several grammars in Parrot 
languages to use the new <+foo> syntax, I was surprised at how 
few there actually were.  And many of the existing cases where 
I had previously used  didn't really change (or need to
change), because they were already zero-width things such as
, , , etc., and I felt it made
more sense to keep the  syntax anyway.

Of the non-<+foo> options given thus far, I like <~foo> and <.foo> 
(in that order).  I don't find ~ all that hard to type -- after 
all, we use the tilde quite frequently in things like Unix's 
"~username" syntax, in Perl 5's =~ operator, and even in Perl 
6 with the ~~ smart match operator.  Perhaps I would feel 
differently about tilde if I were on a non-US keyboard.

I agree that <:foo> should probably be reserved for something
having to do with pairs or adverbs.

I'm not at all a fan of <\ws>.

Anyway, those are my reactions, for whatever they're worth.

Pm


Re: [svn:perl6-synopsis] r14449 - doc/trunk/design/syn

2007-09-07 Thread Patrick R. Michaud
On Fri, Sep 07, 2007 at 04:05:55PM -0600, Paul Seamons wrote:
> I'd vote for <:ws> which is vaguely reminiscent of the former non-capturing 
> parens (?:).
> 
> It (<:ws>) also bears little similarity to any other regex construct - 
> although it looks a bit like a Perl 6 pair.

For completeness it may be worth pointing out that :i, :s, and :Perl5
are in fact valid regex constructs.  :-)

Pm


languages/perl6 doesn't run (was: xml and perl 6)

2007-11-28 Thread Patrick R. Michaud
On Wed, Nov 28, 2007 at 07:42:29PM +0100, James Fuller wrote:
> in the meantime, I have yet to get latest trunk perl6 running
> properly, on parrot, or freebsd then I will start thinking of such a
> task (everything compiles fine).  as an aside I am getting an;
> 
> "load_bytecode" couldn't find file 'Protoobject.pbc'
> current instr.: 'parrot;PGE::Match;__onload' pc 0
> (compilers/pge/PGE/Match.pir:14)
> called from Sub 'parrot;Perl6::Compiler;__onload' pc 0 (perl6.pir:30)
> called from Sub 'parrot;Perl6::Compiler;main' pc -1 ((unknown file):-1)

Interesting -- it looks as though the Protoobject.pbc file
isn't being built on your system for some reason.  Perhaps do
a "make realclean" and rebuild, so that the Makefiles are
updated?

If that doesn't resolve it, perhaps you could file a ticket
at <[EMAIL PROTECTED]> and we can follow up there.
(There's also a <[EMAIL PROTECTED]> list, but I think this
particular issue is more likely to be a Parrot problem than a
perl6 one.)

Thanks!

Pm


Re: xml and perl 6

2007-11-29 Thread Patrick R. Michaud
On Thu, Nov 29, 2007 at 10:20:00AM -0500, Mark J. Reed wrote:
> The module could even, I suppose, insert a filter into the compiler so
> that your proposed literal syntax would work, but I don't really see
> the advantage of that over this:
> 
> my $doc = Document.new(< here
> END

Or even:

my $doc = Document.new;
$doc = 'here';

Pm


Re: perl 6 grammar

2007-12-03 Thread Patrick R. Michaud
On Mon, Dec 03, 2007 at 12:20:02PM +, Smylers wrote:
> cdumont writes:
> > I don't really think using the colon in a ternary means that you
> > cannot use it else where.
> 
> We started off with that, and it was changed specifically because it was
> causing a problem; I can't remember exactly what, but it's in this
> list's archives somewhere.
> 
> Remember that whatever expression you want to use the colon for is going
> to be valid between the ? and : parts of the ? ... : operator, and so
> you need to avoid the colon being confused for the : which marks the end
> of this part of the ? ... : operator.

...and it's not just the colon, but the ? also has the potential to be
confusing here, because there's a prefix:<?> operator that is used to
coerce into boolean context.

Which indirectly gets around to an even stronger reason for using
C<?? !!> over C<? :> -- Perl 6 aims for a consistency in the
use of the ? and ! characters to mean "boolean true" and "boolean
not true".  This is true not only for the operators, but also in
regular expressions and other places.  So, having something like

$foo =  $cond ?? ...if_true... !! ...if_not_true... ;

achieves several important goals:
  - it frees up the ? and : characters for other purposes
  - it reinforces the convention of ? as "if true" and ! as "if false"
  - it is more visually distinctive, so that the ternary tokens don't
get lost in the middle of other operands and expressions
  - it simplifies parsing (both compiler and human) and improves
error reporting

In my case, I've found the switch to ?? !! to be fairly
natural, and that I don't use it often enough to worry about
the extra characters.
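
A small sketch of that convention, assuming a current Rakudo; the
variable names and strings are placeholders of my own.  It shows the
ternary next to the prefix forms of ? and !, which carry the same
"true" and "not true" meaning.

```raku
# Hypothetical condition value, chosen only for illustration.
my $cond = 0;

# Ternary: ?? introduces the "if true" branch, !! the "if not true" branch.
my $foo = $cond ?? 'if_true' !! 'if_not_true';
say $foo;

# The same characters as prefix operators: ? coerces to Bool, ! negates.
say ?$cond;
say !$cond;
```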

> > As for the functions, i didn't see that much for hashes and arrays
> > which was a big disappointment.
> 
> What were you hoping for?  Many things which were functions in Perl 5
> are now also available as methods in Perl 6.  If you post here with what
> you're disappointed to be missing, it may be that somebody can reply
> pointing out where the equivalent functionality is!

As noted at the beginning of Synopsis 1:

Another assumption has been that if we don't talk about 
something in these Synopses, it's the same as it is in Perl 5.

Pm


Re: Concerns about "{...code...}"

2007-12-20 Thread Patrick R. Michaud
On Thu, Dec 20, 2007 at 11:35:44AM -0600, Jonathan Scott Duff wrote:
> On Thu, Dec 20, 2007 at 11:23:05AM -0600, Jonathan Scott Duff wrote:
> > Adriano answered #1 I think:  $yaml = Q:!c"{ $key: 42 }";
> 
> Er, I just looked over the spec again and realized that Q does
> absolutely no interpolation, so it would be more like this:
> 
> $yaml = Q:qq:!c"{ $key: 42 }";
> 
> or perhaps
> 
> $yaml = qq:!c"{ $key: 42 }";

There's also

$yaml = qs "{ $key: 42 }";

This form also makes it easier to deal with special characters,
such as quoted yaml values, as in

$yaml = qs /{ $key: "$value" }/;

which interpolates $key and $value but leaves the curlies and
quotation marks alone.

Just to add another perspective, PHP uses curlies inside of
double-quoted strings to indicate various forms of 
interpolation, and it doesn't seem to cause major issues
there.  But perhaps it's less frequent that PHP apps need
to put curlies in double-quoted strings.  Still, I've never found
it overly onerous to escape the leading curly the few times I've
needed to.

Pm


Re: Concerns about "{...code...}"

2007-12-20 Thread Patrick R. Michaud
On Thu, Dec 20, 2007 at 06:01:53PM -0500, Mark J. Reed wrote:
>On Dec 20, 2007 4:30 PM, Patrick R. Michaud <[EMAIL PROTECTED]> wrote:
> 
>  Just to add another perspective, PHP uses curlies inside of
>  double-quoted strings to indicate various forms of
>  interpolation, and it doesn't seem to cause major issues
>  there. 
> 
>But PHP's use of curlies is limited and context-sensitive; it's triggered
>by the sequence {$ or ${.  Bare curlies don't do anything.  

Ah yes, good point.  I thus withdraw my PHP comment, and we're
left with the examples in S02.  

It could be said that closure interpolation would be off by 
default, and enabled using the :c adverb or the C<qc> quoter 
that is already part of the spec.  Then we would have

"These { curlies } aren't interpolative."
qc "These { 'curl' ~ 'ies' } are."

I don't have a strong opinion one way or another -- I'm just
trying to point out some alternatives and things the current
spec already offers.  But perhaps this is all a reminder as to
why I try to stay out of the language design forum.

Pm


Re: calling parrot from perl6

2008-01-01 Thread Patrick R. Michaud
On Mon, Dec 31, 2007 at 11:17:53PM +0300, Richard Hainsworth wrote:
> Not sure whether this should be p6-lan or p6-users. Posted to p6l only.

Since the question is specific to perl6 and Parrot, it probably
belongs on perl6-compiler.  But I'll answer it here for now,
as it may spark a language related discussion.

> Given a function implemented in parrot, how can it be called from a 
> perl6 program?
> 
> Suppose I have a file (in current path) 'myfun.pir' which contains
> 
> .sub myfun
>.param pmc passed_variable
>.local int an_int
>an_int = passed_variable[1]
>.local string string_var
> #code
>.return (string_var)
>   end# is this necessary?
> .end
> 
> how do I create a mymodule.pm so that I can in a perl6 program do
> 
> use mymodule;
> 
> my $parameter = 30;
> my $string_var = myfun($parameter);
> 
> ???
> 
> If this is documented, please just send a pointer.

It's not documented yet.  As far as calling the function is concerned,
that part already exists -- perl6 will correctly locate the
'myfun' sub written in PIR and call it with the appropriate 
arguments.

The part we don't have yet is dynamic loading of pir files from
perl6.  In other words, perl6 currently treats the statement
"use mymodule;" as being a request for "mymodule.pm", which it
expects to be a Perl 6 source file.

My best guess at this point is that we could have something like:

use Parrot;
Parrot::load_bytecode('myfun.pir');

For the time being we could make the "use Parrot;" step automatic
to the perl6 compiler, such that one could do:

Parrot::load_bytecode('myfun.pir');
my $parameter = 30;
my $string_var = myfun($parameter);

Then it's just a matter of deciding what sort of functions/interface
we expect the Parrot module to have.

Pm


Re: [svn:perl6-synopsis] r14491 - doc/trunk/design/syn

2008-01-17 Thread Patrick R. Michaud
On Thu, Jan 17, 2008 at 01:18:32PM -0800, [EMAIL PROTECTED] wrote:
> +=item *
> +
> +The definition of C<.true> for the most ancestral type (that is, the
> +C type) is equivalent to C<.defined>.  

Would we normally consider prefix: to be defined in terms of
C<.true>, or vice versa?  Is there a prefix:, or is C
treated like an 'is export' trait on '.true' method?

Is there also a C<.not> method?

Pm



Re: pluralization idea that keeps bugging me

2008-01-26 Thread Patrick R. Michaud
On Sat, Jan 26, 2008 at 08:58:43AM -0800, Larry Wall wrote:
> After a recent exchange on PerlMonks about join, I've been thinking
> about the problem of pluralization in interpolated strings, where we
> get things like:
> 
> say "Received $m message{ 1==$m ?? '' !! 's' }."
> 
> My first thought is that this is such a common idiom that we ought
> to have some syntactic sugar for it:
> 
> say "Received $m message\s."
>
> [...]
>
> Any other cute ideas?  

FWIW, this sounds to me a lot like a special quoting operator or
adverbial form.

say qq:pluralized "Received $m message\s".

Pm


Parrot 0.5.3 "Way of the Parrot" released!

2008-02-20 Thread Patrick R. Michaud
On behalf of the Parrot team, I'm proud to announce Parrot 0.5.3
"Way of the Parrot." Parrot (http://parrotcode.org/) is a virtual 
machine aimed at running all dynamic languages.

Parrot 0.5.3 can be obtained via CPAN (soon), or follow the
download instructions at http://parrotcode.org/source.html.
For those who would like to develop on Parrot, or help develop
Parrot itself, we recommend using Subversion or SVK on the
source code repository to get the latest and best Parrot code.

Parrot 0.5.3 highlights:

The Perl 6 on Parrot compiler has now been given the name
"Rakudo Perl".  More details on the new name are available
from http://use.perl.org/~pmichaud/journal/35400 .  In addition,
Rakudo now has more support for objects, classes, roles, etc.,
and a better interface to the official Perl 6 test suite.

More languages are being converted to use the Parrot Compiler
Toolkit.

Parrot 0.5.3 News:
- Documentation
  + PDD09 (garbage collection) - approved
  + PDD28 (character sets) - draft started
  + added function documentation to some core functions
  + PCT beginners guide, optable guide and PAST nodes guide, bug fixes
- Compilers
  + IMCC: plugged various memory leaks and other cleanups
  + PCT:
. add "attribute" as a scope variant to PAST::Var nodes
. add 'shift' and 'pop' methods to PAST:: nodes
  + NQP: add '=:=' op, tests for scalar and list contextualizers, \x escapes
- Languages
  + APL: reimplementation with PCT
  + Cardinal (Ruby): reimplementation with PCT
  + Ecmascript: reimplementation with PCT
  + lolcode: improved expression parsing, ifthen, IT, YARN
  + lua:
. aligned with Lua official release 5.1.3.
. added initial PCT-based implementation.
  + Punie (Perl 1): refactor to use standard PCT-based filenames
  + Pynie (Python): add functions
  + Rakudo (Perl 6):
. rebranded, formerly known as 'perl6'
. passes many more official Perl 6 Specification tests
. added 'perl6doc' utility
. oo including meta?classes, objects, methods, attributes, role composition
. match variables, while/until statements, traits
. many new methods for Str, List, Hash, Junction
- Implementation
- Deprecations
  + PCCINVOKE syntax for named arguments using []; use () instead.
  + see DEPRECATED.pod for details
- Miscellaneous
  + pbc_to_exe refactored for code reduction, portability, and maintainability
  + various bug fixes
  + #line directives added to generated JIT files, improving debugging
  + consting, attribute marking, refactoring, warnings cleanup

The next scheduled Parrot release will be on March 18, 2008.

Thanks to all our contributors for making this possible, and our
sponsors for supporting this project.

Enjoy!



Typo in S06?

2008-03-29 Thread Patrick R. Michaud
S06.pod says (line 2698):

: Ordinarily a top-level Perl "script" just evaluates its anonymous
: mainline code and exits.  During the mainline code, the program's
: arguments are available in raw form from the C<@ARGS> array.  At the end of
: the mainline code, however, a C<MAIN> subroutine will be called with
: whatever command-line arguments remain in C<@ARGS>.  This call is
: performed if and only if:

Should these be C<@*ARGS> instead?
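
For context, a minimal sketch of the behavior that paragraph
describes, assuming a current Rakudo and the C<@*ARGS> spelling.  The
slurpy parameter name is my own choice.

```raku
# Mainline code runs first and can see the raw arguments.
say "mainline sees { @*ARGS.elems } argument(s)";

# After the mainline finishes, MAIN is called with what remains in @*ARGS.
sub MAIN(*@rest) {
    say "MAIN received: { @rest.join(', ') }";
}
```

Run with a few positional arguments, the first line reports the count
and MAIN then receives the same values.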

Pm


Re: Query re: duction and precedence.

2008-03-30 Thread Patrick R. Michaud
On Sun, Mar 30, 2008 at 08:21:39AM -0700, Mark A. Biggar wrote:
> The reduce meta-operator over - in APL gives alternating sum, similarly 
> alternating quotient for /, which only works if you right associate 
> things.
> 
> [-] 1,2,3,4,5,6 => 1-2+3-4+5-6 # pseudo-apl
> 
> [/] 1,2,3,4,5,6 => (1*3*5)/(2*4*6) #pseudo-apl
> 
> note that would break the perl 6 simple rule that [-] 1,2,3 => 1-2-3, 
> but gives something much more useful.  There currently is no easy way to 
> do alternating sum/quotient in perl6.

How about...?

# alternating sum of elements in @list
$altsum = [+] ({ $^a - $^b } for @list);
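
A runnable variant of the same idea, assuming a current Rakudo.  Note
that it swaps the statement-modifier "for" above for a .map call,
which is my substitution: a block with two placeholder parameters
makes map consume the list two elements at a time.  It needs an
even-length list.

```raku
my @list = 1, 2, 3, 4, 5, 6;

# Each call of the block takes the next two elements and returns their difference.
my @diffs = @list.map({ $^a - $^b });

# Summing the pairwise differences gives (1-2) + (3-4) + (5-6).
my $altsum = [+] @diffs;
say $altsum;
```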

Pm


Re: Getting Started - What to try?

2008-03-31 Thread Patrick R. Michaud
On Mon, Mar 31, 2008 at 10:23:45AM +0200, Moritz Lenz wrote:
> John M. Dlugosz wrote:
> > I understand the most official grammar is being developed there.
> 
> Not quite. The "official" grammar is in the pugs repo in src/perl6/, but
> it can't really run on anything yet.

This is correct -- the "official" grammar (STD.pm) is in pugs.
However, Rakudo's grammar follows STD.pm as closely as it can and
is reasonably close for most things.  We'd definitely be interested
to know about any places it's falling short.

> > Is there a grammar-checker tool that will help me to validate 
> > proposed Perl 6 code fragments, even if I can't execute it yet?
> 
> Not yet, but we hope to build one soon.

OTOH, Rakudo can be used to check syntax (at least as much as
it knows about) by using the --target=parse option.

$ parrot perl6.pbc --target=parse [file]

And we'd be really happy to try out any code fragments you may
have and see if we can get them to parse and execute.  We like
it when we can get something new to work.  :-)  For example, 
the infix // operator and @*ARGS were added to Rakudo in direct 
response to Aaron Trevena wanting to get the Towers of Hanoi 
example [1] running.

Specific questions or problems with Rakudo should probably
go to [EMAIL PROTECTED], or find a Rakudo developer on
#parrot or #perl6.  And bug reports can be filed at
<[EMAIL PROTECTED]>.

Thanks!

Pm

[1]  http://www.perlfoundation.org/perl6/index.cgi?tower_of_hanoi


Re: question on max | min op

2008-04-01 Thread Patrick R. Michaud
On Tue, Apr 01, 2008 at 05:39:36AM -0400, Mark J. Reed wrote:
> On Tue, Apr 1, 2008 at 1:44 AM, Xiao Yafeng <[EMAIL PROTECTED]> wrote:
> > I've read Synopsis and I wondered why to treat max and min as
> >  operator. IMHO, view them as list functions is more reasonable. Like
> >  below:
> >
> >  @test.max
> 
> Which is how you would probably call it in Perl6.  Or else
> 
> max(@test)
> >
> >  is clearer than
> >
> >  @test[0] max @test[1]  or [max] @test.
> 
> Which is not legal Perl6. "max" and "min" may be called "operators",
> but that doesn't mean they're INFIX operator.  

"min" and "max" are infix operators in Perl 6.  From Synopsis 3:

: * Minimum and maximum
:
: $min0 min $min1
: $max0 max $max1

I think they're defined as operators because of some of the
other features one can get from it, beyond just the [max] reduction:

$c = $a max $b;  # versus $c = ($a, $b).max;

$d max= $e;  # versus $d = ($d, $e).max;

@c = @a »max« @b;# larger element of @a and @b

@e = @a »max» 100;   # each element is at least 100
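
The same forms with concrete values, as a sketch assuming a current
Rakudo.  The arrays and numbers are made up for illustration.

```raku
my @a = 1, 7, 3;
my @b = 4, 2, 9;

say 3 max 8;        # infix form on two values
say [max] @a;       # reduction over a list

my $d = 3;
$d max= 8;          # assignment metaoperator, same as $d = $d max 8
say $d;

say @a »max« @b;    # larger element at each position of @a and @b
say @a »max» 5;     # each element of @a raised to at least 5
```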

Pm


Re: STD.pm

2008-04-05 Thread Patrick R. Michaud
On Sat, Apr 05, 2008 at 07:59:36PM -, John M. Dlugosz wrote:
> I'm trying to fathom STD.pm.
> 
> Maybe someone can help me trace through this one?  
> 
> How is
> $obj!privA = 1;
> parsed?
> 
> Reading expect_term, it trys , then  sees the 
> "$" and commits to the decision, reads "obj" as a , 
> then checks for a ".", but doesn't have similar logic for "!".  

I'm not sure what you mean by "then checks for a '.'" -- I don't
the dot in the  rule isn't the same as the method dot.

I think the way it parses is that we get $obj from the 
 =>  sequence at the beginning of ,
and then !privA is parsed via the  => dotty:sym sequence.

So the parse tree looks something like:

 = {
   = {
 = {
   # "$"
 # "obj"
}
  }
   = [
[0] = {
   = {
 # "!"
 = {  
# "privA"
}
  }
}
  ]
}

Pm


Re: STD.pm

2008-04-05 Thread Patrick R. Michaud
On Sat, Apr 05, 2008 at 05:32:27PM -0500, Patrick R. Michaud wrote:
> On Sat, Apr 05, 2008 at 07:59:36PM -, John M. Dlugosz wrote:
> > I'm trying to fathom STD.pm.
> > 
> > Maybe someone can help me trace through this one?  
> > 
> > How is
> > $obj!privA = 1;
> > parsed?
> > 
> > Reading expect_term, it trys , then  sees the 
> > "$" and commits to the decision, reads "obj" as a , 
> > then checks for a ".", but doesn't have similar logic for "!".  
> 
> I'm not sure what you mean by "then checks for a '.'" -- I don't
> the dot in the  rule isn't the same as the method dot.

Erg, I mis-edited this.  The dot that appears in the 
rule isn't the normal method dot -- I think it handles things like
C< $.( ... ) > and C< @.( ... ) >.

Pm


Re: syntax question on parameter lists

2008-04-10 Thread Patrick R. Michaud
On Thu, Apr 10, 2008 at 09:18:38PM -0700, Larry Wall wrote:
> On Fri, Apr 11, 2008 at 03:26:02AM -, John M. Dlugosz wrote:
> : S06 shows how to define named-only parameters, "marked with a prefix :".  
> But no example shows anything more than a bare parameter name.  No type is 
> ever given!
> : 
> : Looking through my copy of STD.pm, I'm baffled, as it seems not to take 
> types in parameter lists at all.
> 
> It's at the top of token parameter where there is a *.

Yes, but where does  resolve down to a typename?
My reading of STD.pm is that  becomes a  
(since it's not a 'where' clause in this case), and  is currently
one of , , or .

Pm


Re: Chained Comparisons ?

2008-04-16 Thread Patrick R. Michaud
On Wed, Apr 16, 2008 at 07:49:48AM -, John M. Dlugosz wrote:
> I know how comparisons are chained in Perl 6.  There is a very 
> short section on it in S03.
> 
> So, are the operators infix:{'<'} etc. written in the normal 
> way to take two arguments?  Then the language transforms 
> A op B op C into A op B AND B op C on an innate level.  Does 
> that apply to any user-defined operator with those names?  

It applies to any operator that has 'chain' associativity --
see S06, "Subroutine traits".

> If I want to make my own chained operator, perhaps the 
> curvy ≼, ≽, etc. or make my operator ≧ 
> a synonym for >=, how would I tell the compiler that they 
> belong to the same set of chained operators?

sub infix:«≽» ($a, $b) is equiv(&infix:«>=») { ... }

Or, if you want to create your own chained precedence level
separate from the existing relational ops, 

sub infix:«≽» ($a, $b) is assoc is looser(...)  { ... }

Pm


Re: Chained Comparisons ?

2008-04-17 Thread Patrick R. Michaud
On Wed, Apr 16, 2008 at 11:19:33PM -0400, Bob Rogers wrote:
> Pardon a lurker, but I'm not sure I understand the point of this.  In:
> 
>   if $x < $y < $z { ... }
> 
> I would expect a sensible compiler to short-circuit the "$x < $y" part, and
> indeed the "Chained comparisons" section of S03 (version 135) says
> 
>   A chain of comparisons short-circuits if the first comparison
>   fails . . .
> 
> But the definition of chaining associativity under "Operator precedence"
> says this is equivalent to:
> 
>   if ($x < $y) and ($y < $z) { ... }
> 
> (modulo multiple evaluation), but IIUC "and" is not short-circuiting.

"and" is short-circuiting.

>And wouldn't it also be helpful to implement chaining in such a way
> that a specialized chained op implementation couldn't mess it up by
> returning plain True?

FWIW, PCT and Rakudo do it this way -- the chained op returns a true/false
value and doesn't have to be aware of any chaining taking place.
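
To make the "modulo multiple evaluation" part concrete, here is a
small sketch, assuming a current Rakudo.  The noisy() helper is
hypothetical and only there to show when each operand gets evaluated.
It uses && rather than "and" so the expansion fits inside an
assignment.

```raku
# Hypothetical helper: reports each operand on stderr, then returns it.
sub noisy($n) { note "evaluating $n"; $n }

# Chained form: the middle operand is evaluated only once.
my $chained = noisy(1) < noisy(2) < noisy(3);

# Hand-written expansion: same truth value, but noisy(2) is evaluated twice.
my $expanded = (noisy(1) < noisy(2)) && (noisy(2) < noisy(3));

say "chained: $chained, expanded: $expanded";
```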

Pm


Re: Compile-time checking of assignment to read-only variables (Re: MMD distances)

2008-05-09 Thread Patrick R. Michaud
On Fri, May 09, 2008 at 03:02:28PM +0200, Carl Mäsak wrote:
> TSa (>):
> > sub bar ($x)
> > {
> >$x = 3;   # error, $x is readonly
> >foo($x);  # error, could hit rw Str
> > }
> 
> By the way, I hope it's possible to make the assignment `$x = 3` to
> the read-only variable $x a compile-time error.
> 
> In fact, I hope this to such a degree that I would like it to be part
> of a spec somewhere that a conforming Perl 6 compiler disallows
> assignments to read-only variables. I find nothing to this effect in
> S04 (but my grep-fu is imperfect, so I may just have missed it).
> 
> Pugs currently dies with a run-time error on this. Rakudo r27392 runs
> it fine and sets $x = 3 as if $x wasn't read-only.

In Rakudo's case, we just haven't implemented read-only traits
on variables yet.  But yes, I expect that it will be caught as
a compile-time error.

Pm


Re: Compile-time checking of assignment to read-only variables (Re: MMD distances)

2008-05-09 Thread Patrick R. Michaud
On Fri, May 09, 2008 at 05:09:31PM +0200, Carl Mäsak wrote:
> Pm (>):
> > In Rakudo's case, we just haven't implemented read-only traits
> > on variables yet.
> 
> Goodie. I guessed as much.
> 
> >  But yes, I expect that it will be caught as
> > a compile-time error.
> 
> And do you agree it's reasonable to expect this of every compiler?

Reasonable to expect it, yes -- but whether or not this rises to the
level of being a "requirement in the spec" may be a different matter.

I could envision the possibility that some otherwise-very-capable
Perl 6 implementation might be better served by having such checks
performed at runtime (they have to be done there also) and leaving
compile-time checking as an optimization.  I suspect this is what
Pugs did.  Or an implementation might not have a clear-cut notion
of "compile time".

So, as long as the assignment is properly prevented, I think that
may be sufficient.  (If the language designers decide otherwise,
that's okay with me too. :-)

Pm


possible clarification of item(), list(), etc.

2008-06-21 Thread Patrick R. Michaud
I think we need a slight wording improvement in S03.  Currently S03:1772
says that the C<list> contextualizer is equivalent to C<@()>.  
However, S05:2328 also says that C<@()> is a shorthand for C<@($/)>.

Taken together, these would seem to imply that C<list()> is equivalent
to C<@($/)>, which I suspect is not the case.  (I would expect
C<list()> to return an empty List.)

I'm guessing that S03 should be clarified to say something like the
list contextualizer is equivalent to C<@(...)>, to make it clearer(?)
that it's the form that expects an argument.

If the above is correct for C<list>, then similar arguments can
likely be made for C<item> / C<$()> and C<hash> / C<%()>.

In a similar vein, is C<item> a named unary?  In other words, is
C< item $a, $b >  equivalent to C< item($a), $b >  or 
C< item($a,$b) > ?

Thanks!

Pm


.join on Array

2008-06-26 Thread Patrick R. Michaud
Following up to a thread on p6c regarding method fallbacks and .join:

* What should [1,3,5].join('-')  produce?

* How about ([1,3,5], 20).join('-')  ?

Thanks!

Pm


Re: Rakudo test miscellanea

2008-06-26 Thread Patrick R. Michaud
On Thu, Jun 26, 2008 at 10:40:53AM -0400, Trey Harris wrote:
> In a message dated Thu, 26 Jun 2008, Moritz Lenz writes:
> >I assume that 'Num' is meant to be a non-complex.
> >Then it seems to make sense to assume:
> >Int is Rat
> >Rat is Num
> >Num is Complex
> >or am I off again?
> 
> S29 seems to have been assuming this, if I'm reading the multis correctly.

Keep in mind that some of S29's assumptions regarding types may no 
longer be true, especially since we've decided that many of the 
builtin methods and functions will now go in "Any" (e.g., C<join>).

Pm


Should C<grep> and C<reverse> work in C<Any>?

2008-06-29 Thread Patrick R. Michaud
Do C<grep> and C<reverse> act like the C<join> method, in that
they work for C<Any> objects and not just objects of type C<List>?

In other words, should  C< $x.grep(...) >  work even if
$x isn't normally a list type?

Pm


Re: Should C<grep> and C<reverse> work in C<Any>?

2008-06-30 Thread Patrick R. Michaud
On Mon, Jun 30, 2008 at 01:43:11PM +0200, Moritz Lenz wrote:
> Ovid wrote:
> > --- On Sun, 29/6/08, Patrick R. Michaud <[EMAIL PROTECTED]> wrote:
> > 
> >> Do C<grep> and C<reverse> act like the
> >> C<join> method, in that
> >> they work for C<Any> objects and not just objects of
> >> type C<List>?
> >> 
> >> In other words, should  C< $x.grep(...) >  work even
> >> if $x isn't normally a list type?
> > 
> > If I understand you correctly, I think you're asking if grep and map can be 
> > applied to junctions?  
> 
> I think Patrick meant something else.
> 
> The other day we had the discussion what $x.join($sep) should be,
> specifically if it should work for non-List $x. $Larry said yes, it
> should work, and the way to achieve that is to use Any.join.
> Now Patrick wants to know which of the various list methods need to be
> in Any.

Moritz is correct -- in order to get ('foo').join(':') to work as
people will expect, it was decided to define "universal" methods
in the Any class as part of the prelude [1].

So my question is really whether or not we consider grep and 
reverse to be universal methods in this sense also, so that
C< $x.grep(...) > and C< $x.reverse > will work even if $x 
isn't a value that normally does list-type operations.  

I'm suspecting that the answer is "yes, they are universal",
but wanted to confirm it.

Thanks!

Pm


Re: Should C<grep> and C<reverse> work in C<Any>?

2008-06-30 Thread Patrick R. Michaud
On Mon, Jun 30, 2008 at 07:25:11AM -0500, Patrick R. Michaud wrote:
> Moritz is correct -- in order to get ('foo').join(':') to work as
> people will expect, it was decided to define "universal" methods
> in the Any class as part of the prelude [1].

I forgot to include the reference link:

1.  http://groups.google.com/group/perl.perl6.compiler/msg/acf1cfbb16b998cf

Pm


Re: Should C<grep> and C<reverse> work in C<Any>?

2008-07-01 Thread Patrick R. Michaud
On Tue, Jul 01, 2008 at 05:36:26PM +0200, TSa wrote:
> This would save lots of overloads in Any in favor of a handful of
> standard coercions. These need proper anchorage in the dispatch
> system, of course. That to me means we need some definition of
> "conversion quality" and "conversion distance".

So far in Rakudo we haven't done any "overloads in Any" --
what has happened instead is that the methods in question have
simply moved into the Any class and out of whatever class they
were in previously.

Pm



Re: Interrogating signatures

2008-07-08 Thread Patrick R. Michaud
On Tue, Jul 08, 2008 at 12:47:57PM +0200, Jonathan Worthington wrote:
> Hi,
> 
> Is there an introspection interface for signatures defined anywhere? 
> I've looked through the synopses and don't see one. I'm thinking things 
> like:
> 
> * Can you do .arity and .count of a signature?
> * Can you iterate over a signature to get each element in there?
> * If so, what sort of thingy do you get to describe each element? 
> Some kind of parameter descriptor?

http://dev.perl.org/perl6/doc/design/syn/S06.html#The_want_function
describes C<.arity> and C<.count> .   I don't know about the others
yet.

Pm


Question about .sort and .reduce

2008-07-11 Thread Patrick R. Michaud

t/spec/S29-list/sort.t has the following test:

my @a = (2, 45, 6, 1, 3);
my @e = (1, 2, 3, 6, 45);
my @s = { $^a <=> $^b }.sort: @a;
is(@s, @e, '... with closure as direct invocant');

S29 doesn't show a 'sort' method defined on block/closure
invocants... should there be?  

Note that we already have:

my @s = sort { $^a <=> $^b }, @a;
my @s = @a.sort { $^a <=> $^b };

A similar question applies for .reduce in S29-list/reduce.t :

is(({ $^a * $^b }.reduce: 1,2,3,4,5), 120, "basic reduce works (3)");

Thanks!

Pm


Re: Question about .sort and .reduce

2008-07-11 Thread Patrick R. Michaud
On Fri, Jul 11, 2008 at 03:27:26PM +0200, TSa wrote:
> >Note that we already have:
> >
> >my @s = sort { $^a <=> $^b }, @a;
> >my @s = @a.sort { $^a <=> $^b };
> 
> Is that the adverbial block syntax? If not how
> would it look?

The adverbial block syntax would be:

@a.sort:{ $^a <=> $^b };
sort(@a) :{ $^a <=> $^b };

I'm not entirely certain if any of the following 
examples with adverbial blocks would also work.  I'm guessing
they do, but could use confirmation.

sort @a, :{ $^a <=> $^b };
sort @a :{ $^a <=> $^b };
sort :{ $^a <=> $^b }, @a;
@a.sort: :{ $^a <=> $^b };

Pm


S04-related closure question

2008-07-12 Thread Patrick R. Michaud
What would be the expected output from the following?

my $a = foo();
my $b;

{
    my $x = 1;
    sub get_x() { return $x; }
    sub foo()   { return &get_x; }
    $b = foo();
}

my $c = foo();

say "a: ", $a();
say "b: ", $b();
say "c: ", $c();

As a followup question, what about...?

my @array;
for 1..3 -> $x {
    sub get_x() { return $x; }
    push @array, &get_x;
}

for @array -> $f { say $f(); }

Pm


Re: Quick question: (...) vs [...]

2008-08-09 Thread Patrick R. Michaud
On Fri, Aug 08, 2008 at 11:08:51PM -0400, Brandon S. Allbery KF8NH wrote:
>
> On 2008 Aug 8, at 22:53, John M. Dlugosz wrote:
>
>> What is the difference between (1,2,3) and [1,2,3] ?
>
> IIRC one is a list, the other a reference to a list --- which in perl6  
> will be hidden for the most part. so practically speaking the difference 
> is minimal.

More directly, (1,2,3) will interpolate in list context, while
[1,2,3] will not.

say (1, 2, (3, 4, 5)).elems # 5
say (1, 2, [3, 4, 5]).elems # 3

The first example has a List containing five Ints, the second example
has a List containing two Ints and an Array.
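
The interpolation rule can also be modelled outside Perl 6.  The
following is only a rough sketch in Python, not Perl 6 code: tuples
stand in for parenthesized lists, Python lists stand in for [...]
Arrays, and interpolate() is a hypothetical helper name chosen for
this illustration.

```python
def interpolate(items):
    """Flatten nested tuples (stand-ins for parenthesized lists) but keep
    Python lists (stand-ins for [...] Arrays) as single elements."""
    result = []
    for item in items:
        if isinstance(item, tuple):
            # A parenthesized list interpolates into the surrounding list.
            result.extend(interpolate(item))
        else:
            # Anything else, including a bracketed Array, stays one element.
            result.append(item)
    return result


with_parens = interpolate((1, 2, (3, 4, 5)))
with_brackets = interpolate((1, 2, [3, 4, 5]))

print(len(with_parens))    # element count when the inner list interpolates
print(len(with_brackets))  # element count when the inner Array does not
```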

It's also useful to consider the difference between:

$x = (3); # $x becomes an Int
$x = [3]; # $x becomes an Array

Pm


Re: Closure vs Hash Parsing

2008-08-09 Thread Patrick R. Michaud
On Fri, Aug 08, 2008 at 07:32:52AM +0200, Carl Mäsak wrote:
> Jonathan (>):
> > That this means the { $_ => uc $_; } above would end up composing a Hash
> > object (unless the semicolon is meant to throw a spanner in the
> > hash-composer works?) It says you can use sub to disambiguate, but
> >
> > %ret = map sub { $_ => uc $_; }, split "", $text;
> >
> > Doesn't work since $_ isn't an automatic parameter for a sub, like it would
> > be in just a block (in the implementation, and if I understand correctly in
> > the spec too).
> 
> Out of curiosity, would this work?
> 
> %ret = map -> { $_ => uc $_; }, split "", $text;

A pointy block with nothing after the arrow is a sub with zero params,
so no, this wouldn't work.  One would need something like

%ret = map -> $_ { $_ => uc $_; }, split "", $text;

> Or this?
> 
> %ret = map { $^foo => uc $^foo; }, split "", $text;

I'm thinking S04 probably needs some clarification/updating here.
Any block that contains a (placeholder) parameter probably needs
to remain a sub, even if the block content is a comma-separated list 
starting with a pair/hash.

Pm


arrayref/hashref in spectest suite

2008-08-18 Thread Patrick R. Michaud
There are quite a few tests in the spectest suite that
make mention of "arrayref" and "hashref", and that expect
things to work like references do in Perl 5.  I'd like to
get some confirmation/clarification on them.

Here's one example:

my $foo = [ 42 ];
my $bar = { a => 23 };
$foo[1] = $bar;
$bar = 24;

say $foo[1]; #  "24" or undef ???

The test suite expects "24" to be output here,
treating C< $foo[1] > as a reference to the hash in
C<$bar>, such that any changes to C<$bar> are also reflected
in C<$foo[1]>.  Is this correct Perl 6?  I would somewhat expect
a reference to be instead handled using a statement like 

$foo[1] := $bar;

Comments and clarifications appreciated.

Pm


Re: whats wrong with this code?

2008-08-22 Thread Patrick R. Michaud
On Fri, Aug 22, 2008 at 04:34:04PM -0500, Andy Colson wrote:
> sub xsum (@list)
> {
>     my $i = 0;
>     print "summing: ";
>     for @list
>     {
>         $i += $_;
>         print $_,",";
>     }
>     say " = $i";
>     return $i;
> }
> say "sum = ", xsum( (1,2,3,4,5) );
>
> It returns this:
>
> summing: 1 2 3 4 5, = -1.2289e+09
> sum = -1.2289e+09

I suspect that Rakudo is having trouble binding array parameters
at the moment -- so it's likely a bug in the parameter handling code
(which I'm expecting will need some refactoring soon anyway).  I'm
guessing that Rakudo is binding @list as if it is a Scalar Array,
and thus the for loop sees only one element.

This probably deserves a tracking ticket at <[EMAIL PROTECTED]>.

Pm


Re: Does the capture object $/ retain a live tie to the string it matched?

2008-08-23 Thread Patrick R. Michaud
On Sat, Aug 23, 2008 at 12:55:44PM +0200, Moritz Lenz wrote:
> Carl Mäsak wrote:
> >  # should $/ really keep ties to $s like this?
> >  rakudo: my $s = "hello"; $s ~~ /hello/; $s = "goodbye"; say $/
> >  rakudo 29834: OUTPUT[goodb␤]
> 
> I'm pretty sure it's a bug in rakudo.

It's a bug somewhere, yes.  I suspect that PGE is tying to
the scalar variable itself where it needs to be tying to the
value.

> > The currently defined methods are
> > 
> > $/.from     # the initial match position
> > $/.to       # the final match position
> > $/.chars    # $/.to - $/.from
> > $/.orig     # the original match string
> > $/.text     # substr($/.orig, $/.from, $/.chars)
> 
> $/.text seems to be a bit superfluous, because it's already available as
> ~$/ and $/.Str

$/.text and ~$/ are different: $/.text always returns the
matched text, while ~$/ returns the stringification of the
result object (which could be different from the matched text
if C<make> was used inside of the regex).

"81" ~~ / (\d+) { make $0.sqrt } /;

say ~$/; # "9\n"
say $/.text; # "81\n"

(C<make> and closures in regexes are not implemented in Rakudo yet.)

Pm


Re: [perl #58302] [BUG] binary junctions of undefs in boolean context fails (21/37)

2008-08-24 Thread Patrick R. Michaud
On Sun, Aug 24, 2008 at 03:00:54PM -0700, Larry Wall wrote:
> : Question to p6l: do && and || autothread? Or do they collapse the
> : junction prior to evaluation? (I hope the latter, since I think it's
> : more dwimmy).
> : 
> : Also do prefix:<?> and prefix:<!> collapse the junction?
> 
> I think it would be best if all boolean contexts collapse consistently,
> and I would consider all of those to be boolean contexts.  More
> precisely, && and || are boolean on the left, but not on the right.

Yay!  

I'm assuming the same holds true for the conditional expression
in C.

Thanks,

Pm


Re: [perl #58302] [BUG] binary junctions of undefs in boolean context fails (21/37)

2008-08-24 Thread Patrick R. Michaud
On Mon, Aug 25, 2008 at 12:15:05AM +0200, Moritz Lenz wrote:
> Larry Wall wrote:
> > I think it would be best if all boolean contexts collapse consistently,
> > and I would consider all of those to be boolean contexts.  More
> > precisely, && and || are boolean on the left, but not on the right.
> 
> Very good.
> As a follow-up for the testers: should ok() expect an Object as its
> first argument? If so we could say
> 
> ok 1|2, 'Junction 1|2 is true in boolean context';

Keeping with my general philosophy that I'd like to keep
the requirements needed to run Test.pm (and the test suite)
as simple as possible, I'd prefer to not require type checking
within Test.pm in order for it to work right.

Beyond that, if we're testing a Junction in boolean
context, I think I would prefer to make that an explicit
part of the test itself:

ok ?(1|2), 'Junction 1|2 is true in boolean context';

Pm


Re: Speccing Test.pm?

2008-09-02 Thread Patrick R. Michaud
On Tue, Sep 02, 2008 at 02:10:39PM +0200, Moritz Lenz wrote:
> The test suite is considered "official" as in "everything that passes
> the (completed) test suite may name itself Perl 6", and nearly all of
> these files 'use Test'; However we don't ship an "official" Test.pm, nor
> do we define which test functions it should contain and export by
> default, nor their semantics.
> 
> Now this may sound a bit theoretical and far-fetched, but we've actually
> encountered test files that contain tests which are only in Rakudo's
> Test.pm (probably my fault), and otoh there are a few functions in pugs'
> Test.pm that are not used (for example unlike(), which is only used in
> t/02-test-pm/1-basic.t to test unlike()).
> 
> So how should we proceed? Should I assemble a list of commonly used test
> functions and remove all others both in the Test.pm's and the test files?

I'd like to see us spec the list of test functions needed by the
official test suite.  If possible, I'd also like those functions to be 
kept on the simple side, so that an implementation doesn't have to
have a nearly complete implementation of Perl 6 in order to start using
the suite.  For example, we shouldn't require advanced typing or
multimethod dispatch semantics in order for Test.pm to work.

> And then? Spec it? Or ship a prototype Test.pm as "official"?

I think it's good to have a prototype Test.pm that we can point to as
a reference, but I don't think we need to try to designate it as being
"official".

Pm


Re: Speccing Test.pm?

2008-09-02 Thread Patrick R. Michaud
On Tue, Sep 02, 2008 at 12:32:49PM -0700, Darren Duncan wrote:
> Patrick R. Michaud wrote:
>> I think it's good to have a prototype Test.pm that we can point to as
>> a reference, but I don't think we need to try to designate it as being
>> "official".
>
> [...]
> 2.  The Perl 6 language spec itself would specify a basic set of test  
> routines built-in to the language, in a Test namespace, much as it 
> defines collections of routines now for such as numbers and arrays and 
> standard I/O.  And so the basic test routines would be formally defined 
> in a Synopsis document.  

I disagree.  The testing we're likely to want to do as part of the language
test suite may be substantially different from what we want to provide
to module writers for testing.  In particular, I think that the test
suite harness should require only a minimal Perl 6 implementation
(note I said "harness", not the tests themselves), whereas it's much
more reasonable that a testing system used by module writers could/should
assume a fully working Perl 6 implementation.

It's a difference of "bootstrapping" versus "running environment".

> I also don't see the possibility of our "getting it wrong" in the design 
> to be such a big deal, since the odds are anything we think of now will 
> work well for many years, as Test.pm/Test::More has been fairly stable 
> already and meanwhile Perl 6 is versioned now, so we could make an 
> incompatible change to the Test related language spec in the future, and 
> as long as users say "use Perl-6.0.0" their code relying on the 
> older/current Test.pm like interface won't break.

"Perl 6 is versioned now" is a misnomer.  The *spec* calls for a versioned
Perl 6, but I'm not aware that any of the implementations do much with that.
At any rate, relying on handling multiple versions of Perl 6 to run Test.pm
is exactly one of those things I'd like to avoid in the official test suite.

Pm


Re: Regex repetition controlled by characters

2008-09-05 Thread Patrick R. Michaud
On Sun, Aug 31, 2008 at 08:33:48AM -0600, Stephen Simmons wrote:
> In S05, I found this regarding the generalized repetition specifier:
> 
>  ** '|'   # repetition controlled by presence of character
> 
> I tried it out with
> 
> rule thislist {  ** '|' };
> 
> and got (with Rakudo):
> 
> perl6regex parse error: Error in closure quantifier at offset 28, found '''
> 
> Is this feature unsupported at the moment or am I misunderstanding it?

It's unsupported at the moment -- currently this is RT #53100.

Pm


Iterator semantics

2008-09-09 Thread Patrick R. Michaud
I think my question can be best understood by example -- what
does the following produce?

my @a = 1,2,3,4,5;
for @a { .say; @a = (); }

My question is whether the change to @a inside the for loop
affects the iterator created at the beginning of the for loop.
In other words, would the above produce "1\n2\n3\n4\n5\n"  or
"1\n" ?
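
The two candidate behaviours can be contrasted in a small model.  This
is a minimal sketch in Python rather than Perl 6, and it does not say
which answer Perl 6 should give; iterate_snapshot() and iterate_live()
are hypothetical names for "the iterator sees the list as it was when
the loop started" and "the iterator sees changes made inside the loop".

```python
def iterate_snapshot(values):
    """Loop over a copy taken when the loop starts; clearing the original
    container inside the loop does not shorten the iteration."""
    seen = []
    for v in list(values):   # list(values) copies the elements up front
        seen.append(v)
        values.clear()       # models `@a = ()` inside the loop body
    return seen


def iterate_live(values):
    """Loop over the container itself; clearing it inside the loop ends
    the iteration after the first element."""
    seen = []
    for v in values:
        seen.append(v)
        values.clear()       # the live iterator now finds nothing left
    return seen


snapshot_result = iterate_snapshot([1, 2, 3, 4, 5])
live_result = iterate_live([1, 2, 3, 4, 5])
print(snapshot_result)
print(live_result)
```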

My followup question is then:

my @a = 1,2,3,4,5;
my @b = 6,7,8;

for @a,@b { .say; @b = (); }

I have more examples involving various aspects of list and
iterator semantics, but the answers to the above will help guide
my questions.

Pm


Re: Deep equivalence test of data structures

2008-09-14 Thread Patrick R. Michaud
On Sun, Sep 14, 2008 at 03:08:57PM +0200, Carl Mäsak wrote:
> Recently, in November, we've had reason to clone the Rakudo Test.pm
> and add an implementation (viklund++) of is_deeply, for testing
> whether two arrays, pairs or hashes are deeply -- recursively --
> equivalent. The method does what you'd think it does, checks the types
> of its parameters and recurses as necessary.
> 
> With the rich set of equality testing operators in Perl 6...
> 
>  
>  
> 
> ...and given constructs like [+] and <+>, it's actually a bit
> surprising to me that testing whether [1, [2, 3]] and [1, [2, 4]] are
> deeply equivalent isn't more easily expressed than it is. (Or
> maybe it is easy with current constructs, and I missed it? Can't rule
> that out.)
> 
> Couldn't an adverb to one or more of the existing equality operators
> do this nicely? Something like this:
> 
> say [1, [2, 3]] eqv [1, [2, 4]] :deeply;

Doesn't infix:<eqv> already somewhat imply the "is deeply" semantics,
at least for arrays and hashes?  

As far as current implementation status is concerned, I think
that the t/spec tests have this wrong in many cases -- they seem
to assume that infix:<eqv> tests object identity for equivalence
instead of comparing values.  For example, t/spec/S29-any/eqv.t has:

  ok !([1,2,3] eqv [4,5,6]), "eqv on anonymous array references (1)";
  #?pugs 2 todo 'bug'
  ok !([1,2,3] eqv [1,2,3]), "eqv on anonymous array references (2)";
  ok !([]  eqv []),  "eqv on anonymous array references (3)";

I think that the last two tests are incorrect, and that
[1,2,3] eqv [1,2,3]  should give a True result.

Pm


How to define a new value type?

2008-09-14 Thread Patrick R. Michaud
In [1], Larry writes:

> [...] we left = in the language
> to provide (to the extent possible) the same semantics that it
> does in Perl 5.  And when it comes to non-value types, there really
> are still references, even if we try not to talk about them much.
> So I think assignment is basically about copying around identities,
> where value types treat identity differently than object types (or
> at least, objects types that aren't pretending to be value types).

So, how does one get an object to pretend to be a value type for
purposes of assignment?  

Currently if I do the following

class Dog { ... }
my $a = Dog.new;
my $b = $a;

then $a and $b both refer to the same Dog object.  How would I
define Dog such that it acts like a value type -- i.e., so that
$b would be a copy of $a and future changes to the object in $a 
don't affect $b.

(Various parts of the synopses talk about .WHICH being used to
define value types, but I don't quite see how that fits in to
assignment.)
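
To make the distinction concrete, here is a minimal sketch in Python
(not Perl 6) of the two assignment behaviours being contrasted.  It
only models the difference between sharing one object and copying it;
it does not answer how a Perl 6 class would opt in to the copying
behaviour.  The Dog attribute "name" and the variable names are
assumptions made for the illustration.

```python
import copy


class Dog:
    def __init__(self, name):
        self.name = name


a = Dog("Rex")

shared = a             # reference-like assignment: both names, one object
copied = copy.copy(a)  # value-like assignment: an independent copy

a.name = "Fido"        # a later change made through the original variable

print(shared.name)     # follows the change, since it is the same object
print(copied.name)     # keeps the name it had when the copy was made
```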

Thanks!

Pm


Re: How to define a new value type?

2008-09-14 Thread Patrick R. Michaud
On Sun, Sep 14, 2008 at 09:08:19AM -0500, Patrick R. Michaud wrote:
> In [1], Larry writes:

Oops, I forgot the reference:

1.  http://groups.google.com/group/perl.perl6.language/msg/3f8efc31e4830f42

Pm


Re: Recommended Perl 6 best practices?

2008-09-14 Thread Patrick R. Michaud
On Sun, Sep 14, 2008 at 04:18:44PM +0200, Carl Mäsak wrote:
> Conrad (>):
> > Is there something more up-to-date concerning "Perl 6 best practices" that
> > are presently-recommended (by p6l or @Larry) than the following item on the
> > Perl 6 wiki?
> [...]
> That said, I do have one Perl 6-specific "best practice". I know
> you're looking for a collection, but one's a start. :) Here it is:
> 
> Do not combine 'ne' and '|', like this:
> 
> die "Unrecognized directive: TMPL_$directive"
>if $directive ne 'VAR' | 'LOOP' | 'IF';
> [...]
> The more general advice, then, would be not to use junctions together
> with negated equality operators. Instead, use the non-negated equality
> operator, and negate the whole expression.

This particular case is explicitly mentioned in S03:2529:

Use of negative operators with syntactically recognizable junctions may
produce a warning on code that works differently in English than in Perl.
Instead of writing
    if $a != 1 | 2 | 3 {...}
you need to write
    if not $a == 1 | 2 | 3 {...}

However, this is only a syntactic warning, and
    if $a != $b {...}
will not complain if $b happens to contain a junction at runtime.
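
The logic behind that warning can be spelled out with plain boolean
reductions.  The sketch below is Python, not Perl 6, and assumes the
reading in which the negated operator is applied to each alternative
of an any-junction; negated_inside() and negated_outside() are
hypothetical names for the two forms.

```python
ALLOWED = ("VAR", "LOOP", "IF")


def negated_inside(directive):
    """Model of `$directive ne 'VAR' | 'LOOP' | 'IF'` when the comparison
    is applied to each alternative: true if the directive differs from
    at least one of them."""
    return any(directive != name for name in ALLOWED)


def negated_outside(directive):
    """Model of `not $directive eq 'VAR' | 'LOOP' | 'IF'`: true only if
    the directive matches none of the alternatives."""
    return not any(directive == name for name in ALLOWED)


for directive in ("VAR", "BOGUS"):
    print(directive, negated_inside(directive), negated_outside(directive))
```

Under that reading the first form is true even for a valid directive
such as 'VAR', because 'VAR' differs from 'LOOP'; only the second form
distinguishes recognized from unrecognized directives.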

We might be able to craft a similar warning in Rakudo, but I'm
curious to see how/where STD.pm will choose to handle this.
(My guess is it will have something to do with 
infix_prefix_meta_operator:<!>, although we also have to have a way
to handle it for the infix:<!=> and infix:<ne> cases.)

Pm


What should +":2<1a>" produce?

2008-09-14 Thread Patrick R. Michaud
Given that we have

say +'12';        # 12
say +'0b1100';    # 12
say +'0x0c';      # 12

what should the following produce?

say +':2<1a>';    #  0?  Failure?  12?

Pm


Re: Deep equivalence test of data structures

2008-09-14 Thread Patrick R. Michaud
On Sun, Sep 14, 2008 at 01:59:22PM -0700, Michael G Schwern wrote:
> Eric Wilhelm asked me to chime in here.
> 
> is_deeply() is about checking that two structures contain the same values.
> This is different from checking that they're the same *things*, that they are
> in fact the same object or reference.
> 
> You need both.
> [...]

Since it wasn't explicitly mentioned in Schwern's post, I'll add
that Perl 6 uses infix:<===> for checking identity, as in "Are these two
things the same object or reference?"

my $a = [1,2,3];
my $b = [1,2,3];

$a eqv $b # True
$a === $b # False
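
For readers more familiar with other languages, the same split between
"same values" and "same object" exists elsewhere.  As a loose analogy
only (this is Python, not Perl 6, and the correspondence is not exact),
Python's == compares contents much as eqv does above, while its `is`
operator checks identity much as === does for these arrays.

```python
a = [1, 2, 3]
b = [1, 2, 3]

same_values = (a == b)   # compares contents element by element
same_object = (a is b)   # checks whether both names refer to one object
same_as_self = (a is a)  # a container is always identical to itself

print(same_values, same_object, same_as_self)
```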

Pm



Re: Offerings - edits pending

2008-09-15 Thread Patrick R. Michaud
On Mon, Sep 15, 2008 at 06:08:37PM -0500, John M. Dlugosz wrote:
> This is just a reminder that I have files posted at  
>  waiting for someone in  
> authority to inspect and merge.

Would it be worthwhile to provide them as diffs?  That way we
could easily see what is being changed (and is traditionally
the way we have reviewed and applied edits to the Synopses).

Yes, I know one can also download the files and make one's own diffs,
but tradition has been to review and apply diffs in the first place.

Pm

