date:20020423

Arrays of PMCs

2002-04-23 Thread Piers Cawley


Does anyone have an idea of when we're going to see these? Or hashes
of PMCs, I don't really care which...

-- 
Piers

   "It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
 -- Jane Austen?

Re: Regex and Matched Delimiters

2002-04-23 Thread Piers Cawley


Larry Wall <[EMAIL PROTECTED]> writes:
> /^pat$/m  /^^pat$$/

$$ is no longer the current PID? Or will we have to call that '${$}'
in a regex?

-- 
Piers

Re: Regex and Matched Delimiters

2002-04-23 Thread Piers Cawley

"Brent Dax" <[EMAIL PROTECTED]> writes:
> Larry Wall:
> That's...odd.  Is $$ (the variable) going away?
>
> # /./s// or /<.>/ ???
>
> I think that . is too common a metacharacter to be relegated to
> this.

I think you failed to notice that '/s' on the regex. In general . will
still mean . but if you want it to match *anything* including a new
line, you have to call it <.>. Personally, I don't have a problem with
that.

> # space(or \h for "horizontal"?)
>
> Same thinking as '.'.

The golfers aren't going to like it for sure. But most of the time
when I'm doing production code I have /x turned on anyway, and in that
context, if I want to match a space and only a space, I have to do [ ]
anyway. 

It might be nice if we could have m:X// mean 'space and hash match
themselves'. 

> # \t  also 
> # \n  also  or  (latter matching
> logical newline)
> # \r  also 
> # \f  also 
> # \a  also 
> # \e  also 
>
> I can tell you right now that these are going to screw people up.
> They'll try to use these in normal strings and be confused when it
> doesn't work.  And you probably won't be able to emit a warning,
> considering how much CGI Perl munches.

But assigning meaning to < and > is going to do that anyway. 

-- 
Piers

RE: Regex and Matched Delimiters

2002-04-23 Thread Brent Dax


Piers Cawley:
# "Brent Dax" <[EMAIL PROTECTED]> writes:
# > Larry Wall:
# > That's...odd.  Is $$ (the variable) going away?
# >
# > # /./s  // or /<.>/ ???
# >
# > I think that . is too common a metacharacter to be 
# relegated to this.
# 
# I think you failed to notice that '/s' on the regex. In 
# general . will still mean . but if you want it to match 
# *anything* including a new line, you have to call it <.>. 
# Personally, I don't have a problem with that.

Ah, you're right.  My bad.

# > # space  (or \h for "horizontal"?)
# >
# > Same thinking as '.'.
# 
# The golfers aren't going to like it for sure. But most of the 
# time when I'm doing production code I have /x turned on 
# anyway, and in that context, if I want to match a space and 
# only a space, I have to do [ ] anyway. 
# 
# It might be nice if we could have m:X// mean 'space and hash 
# match themselves'. 

I was thinking that  would replace \s.  If that isn't the case, I
have no real complaint (if you can turn off /x).

# > # \t  also 
# > # \n  also  or  (latter matching
# > logical newline)
# > # \r  also 
# > # \f  also 
# > # \a  also 
# > # \e  also 
# >
# > I can tell you right now that these are going to screw people up. 
# > They'll try to use these in normal strings and be confused when it 
# > doesn't work.  And you probably won't be able to emit a warning, 
# > considering how much CGI Perl munches.
# 
# But assigning meaning to < and > is going to do that anyway. 

Not if the things are meaningless outside of regexes.  For example,
lookahead sequences make absolutely no sense in a quoted string.

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

#define private public
--Spotted in a C++ program just before a #include

Re: Regex and Matched Delimiters

2002-04-23 Thread Ariel Scolnicov


Larry Wall <[EMAIL PROTECTED]> writes:

[...]

> /pat/x  /pat/

How do I do a "no /x"?  I know that commented /x'ed regexps are easier
reading (I even write them myself, I swear I do!), but having to
escape whitespace is often very annoying.  Will I really have to
escape all spaces (or use , below)?

This also marks a significant departure from UN*X-style regexps.  One
reason learning Perl's regexp language was so convenient (to me) was
that most of what I knew of UN*X regexps was applicable.
Changing the behaviour of a rather useful character (like ASCII 32) is
going to produce many references to the FAQ "Why doesn't /a word/
match 'a word'?".  (Having to escape #s is not as bad, as they are
less common).
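
The FAQ scenario above can already be reproduced in any engine with an x-style mode. A minimal sketch, assuming Python's re module as a stand-in (the thread is about Perl; re.VERBOSE plays the role of /x here):

```python
import re

text = "a word"

# Without the verbose flag, the space in the pattern is a literal space.
plain = re.search(r"a word", text)

# With re.VERBOSE (comparable to /x), unescaped whitespace in the pattern
# is ignored, so the pattern is effectively "aword" and does not match.
verbose = re.search(r"a word", text, re.VERBOSE)

# Escaping the space, or wrapping it in a character class, restores the match.
escaped = re.search(r"a\ word", text, re.VERBOSE)
bracketed = re.search(r"a[ ]word", text, re.VERBOSE)

print(plain is not None, verbose is not None,
      escaped is not None, bracketed is not None)
```

The two workarounds at the end are the "escape all spaces" and "[ ]" options mentioned in this thread.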

[...]

-- 
Ariel Scolnicov|http://3w.compugen.co.il/~ariels
Compugen Ltd.  |[EMAIL PROTECTED]
72 Pinhas Rosen St.|Tel: +972-3-7658117  "fast, good, and cheap;
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555   pick any two!"

Re: Regex and Matched Delimiters

2002-04-23 Thread Me


> /pat/i m:i/pat/ or // or even m ???

Why lose the modifier-following-final-delimiter
syntax? Is this to avoid a parsing issue, or
because it's linguistically odd to have a modifier
at the end?


> /^pat$/m /^^pat$$/

What's the mnemonic here? It feels the wrong
way round -- like a single ^ or $ should match
at newlines, double ^ or $ should only match
at start/end string.

Ah. The newline matches between the ^^ or $$.
That works.

Then there's the PID issue. Hmm. How to save $$
(it is nice for one liners)?

Sorry if this is a dumb suggestion, but could you have
just one assertion, say ^$, that alternates matching
just before and just after a newline?


> /./s // or /<.>/ ???

I'd expect . to match newlines by default. For a . that
didn't match newlines, I'd expect to need to use [^\n].


> space  (or \h for "horizontal"?)

Can one quote a substring of a regex? In a later part you
say that \Q...\E is going away, so it seems not. It would be
nice to say something like:

/foo bar baz 'qux waldo' emerson/

and have the space between qux and waldo be literal.
Similar arguments apply more broadly so that one
could escape the usual meaning of metacharacters etc.


> \Lstring\E \L
> \Ustring\E \U

Maybe, if I wasn't too far off with the quote mark
suggestion above, then  \L'string' would be more
natural.


> (?#...) {"..."} :-)

Will plain # comments work in p6  regexen?


> (?:...) <:...>
> (?=...) 
> (?!...) 
> (?<=...) 
> (?
> (?>...) 

Hmm. So <> are clustering just like ().

One difference is that () always capture whereas <>
only do so sometimes. Oh, and {} can too.

() are no longer used for clever stuff, <> are instead.
And {}.

Hmm. Time for bed.


--
ralph

Re: Please rename 'but' to 'has'.

2002-04-23 Thread Aaron Sherman

On Mon, 2002-04-22 at 19:22, Larry Wall wrote:

> Perl 6 will try to avoid synonyms but make it easy to declare them.  At
> worst it would be something like:
> 
> my sub operator:now ($a,$b) is inline { $a but $b }

I see your point, and it makes sense, but how will precedence work? What
would this do:

$i now foo but bar and 2;

or this:

$i but foo now bar and 2;

What if I want to define a synonym for and?

sub operator:also ($a,$b) is inline { $a and $b }
print $_ also die;

Scratching my head here in userville

Re: Regex and Matched Delimiters

2002-04-23 Thread Aaron Sherman

On Mon, 2002-04-22 at 21:53, Larry Wall wrote:

> * Parens always capture.
> * Braces are always closures.
> * Square brackets are always character classes.
> * Angle brackets are always metasyntax (along with backslash).
> 
> So a first whack at the differences might be:
[...]
> space  (or \h for "horizontal"?)
> {n,m} 
> 
> \t  also 

I want to know how he does this!! We sit around scratching our heads
looking for a syntax that fits and refines and he jumps in with
something that redefines and simplifies. Larry is wasted on Perl. He
needs to run for office ;-)

> \Lstring\E  \L
> \Ustring\E  \U

This one boggles me. Wouldn't that be something like:

 or string # ;-)

Seriously, it seems that "\L" would be confusing.

> \Q$var\E  $var    always assumed literal, so $1 is literal backref
> $var  <$var>  assumed to be regex

Very nice. I can get behind this, and a lot of people will thank you who
have to maintain code.

> =~ $re  =~ /<$re>/   ouch?

If $re is a regexp, wouldn't "$str =~ $re" turn into "$re.match($str)"?
Perhaps "$re.m $str" which is no more typing and pretty clear to me.

> Obviously the  and  syntaxes will be user extensible.
> We have to be able to support full grammars.  I consider it a feature
> that  looks like a non-terminal in standard BNF notation.  I do
> not consider it a misfeature that  resembles an HTML or XML tag,
> since most of those languages need to be matched with a fancy rule
> named  anyway.

It's too bad that  would be messy with standard Perl //-enclosed
regexes, as it would be a nice way to pass parameters to user-defined
tags. It would also allow XML-like propagation of results:

xyz

RE: Regex and Matched Delimiters

2002-04-23 Thread Luke Palmer


> # =~ $re  =~ /<$re>/   ouch?
> 
> I don't see the win.

Naturally =~ $re is a bit cleaner, but we can't do that because =~ is 
smart match, not regex match.


> # (?=...) 
> # (?!...) 
> # (?<=...)
> # (?
> 
> Cute.  (Wait a minute, aren't those reversed?)

Hehe. I thought that was cool. 

/foobar/
/ foobar/
 
You see, foobar before snafoo, which is what it is.
After snafoo, foobar.

It reads very nicely.



Luke

Re: Regex and Matched Delimiters

2002-04-23 Thread Iain Truskett


* Larry Wall ([EMAIL PROTECTED]) [23 Apr 2002 11:56]:

[...]
> * Parens always capture.

Maybe I missed something in the rest of the details, but is anything
going to replace non-capturing parens? It's just that I do find them
quite useful.

-- 
iain.

Re: Regex and Matched Delimiters

2002-04-23 Thread Luke Palmer


On Wed, 24 Apr 2002, Iain Truskett wrote:

> * Larry Wall ([EMAIL PROTECTED]) [23 Apr 2002 11:56]:
> 
> [...]
> > * Parens always capture.
> 
> Maybe I missed something in the rest of the details, but is anything
> going to replace non-capturing parens? It's just that I do find them
> quite useful.

Yes.

/indeed <:this>+ wont capture/
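
For comparison, a minimal sketch of the same capture-versus-cluster distinction in an existing engine. It uses Python's re module as a stand-in, where (?:...) is the non-capturing group that <:...> is proposed to replace; the sample string is made up:

```python
import re

subject = "xababc"

# Plain parens both group and capture.
capturing = re.search(r"(ab)+(c)", subject)

# (?:...) groups for the quantifier but leaves the capture count alone.
clustering = re.search(r"(?:ab)+(c)", subject)

print(capturing.groups())
print(clustering.groups())
```

Both patterns match the same text; only the numbering of the captured groups differs.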

Re: Regex and Matched Delimiters

2002-04-23 Thread Aaron Sherman

On Tue, 2002-04-23 at 04:32, Ariel Scolnicov wrote:
> Larry Wall <[EMAIL PROTECTED]> writes:
> 
> [...]
> 
> > /pat/x  /pat/
> 
> How do I do a "no /x"?  I know that commented /x'ed regexps are easier
> reading (I even write them myself, I swear I do!), but having to
> escape whitespace is often very annoying.  Will I really have to
> escape all spaces (or use , below)?
> 

I'm not sure that that's a bad thing. Regular expressions are the
hairiest, ugliest thing in Perl. If they change in this way, I see them
getting a tad more verbose, and a whole lot more readable and
maintainable. Besides you can always do this:

$str = "COPYING file for more information";
/$str/

since scalars will be interpolated as quoted by default.
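
A minimal sketch of that difference, assuming Python's re module as a stand-in: re.escape plays the role of "interpolated as quoted", and re.VERBOSE plays the role of an always-on /x. The sample text is made up.

```python
import re

needle = "COPYING file for more information"
text = "See the COPYING file for more information."

# Interpolated as a pattern under x-style rules: the spaces are ignored,
# so the pattern is effectively "COPYINGfileformoreinformation".
as_pattern = re.search(needle, text, re.VERBOSE)

# Interpolated as a quoted literal: every character matches itself.
as_literal = re.search(re.escape(needle), text, re.VERBOSE)

print(as_pattern is not None, as_literal is not None)
```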

Re: Please rename 'but' to 'has'.

2002-04-23 Thread Larry Wall


Aaron Sherman writes:
: On Mon, 2002-04-22 at 19:22, Larry Wall wrote:
: 
: > Perl 6 will try to avoid synonyms but make it easy to declare them.  At
: > worst it would be something like:
: > 
: > my sub operator:now ($a,$b) is inline { $a but $b }
: 
: I see your point, and it makes sense, but how will precedence work? What
: would this do:
: 
:   $i now foo but bar and 2;
: 
: or this:
: 
:   $i but foo now bar and 2;
: 
: What if I want to define a synonym for and?
: 
: sub operator:also ($a,$b) is inline { $a and $b }
: print $_ also die;
: 
: Scratching my head here in userville

Precedence is set with the "like" property:

my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
sub operator:also ($a,$b) is like("and") is inline { $a and $b }

Larry

Re: Please rename 'but' to 'has'.

2002-04-23 Thread Buddha Buck

At 08:58 AM 04-23-2002 -0700, Larry Wall wrote:
>Precedence is set with the "like" property:
>
> my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
> sub operator:also ($a,$b) is like("and") is inline { $a and $b }

OK, but that limits you to the, um, 24 standard levels of precedence.  What 
do you do if you don't think that that's enough.  Let's say you want to 
define a "nand" operator:

my sub operator:nand ($a, $b) is inline { not ($a and $b) }

but you want nand to have a precedence lower than the existing 'and' but 
higher than the existing 'or' (for some reason I can't imagine 
offhand).  It isn't like() anything, since there isn't anything currently 
between 'and' and 'or'.  Would that be something like:

my sub operator:nand ($a, $b) is below("and") is inline {not ($a and $b) }

Re: Regex and Matched Delimiters

2002-04-23 Thread Larry Wall


Brent Dax writes:
: # ?pat?   // or even m ???
: 
: Whoa, those are moving to the front?!?

The problem with options in general is that they can't easily modify
parsing if they come in back.  Now in the particular case of /f and /i,
it probably doesn't matter.  But I was trying to see if there was some way
to do away with trailing options altogether.  This might even extend to
things like:

qq:s"$interpolates @doesn't %doesn't"

And that's definitely a situation where it changes the parse.  Hmm, if
strings have options, they're probably additive, so to add scalar
interpolation you'd want to base it on "q", not "qq":

q:s"$interpolates @doesn't %doesn't"

On the other hand, that doesn't work for the other things like "qr", so
maybe any of :s, :a, :h turn off default interpolations, so qr:a would
only interpolate arrays, for instance.

: # /pat/x  /pat/
: # /^pat$/m  /^^pat$$/
: 
: That's...odd.  Is $$ (the variable) going away?

Maybe.  It'd be $*PID if so, since it's truly global to the process.
But if not, we could special case $$ inside regexes, just as we already
special case $ itself.

: # \p{prop}<+prop>  ???
: # \P{prop}<-prop>  ???
: 
: Intriguing.

Yeah, especially when you start stacking them.  But maybe we're treading
on [...] territory.  It could be argued that <...> is just a generalized
form of POSIX's [:...:] construct

: # \t  also 
: # \n  also  or  (latter matching
: logical newline)
: # \r  also 
: # \f  also 
: # \a  also 
: # \e  also 
: 
: I can tell you right now that these are going to screw people up.
: They'll try to use these in normal strings and be confused when it
: doesn't work.  And you probably won't be able to emit a warning,
: considering how much CGI Perl munches.

I can see pragmatic variants in which those *do* interpolate by default.
And pragmatic variants where they don't.

: # \033      same
: # \x1B      same
: # \x{263a}  \x<263a> ???
: 
: Why?  Wouldn't we want the same thing to work in quoted strings?  (Or
: are those changing syntaxes too?)

I'm just wondering how far I can drive the principle that {} is always
a closure (even though it isn't).  I admit that it's probably overkill
here, which is why there are question marks.

: # \c[ same
: # \N{name}
: # \l  same
: # \u  same
: # \Lstring\E  \L
: # \Ustring\E  \U
: 
: So that's changed from whenever you talked about \q{} ?

Possibly.  Again, the question is whether {} more strongly imply
something that's not true.  But curlies were so overloaded in Perl 5
that I don't think people are going to necessarily expect them to do
only one thing.  Still, if <> are taking over the role of "unmarked
metasyntactic delimiters", maybe they belong here too.

: # \E  gone
: # [\040\t]   \h  plus any Unicode horizontal whitespace
: # [\r\n\ck]  \v  plus any Unicode vertical whitespace
: #
: # \b  same
: # \B  same
: 
: # \A  ^
: # \Z  same?
: # \z  $
: 
: Are you sure that optimizes for the common case?

No, I'm not sure, but we have to clean up the \A...\z mess somehow.

: # \G  , but assumed in nested patterns?
: #
: # \1  $1
: #
: # \Q$var\E  $var    always assumed literal, so $1 is literal
: backref
: 
: So these are reinterpolated every time you backtrack?  Are you *trying*
: to destroy regex performance?  :^)

They're not interpolated.  They're matched, as in string comparison, just
as backrefs are matched right now.

: # $var  <$var>  assumed to be regex
: 
: What if $var is a qr//ed object?

Then it's a pretty easy assumption that it's a regex.  :-)

: # =~ $re  =~ /<$re>/   ouch?
: 
: I don't see the win.

No difference if $re is qr//, but if it's not, that is the syntax for
forcing $re to be interpreted as a regex.

: # (??{$rule}) 
: # (?{ code }) { code } with failure semantics
: # (?#...) {"..."} :-)
: # (?:...) <:...>
: # (?=...)   
: # (?!...) 
: # (?<=...)  
: # (?
: 
: Cute.  (Wait a minute, aren't those reversed?)

Nope, I realized they were ambiguous depending on whether you think of
them as declarative or operational, but I settled on the declarative
reading because it works with their being assertions.  All the other
options I could think of are either really clunky or similarly ambigu

Re: Regex and Matched Delimiters

2002-04-23 Thread Larry Wall


Aaron Sherman writes:
: On Mon, 2002-04-22 at 21:53, Larry Wall wrote:
: 
: > * Parens always capture.
: > * Braces are always closures.
: > * Square brackets are always character classes.
: > * Angle brackets are always metasyntax (along with backslash).
: > 
: > So a first whack at the differences might be:
: [...]
: > space  (or \h for "horizontal"?)
: > {n,m}   
: > 
: > \t  also 
: 
: I want to know how he does this!!

Could have something to do with the fact that I've been banging my head
against this for a couple of months already...

: We sit around scratching our heads
: looking for a syntax that fits and refines and he jumps in with
: something that redefines and simplifies. Larry is wasted on Perl. He
: needs to run for office ;-)

Agh, no!  I'm okay at simplifying, but I'm terrible at oversimplifying.

: > \Lstring\E  \L
: > \Ustring\E  \U
: 
: This one boggles me. Wouldn't that be something like:
: 
:  or string # ;-)

Well,  makes sense only if <> works in ordinary double quotes.

: Seriously, it seems that "\L" would be confusing.

Potentially, except that you almost never use it on anything but variable
interpolations.  So \L<$foo> would be a better example.  The confusing thing
is that $foo would not be assumed to be a regular expression, whereas it
would in bare <$foo> (at least in a regex).

: > \Q$var\E  $var    always assumed literal, so $1 is literal
: backref
: > $var  <$var>  assumed to be regex
: 
: Very nice. I can get behind this, and a lot of people will thank you who
: have to maintain code.

Well, almost anything is an improvement over the current syntax.

: > =~ $re  =~ /<$re>/   ouch?
: 
: If $re is a regexp, wouldn't "$str =~ $re" turn into "$re.match($str)"?
: Perhaps "$re.m $str" which is no more typing and pretty clear to me.

Sure, but I was illustrating the situation of a non-qr string being
forced to be a regex.

: > Obviously the  and  syntaxes will be user extensible.
: > We have to be able to support full grammars.  I consider it a feature
: > that  looks like a non-terminal in standard BNF notation.  I do
: > not consider it a misfeature that  resembles an HTML or XML tag,
: > since most of those languages need to be matched with a fancy rule
: > named  anyway.
: 
: It's too bad that  would be messy with standard Perl //-enclosed
: regexes, as it would be a nice way to pass parameters to user-defined
: tags. It would also allow XML-like propagation of results:
: 
:   xyz

Gee, maybe we could make a way for people to use alternate delimiters
like they've always done with s///.  :-)

Larry

RE: Regex and Matched Delimiters

2002-04-23 Thread Brent Dax


Sorry to reply to the same message twice, but I just noticed something.

Larry Wall:
# {n,m} 

Isn't that the only use of angle brackets as a quantifier?  That's going
to make parsing more difficult...

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)


Re: Please rename 'but' to 'has'.

2002-04-23 Thread ggermain


In reply to Buddha Buck <[EMAIL PROTECTED]>:

> At 08:58 AM 04-23-2002 -0700, Larry Wall wrote:
> >Precedence is set with the "like" property:
> >
> > my sub operator:now ($a,$b) is like("but") is inline { $a but $b
> }
> > sub operator:also ($a,$b) is like("and") is inline { $a and $b }
> 
> OK, but that limits you to the, um, 24 standard levels of precedence. 
> What 
> do you do if you don't think that that's enough.  Let's say you want to
> 
> define a "nand" operator:
> 
> my sub operator:nand ($a, $b) is inline { not ($a and $b) }
> 
> but you want nand to have a precedence lower than the existing 'and' but
> 
> higher than the existing 'or' (for some reason I can't imagine 
> offhand).  It isn't like() anything, since there isn't anything
> currently 
> between 'and' and 'or'.  Would that be something like:
> 
> my sub operator:nand ($a, $b) is below("and") is inline {not ($a and $b)
> }
> 

24 levels of precedence should be enough, else you can always resort to parens.

Guillaume

Re: Regex and Matched Delimiters

2002-04-23 Thread Larry Wall


Me writes:
: > /pat/i m:i/pat/ or // or even m ???
: 
: Why lose the modifier-following-final-delimiter
: syntax? Is this to avoid a parsing issue, or
: because it's linguistically odd to have a modifier
: at the end?

Haven't decided for sure to lose it, but it does have several problems.
First is the parsing issue, but there's also what in natural language
is called the "end weight" problem.  We often rearrange our sentences
in English so that the short things come first and the long things come
last.  That's why you choose indirect object syntax sometimes and not
others.  Try turning either of these to the other form:

I gave him a big, smelly tuna-fish and cucumber sandwich.
I gave the sandwich to a big, smelly tuna fisherman and his dog "Cucumber".

Now, options are always little, so it seems that they should come early.

: > /^pat$/m /^^pat$$/
: 
: What's the mnemonic here? It feels the wrong
: way round -- like a single ^ or $ should match
: at newlines, double ^ or $ should only match
: at start/end string.

Well, I thought of it as ^^ or $$ matching potentially multiple places
in the string.

: Ah. The newline matches between the ^^ or $$.
: That works.

Except that the newline doesn't match between the characters.  You could
say /$$\n^^/ for instance.
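
A minimal sketch of the same point with today's flag-based spelling, assuming Python's re module as a stand-in: under re.MULTILINE, ^ and $ behave like the proposed ^^ and $$, and the newline sits between the two assertions rather than being matched by either. The sample text is made up.

```python
import re

text = "one\ntwo\nthree"

# Default anchors: ^ only at the start, $ only at the end of the string.
whole_string = re.findall(r"^\w+$", text)

# Multi-line anchors: they can match at several places in the string.
per_line = re.findall(r"^\w+$", text, re.MULTILINE)

# End-of-line, then a literal newline, then start-of-line.
newline_between = re.search(r"$\n^", text, re.MULTILINE)

print(whole_string, per_line, newline_between.span())
```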

: Then there's the PID issue. Hmm. How to save $$
: (it is nice for one liners)?

$PID is only two chars worse.  (The * of $*PID is optional.)

: Sorry if this is a dumb suggestion, but could you have
: just one assertion, say ^$, that alternates matching
: just before and just after a newline?

^$ matches a null string.  That aside, I don't think stateful assertions
would be unconfusing in the extreme.

: > /./s // or /<.>/ ???
: 
: I'd expect . to match newlines by default. For a . that
: didn't match newlines, I'd expect to need to use [^\n].

But . has never matched newlines by default, not even in grep.  Possibly
some editors do it that way, but if so, it's non-standard.
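
A minimal sketch of the three behaviours being compared, assuming Python's re module as a stand-in (re.DOTALL corresponds to /s, which the proposed <.> would replace); the sample text is made up:

```python
import re

text = "ab\ncd"

# Default: . matches anything except a newline.
default_dot = re.findall(r".", text)

# With the s-style flag: . matches the newline too.
dotall_dot = re.findall(r".", text, re.DOTALL)

# The explicit spelling of "anything but newline" works in either mode.
not_newline = re.findall(r"[^\n]", text, re.DOTALL)

print(default_dot, dotall_dot, not_newline)
```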

: > space  (or \h for "horizontal"?)
: 
: Can one quote a substring of a regex? In a later part you
: say that \Q...\E is going away, so it seems not. It would be
: nice to say something like:
: 
: /foo bar baz 'qux waldo' emerson/
: 
: and have the space between qux and waldo be literal.
: Similar arguments apply more broadly so that one
: could escape the usual meaning of metacharacters etc.

Well, <"qux waldo"> could be made to mean that, I suppose.  For that
matter, so might \q{qux waldo}.  Er, \q?

: > \Lstring\E \L
: > \Ustring\E \U
: 
: Maybe, if I wasn't too far off with the quote mark
: suggestion above, then  \L'string' would be more
: natural.

Maybe \L and \q are in the same class, in which case that would work.

: > (?#...) {"..."} :-)
: 
: Will plain # comments work in p6  regexen?

Yes, just as in /x.  And there's no ambiguity in the end delimiter
any more because we parse in one pass.

: > (?:...) <:...>
: > (?=...) 
: > (?!...) 
: > (?<=...) 
: > (?
: > (?>...) 
: 
: Hmm. So <> are clustering just like ().

Yes, and you can quantify them where it makes sense.

: One difference is that () always capture whereas <>
: only do so sometimes. Oh, and {} can too.

Eh?  <> never capture.  None of those constructs above capture.
Nothing inside a {} can capture anything that influences the paren
count outside the {}, because any inner regex has its own paren count.

: () are no longer used for clever stuff, <> are instead.
: And {}.

Basically, yes.

: Hmm. Time for bed.

Why?  I just got up.  :-)

Larry

Re: Please rename 'but' to 'has'.

2002-04-23 Thread Buddha Buck

At 01:12 PM 04-23-2002 -0400, [EMAIL PROTECTED] wrote:

>24 levels of precedence should be enough, else you can always resort to 
>parens.

I would have agreed, except that I would have also said that the 14 
precedence levels of C should be enough as well -- yet we seem to have 
discovered uses for 10 more.

>Guillaume

Re: Please rename 'but' to 'has'.

2002-04-23 Thread Larry Wall


Buddha Buck writes:
: At 08:58 AM 04-23-2002 -0700, Larry Wall wrote:
: >Precedence is set with the "like" property:
: >
: > my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
: > sub operator:also ($a,$b) is like("and") is inline { $a and $b }
: 
: OK, but that limits you to the, um, 24 standard levels of precedence.  What 
: do you do if you don't think that that's enough.  Let's say you want to 
: define a "nand" operator:
: 
: my sub operator:nand ($a, $b) is inline { not ($a and $b) }
: 
: but you want nand to have a precedence lower than the existing 'and' but 
: higher than the existing 'or' (for some reason I can't imagine 
: offhand).  It isn't like() anything, since there isn't anything currently 
: between 'and' and 'or'.  Would that be something like:
: 
: my sub operator:nand ($a, $b) is below("and") is inline {not ($a and $b) }

Yes, that's what I was thinking.  And the dimensions shrink every time
you do that, so if something is "above" your 'nand', it doesn't go
back to being the same as 'and'.

Though since people can't seem to keep up and down straight on their
precedence charts, I'd go for "tighter" and "looser" or some such.  I
think I'm even on the record somewhere about that.

Larry

Re: Please rename 'but' to 'has'.

2002-04-23 Thread Dan Sugalski


At 12:36 PM -0400 4/23/02, Buddha Buck wrote:
>At 08:58 AM 04-23-2002 -0700, Larry Wall wrote:
>>Precedence is set with the "like" property:
>>
>> my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
>> sub operator:also ($a,$b) is like("and") is inline { $a and $b }
>
>OK, but that limits you to the, um, 24 standard levels of 
>precedence.  What do you do if you don't think that that's enough

Internally precedence is going to be stored as a floating-point 
number. Dunno how it'll be exposed at the language level, but at 
least there'll be more than just 20 or so levels.
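
A minimal sketch of why a floating-point representation leaves room between existing levels. The numeric values, the "higher binds tighter" convention, and the names between and xnand are all hypothetical; nothing here is Parrot's or Perl 6's actual scheme.

```python
def between(tighter: float, looser: float) -> float:
    """Pick a precedence strictly between two existing levels."""
    return (tighter + looser) / 2


# Hypothetical starting levels: a larger number binds more tightly.
levels = {"and": 2.0, "or": 1.0}

# nand goes below 'and' but above 'or'.
levels["nand"] = between(levels["and"], levels["or"])

# Something placed just above nand does not land back on the level of 'and'.
levels["xnand"] = between(levels["and"], levels["nand"])

ordered = sorted(levels, key=levels.get, reverse=True)
print(ordered, levels["nand"], levels["xnand"])
```

Each insertion halves the remaining gap, which matches the remark elsewhere in this thread that the dimensions shrink every time a new level is squeezed in.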
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: Regex and Matched Delimiters

2002-04-23 Thread Larry Wall


Brent Dax writes:
: Sorry to reply to the same message twice, but I just noticed something.
: 
: Larry Wall:
: # {n,m}   
: 
: Isn't that the only use of angle brackets as a quantifier?  That's going
: to make parsing more difficult...

How so?  It's just a one-character lookahead to see if it's a digit.

But we could actually use a more general syntax:



Larry

[PATCH] Re: Arrays of PMCs

2002-04-23 Thread Steve Fink

On Mon, Apr 22, 2002 at 05:40:09PM +0100, Piers Cawley wrote:
> Does anyone have an idea of when we're going to see these? Or hashes
> of PMCs, I don't really care which...

Well, we don't have hashes of anything. We already have arrays of
PMCs. You just can't get the PMCs out, only their integer or numeric
values. :-)

True arrays of PMCs should probably be blocked on the whole keyed
thing. Keyed access is partially implemented, but there's way too much
manual code repetition at the moment, and the assembly syntax is
wrong:

EVENTUAL            CURRENT
set I0, P0[7]       get_keyed I0, P0, 7
set P0[7], I0       set_keyed P0, 7, I0
set P0[0], P1[1]    not possible
set I0, P0[P1]      not possible -- I'm not even sure what this will do
set P1, P0[7]       get_keyed P1, P0, 7 (requires the recently committed patch)
set P0[7], P1       set_keyed P0, 7, P1 (requires the recently committed patch)

So far, I've just kind of thrown in more and more [sg]et_keyed
variants as they were needed. To continue in this grand tradition,
I've just committed a patch to allow getting and setting of the PMCs
in arrays. However, I'm not really sure how 'set P0[7], P1' is
supposed to behave. I just overwrite the whole P0[7] array slot,
discarding the previously held PMC. I don't remember the whole 'set
P0, P1' discussion well enough to venture an opinion on whether the
previous occupant gets to have a say in what happens.

This code now works:

# P0 is initialized to an array containing the command-line arguments
new P1, PerlArray
set_keyed P1, 0, P0  # set P1[0], P0
get_keyed P31, P1, 0 # set P31, P1[0]
get_keyed S0, P31, 0 # set S0, P31[0]
print "Command name: "
print S0
print "\n"
end

Re: [PATCH] Re: Arrays of PMCs

2002-04-23 Thread Steve Fink


Oops, forgot to change the subject line. No patch. Patch already
committed.

Re: [netlabs #522] BASIC hangs and crashes, Win32 MSVC++, 0.0.5

2002-04-23 Thread Dan Sugalski


At 12:25 PM +0200 4/19/02, Peter Gibbs wrote:
>Mike Lambert wrote:
>>  Undoing the patch in resources.c seems to fix the problem.
>>
>>  Changing:
>>  ((Buffer *)buffer)->buflen = req_size;
>>  to:
>>  ((Buffer *)buffer)->buflen = size;
>>  makes it work again.
>
>Just for interest, the problem here is that the rounding is always up to the
>next multiple of 16. So, for example, a zero-length string would have buflen
>set to 16 (actually it is set back to zero in string_make, but that just
>slows the process down slightly); string_copy would ask for a buffer of 16
>and get back a buffer of 32, etc, so every time a string is copied, it grows
>by 16 bytes.

That's not true. Since we're copying and allocating based on the 
original length, we're not going to grow.

However, the point is well-taken--having a version that allocates and 
returns the real length in the buffer's useful for strings. I'm going 
to add one in a minute here.
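
A minimal sketch of the two scenarios under discussion. The actual rounding expression in resources.c is not shown in this thread, so round_always_up is only one formula that would produce the behaviour Peter describes, and all function names here are hypothetical.

```python
ALIGN = 16  # the rounding unit mentioned in the thread


def round_always_up(size: int) -> int:
    """Advance to the next multiple of ALIGN, even if size is already aligned."""
    return (size // ALIGN + 1) * ALIGN


def round_if_needed(size: int) -> int:
    """Conventional alternative: leave already-aligned sizes unchanged."""
    return (size + ALIGN - 1) // ALIGN * ALIGN


# Peter's scenario: each copy requests the previous buffer length.
size = 0
grows = []
for _ in range(3):
    size = round_always_up(size)
    grows.append(size)

# Dan's scenario: each copy requests the original string length (5 here).
stable = [round_always_up(5) for _ in range(3)]

print(grows, stable, round_if_needed(16))
```

Which scenario applies depends on whether the copy is sized from the buffer length or from the string length, which is exactly the point of disagreement above.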

>This effect is exacerbated by the fact that "set S1, S2" does a
>string_copy - I am still not sure what is supposed to happen here; I believe
>that the pure set opcode should just be doing a register copy?? There is a
>clone opcode which also does a string_copy, which seems reasonable.

set S0, S1 is broken. I'm fixing that now.
-- 
 Dan


Re: Using Parrot

2002-04-23 Thread Dan Sugalski


At 6:19 PM +0100 4/19/02, Alberto Manuel Brandão Simões wrote:
>But, this e-mail is not to say this, but to request some kind of help.
>I'm used to check-out, compile and test parrot, looked at the language
>(well, a long time ago) and I'm needing to look to it again. The
>question is, what documents do you think I should read to start quickly
>using Parrot? PDD's, any pod from Parrot cvs tree... any other thing?

Did you get sufficient information? If not, where have we left gaps?

FWIW, I'd love to be kept updated on the state of your project. A
link to it from the parrotcode.org website would also be in order, I
think. (I need to pass on info on Cardinal, the Ruby on Parrot
project, soon)
--
 Dan


Re: goto ADDRESS()

2002-04-23 Thread Dan Sugalski

At 7:42 PM +0200 4/19/02, Marco Baringer wrote:
>Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>>  Ah, this is incorrect. goto ADDRESS should go to an absolute address,
>>  period. It's for use in those times when you *have* an absolute
>>  address--for example when you've just fetched the address of a
>>  subroutine from a symbol table.
>
>but what do i put in the symbol table?

The absolute address.

>every address i have in the
>symbol is, unless my understanding is severely flawed, determined at
>compile time by the assembler and is relative to the start of the byte
>code.

The addresses are always absolute, and are determined at load or 
runtime. Branches are relative, and can be determined at compile time.

>if i have code like (perl5 syntax to avoid confusion)
>
>my $f = sub { print "hello" };
>$f->();
>
>i imagine that will become more or less: (in pseudo pasm)
>
>closure_000:
> print "hello"
> ret
>
>main:
> set_sym P0, '$f', [closure_000]
> fetch_sym I0, P0, '$f'
> jsr I0

Something like that. But in this case, the set_sym (which'll probably 
be 'make_closure' or something) will store the absolute address into 
the symbol table or PMC in P0.

>  > Jumping from the start of the
>>  bytecode segment is an interesting idea, but since it's only valid
>>  when used to transfer control from within the current segment, you
>>  might as well just use goto OFFSET instead.
>
>sorry, but i don't understand.

There's no real functional difference between "offset from current 
spot" and "offset from beginning of segment", so I'm unwilling to 
have two separate relative addressing modes.
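
A minimal sketch of that arithmetic, in Python rather than PASM. The
load address and the label offsets below are made-up values for
illustration; nothing here is Parrot's actual loader or assembler.

```python
# Hypothetical values: the assembler fixes label offsets within the
# segment, while the load address is only known at load or run time.
SEGMENT_BASE = 0x4000
LABELS = {"closure_000": 0, "main": 2}


def absolute(label: str, base: int = SEGMENT_BASE) -> int:
    """Absolute address of a label: needs the load address."""
    return base + LABELS[label]


def branch_offset(from_offset: int, label: str) -> int:
    """Relative branch (offset from the current spot): no base needed."""
    return LABELS[label] - from_offset


def from_segment_start(from_offset: int, rel: int) -> int:
    """Turn 'offset from current spot' into 'offset from segment start'."""
    return from_offset + rel
```

Since the two relative forms differ only by the current position, which
the assembler already knows, one of them is enough; only the absolute
form needs information that does not exist until load time.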

>  > --
>
>all i want to be able to do is:
>
>set I0, [whatever]
>
>jsr I0
>
>or
>
>set I0, [wherever]
>
>jump I0
>
>and as far as i know i can't currently do this.

Right. There's currently no way to get the absolute address of a 
label. That should be fixed--in fact, it will be as soon as the 
current run of tests I've got going are done and checked in. (Okay, 
after they're done since they're for other things, but...)

It really looks like you're trying to work around a deficiency in the
current scheme. Better to fix the deficiency. :)

>post scriptum - did my patch to lib/Parrot/Assembler get lost in the haze or
>was there something wrong with it?

If it's not in, it got lost in the haze. Which assembler was it against?

>post post scriptum - i noticed a mention of #parrot in some email,
>which network is that on?

irc.rhizomatic.net.
-- 
 Dan


Re: [PATCH] Revised TODO list, again

2002-04-23 Thread Dan Sugalski


At 1:10 PM -0700 4/19/02, Steve Fink wrote:
>This one got dropped too, and maybe this isn't the right place for
>this anymore.

Applied. Sorry for the wait.
-- 
 Dan


Re: [netlabs #522] BASIC hangs and crashes, Win32 MSVC++, 0.0.5

2002-04-23 Thread Simon Glover



On Tue, 23 Apr 2002, Dan Sugalski wrote:

> At 12:25 PM +0200 4/19/02, Peter Gibbs wrote:
> >Mike Lambert wrote:
>
> >This effect is exacerbated by the fact that "set S1, S2" does a
> >string_copy - I am still not sure what is supposed to happen here; I believe
> >that the pure set opcode should just be doing a register copy?? There is a
> >clone opcode which also does a string_copy, which seems reasonable.
>
> set S0, S1 is broken. I'm fixing that now.

 And here's a test for that. (By the way, is there any way to test it more
 directly?)

 Simon

--- t/op/string.t.old   Tue Apr 23 15:42:36 2002
+++ t/op/string.t   Tue Apr 23 15:49:40 2002
@@ -1,6 +1,6 @@
 #! perl -w

-use Parrot::Test tests => 76;
+use Parrot::Test tests => 77;

 output_is( <<'CODE', <

Mutable vs immutable strings

2002-04-23 Thread Dan Sugalski


Okay folks, time to hash this out once and for all.

Should strings in parrot be mutable or immutable? Right now we've a 
mix, and that's untenable.
-- 
 Dan


RE: Mutable vs immutable strings

2002-04-23 Thread Brent Dax


Dan Sugalski:
# Okay folks, time to hash this out once and for all.
# 
# Should strings in parrot be mutable or immutable? Right now we've a 
# mix, and that's untenable.

Three questions:

1. Which'll be faster?
2. Which'll be simpler?
3. Which is more important?

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

#define private public
--Spotted in a C++ program just before a #include

Re: [netlabs #522] BASIC hangs and crashes, Win32 MSVC++, 0.0.5[APPLIED]

2002-04-23 Thread Dan Sugalski


At 3:55 PM -0400 4/23/02, Simon Glover wrote:
>On Tue, 23 Apr 2002, Dan Sugalski wrote:
>
>>  At 12:25 PM +0200 4/19/02, Peter Gibbs wrote:
>>  >Mike Lambert wrote:
>>
>>  >This effect is exacerbated by the fact that "set S1, S2" does a
>>  >string_copy - I am still not sure what is supposed to happen 
>>here; I believe
>>  >that the pure set opcode should just be doing a register copy?? There is a
>>  >clone opcode which also does a string_copy, which seems reasonable.
>>
>>  set S0, S1 is broken. I'm fixing that now.
>
>  And here's a test for that. (By the way, is there any way to test it more
>  directly?)

Applied, thanks. (And no, not at the moment)
-- 
 Dan


[PATCH] Fix Read with new allocate_about

2002-04-23 Thread Mike Lambert


This should hopefully fix a problem Clint noticed with his LOAD bug,
assuming he is using this op. The code was assuming that a string_make's
passed len==buflen, which is no longer the case.
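
To illustrate the distinction, here is a toy model in Python. The names
allocate_about_sketch and ToyString, and the round-up-to-16 policy, are
assumptions made for the example and are not the real Parrot routines;
the only point is that the allocator may hand back more than was asked
for, so the used length has to come from the request.

```python
def allocate_about_sketch(requested: int, granule: int = 16) -> bytearray:
    """Return a zeroed buffer of at least `requested` bytes.

    The rounding policy is an assumption; the real allocator may differ.
    """
    size = ((requested + granule - 1) // granule) * granule
    return bytearray(max(size, granule))


class ToyString:
    def __init__(self, data: bytes):
        self.buf = allocate_about_sketch(len(data))
        self.buf[:len(data)] = data
        self.buflen = len(self.buf)   # capacity actually handed back
        self.bufused = len(data)      # bytes that hold real content

    def value(self) -> bytes:
        # Slicing by bufused, not buflen, keeps the padding out.
        return bytes(self.buf[:self.bufused])
```

Setting bufused from buflen would make a five-byte read look like a
sixteen-byte string padded with NULs.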

Mike Lambert

Index: core.ops
===
RCS file: /cvs/public/parrot/core.ops,v
retrieving revision 1.126
diff -r1.126 core.ops
370c370
<   s->bufused = s->buflen;
---
>   s->bufused = len;

Re: [PATCH] Fix Read with new allocate_about [APPLIED]

2002-04-23 Thread Dan Sugalski


At 4:54 PM -0400 4/23/02, Mike Lambert wrote:
>This should hopefully fix a problem Clint noticed with his LOAD bug,
>assuming he is using this op. The code was assuming that a string_make's
>passed len==buflen, which is no longer the case.

Applied, thanks. (BTW, could you use either -p or -u for patches? 
Makes patch happier)
-- 
 Dan


Re: [PATCH] Make obscure.ops work

2002-04-23 Thread Dan Sugalski


At 7:51 PM -0700 4/20/02, Chip Salzenberg wrote:
>I realize that obscure.ops isn't a big deal, but why not make
>it work?  Thus this patch.  This patch eliminates the versions
>of the ops that accept integers, under the assumption that trig
>on integers is extraordinarily silly.

Applied, thanks.
-- 
 Dan


Re: Regex and Matched Delimiters

2002-04-23 Thread Aaron Sherman

On Tue, 2002-04-23 at 12:48, Larry Wall wrote:
> Brent Dax writes:

> : # \talso 
> : # \nalso  or  (latter matching
> : logical newline)
> : # \ralso 
> : # \falso 
> : # \aalso 
> : # \ealso 
> : 
> : I can tell you right now that these are going to screw people up.
> : They'll try to use these in normal strings and be confused when it
> : doesn't work.  And you probably won't be able to emit a warning,
> : considering how much CGI Perl munches.
> 
> I can see pragmatic variants in which those *do* interpolate by default.
> And pragmatic variants where they don't.

If you put them in one, put them in the other, HOWEVER, there's a strong
pragmatic reason for neither that I can see.

HTML/XML/SGML

I hate to say it, but if <> interpolates in everything cleanly with no
overloading, the *ML camps will thank you deeply. How often I've
written:

qq{$content}

I cannot tell you, but it's large.

Why not use {} for this and add an {eval:code}?

> I'm just wondering how far I can drive the principle that {} is always
> a closure (even though it isn't).  I admit that it's probably overkill
> here, which is why there are question marks.

I like the idea, but I don't think it fits. On the other hand, if inside
all interpolating operators {} is the special thing that gets
interpolated (and NOTHING else), I could see liking the new look:

qq{a${x}b}  => qq{a{$x}b}
qr{a\Q${x}\Eb$} => qr{a{q:$x}b$}
qr{a${x}b$} => qr{a{$x}b$}
q{a}.eval($x).q{b}  => qq{a{e:$x}b} or qq{a{{$x}}b}
"ajs\@ajs.com"  => qq{[EMAIL PROTECTED]}
"ajs". @{ajs} .".com"   => qq{ajs{@ajs}.com}

I know it's a departure from your original idea, but it certainly
unifies the syntax nicely:

qq{Hello, World!{nl}}
qr{Hello, World!{nl}}

> With respect to Perl 5, I'm trying to unhijack curlies as much as possible.

Ooops :-)

Re: Mutable vs immutable strings

2002-04-23 Thread Andrew J Bromage

G'day all.

On Tue, Apr 23, 2002 at 01:18:23PM -0700, Brent Dax wrote:

> Three questions:
> 
> 1. Which'll be faster?

It depends on the application, but my money is on mutable strings
built on top of an immutable buffer.  That's based on looking at my
own string-based Perl code, a lot of which is substring extraction
(usually by regular expression).  It may pay off if a string and its
substrings can share implementation.
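
A rough sketch of what that could look like, in Python. SharedString
and its methods are hypothetical names for illustration, not a proposed
Parrot API: the handle is mutable, the bytes it points at are not, and
a substring is just another handle onto the same buffer.

```python
from typing import Optional


class SharedString:
    """A mutable string handle over an immutable bytes buffer."""

    def __init__(self, buf: bytes, start: int = 0,
                 length: Optional[int] = None):
        self._buf = buf
        self._start = start
        self._len = len(buf) - start if length is None else length

    def substr(self, offset: int, length: int) -> "SharedString":
        # No copy: the substring points into the same immutable buffer.
        if offset < 0 or length < 0 or offset + length > self._len:
            raise IndexError("substring out of range")
        return SharedString(self._buf, self._start + offset, length)

    def append(self, more: bytes) -> None:
        # Mutation builds a fresh buffer for this handle only, so other
        # handles that share the old buffer are unaffected.
        self._buf = self.value() + more
        self._start = 0
        self._len = len(self._buf)

    def shares_buffer_with(self, other: "SharedString") -> bool:
        return self._buf is other._buf

    def value(self) -> bytes:
        return self._buf[self._start:self._start + self._len]
```

Substring extraction is then constant-time bookkeeping, and the cost of
copying is paid only when a handle is actually modified.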

> 2. Which'll be simpler?

Immutable strings are definitely simpler, when you have garbage
collection.

> 3. Which is more important?

At the risk of stating the obvious, it is more important for the
interface to be complete.

Cheers,
Andrew Bromage

RE: Mutable vs immutable strings

2002-04-23 Thread Brent Dax


Andrew J Bromage:
# On Tue, Apr 23, 2002 at 01:18:23PM -0700, Brent Dax wrote:
# 
# > Three questions:
# > 
# > 1. Which'll be faster?
# 
# It depends on the application, but my money is on mutable 
# strings built on top of an immutable buffer.  That's based on 
# looking at my own string-based Perl code, a lot of which is 
# substring extraction (usually by regular expression).  It may 
# pay off if a string and its substrings can share implementation.

That's what I thought.

# > 2. Which'll be simpler?
# 
# Immutable strings are definitely simpler, when you have 
# garbage collection.
# 
# > 3. Which is more important?
# 
# At the risk of stating the obvious, it is more important for 
# the interface to be complete.

The interface can be complete either way.  It's how fast the code behind
the interface is that'll vary.

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

#define private public
--Spotted in a C++ program just before a #include

[PATCH] Remove prederef's reliance on shared libraries

2002-04-23 Thread Steve Fink


This is a rather clumsy patch to make prederef mode work without
needing to be compiled as a shared library. In fact, it prevents it
from being used as a shared library (but it's trivial to revert to the
former behavior; see the patch.)

Anyone who wishes is welcome to figure out exactly what's going on
with shared oplibs. I believe prederef mode contains a valiant start
at them, but at the moment the inability to compare all three
different modes of operation with a single binary is just an
annoyance. (And that 3 should really be 4; the computed goto should
just be another option IMHO.)

Index: config_h.in
===
RCS file: /home/perlcvs/parrot/config_h.in,v
retrieving revision 1.26
diff -u -r1.26 config_h.in
--- config_h.in 22 Mar 2002 18:06:46 -  1.26
+++ config_h.in 24 Apr 2002 03:32:37 -
@@ -55,6 +55,7 @@
 
 #define PARROT_CORE_OPLIB_NAME "core"
 #define PARROT_CORE_OPLIB_INIT Parrot_DynOp_core_${MAJOR}_${MINOR}_${PATCH}
+#define PARROT_CORE_PREDEREF_OPLIB_INIT Parrot_DynOp_core_prederef_${MAJOR}_${MINOR}_${PATCH}
 
 #define INTVAL_FMT "${intvalfmt}"
 #define FLOATVAL_FMT "${floatvalfmt}"
Index: interpreter.c
===
RCS file: /home/perlcvs/parrot/interpreter.c,v
retrieving revision 1.84
diff -u -r1.84 interpreter.c
--- interpreter.c   15 Apr 2002 18:05:18 -  1.84
+++ interpreter.c   24 Apr 2002 03:32:38 -
@@ -104,7 +104,7 @@
 static op_func_t *prederef_op_func = NULL;
 
 static void
-init_prederef(struct Parrot_Interp *interpreter)
+init_prederef(struct Parrot_Interp *interpreter, BOOLVAL dynamic)
 {
 char file_name[50];
 char func_name[50];
@@ -122,9 +122,9 @@
  * Get a handle to the library file:
  */
 
-prederef_oplib_handle = Parrot_dlopen(file_name);
+if (dynamic) prederef_oplib_handle = Parrot_dlopen(file_name);
 
-if (!prederef_oplib_handle) {
+if (dynamic && !prederef_oplib_handle) {
 internal_exception(PREDEREF_LOAD_ERROR,
"Unable to dynamically load oplib file '%s' for oplib '%s_prederef' version %s!\n",
file_name, PARROT_CORE_OPLIB_NAME, PARROT_VERSION);
@@ -134,9 +134,14 @@
  * Look up the init function:
  */
 
-prederef_oplib_init =
-(oplib_init_f)(ptrcast_t)Parrot_dlsym(prederef_oplib_handle,
-  func_name);
+if (dynamic) {
+prederef_oplib_init =
+(oplib_init_f)(ptrcast_t)Parrot_dlsym(prederef_oplib_handle,
+  func_name);
+} else {
+extern op_lib_t * PARROT_CORE_PREDEREF_OPLIB_INIT(void);
+prederef_oplib_init = PARROT_CORE_PREDEREF_OPLIB_INIT;
+}
 
 if (!prederef_oplib_init) {
 internal_exception(PREDEREF_LOAD_ERROR,
@@ -202,13 +207,13 @@
  */
 
 static void
-stop_prederef(void)
+stop_prederef(BOOLVAL dynamic)
 {
 prederef_op_func = NULL;
 prederef_op_info = NULL;
 prederef_op_count = 0;
 
-Parrot_dlclose(prederef_oplib_handle);
+if (dynamic) Parrot_dlclose(prederef_oplib_handle);
 
 prederef_oplib = NULL;
 prederef_oplib_init = (oplib_init_f)NULLfunc;
@@ -371,7 +376,7 @@
 
 code_start_prederef = pc_prederef;
 
-init_prederef(interpreter);
+init_prederef(interpreter, 0);
 
 while (pc_prederef) {
 pc_prederef =
@@ -379,7 +384,7 @@
interpreter);
 }
 
-stop_prederef();
+stop_prederef(0);
 
 if (pc_prederef == 0) {
 pc = 0;
Index: docs/running.pod
===
RCS file: /home/perlcvs/parrot/docs/running.pod,v
retrieving revision 1.7
diff -u -r1.7 running.pod
--- docs/running.pod 25 Mar 2002 18:41:51 -  1.7
+++ docs/running.pod 24 Apr 2002 03:32:38 -
@@ -27,8 +27,12 @@
 That's because we use fixed address for registers, this problem will
 be solved soon.
 
-Prederef mode only works as a shared library. For example, on most
-Unix platforms:
+Prederef mode should work for all programs.
+
+It previously only worked as a shared library. To revert to that
+state, find the calls to init_prederef and stop_prederef in
+interpreter.c, and pass a true value instead of zero as the sole
+argument. Then, on most Unix platforms:
 
   make clean
   make shared

Another [PATCH]: allow deactivating computed goto

2002-04-23 Thread Steve Fink


On Tue, Apr 23, 2002 at 08:54:56PM -0700, Steve Fink wrote:
> (And that 3 should really be 4; the computed goto should just be
> another option IMHO.)

Maybe not so humble: here's a patch to disable the default computed
goto core, so you can compare all four cores (assuming the previous
patch is applied.)

One weirdness I encountered:

 #define setopt(flag) Parrot_setflag(interpreter, flag, (*argv)[0]+2);

What the heck does this do? Parrot_setflag uses its 3rd argument only
as a boolean value. Where this is used, argv[0] always contains the
current command-line argument. So this is equivalent to 

  argv[0][0]+2

or in the example of "-p", that would be the character '-' + 2. Now,
to make that do something, you'd need the first character of the
option to be -2, and that's some weird hi-bit character. Huh?

Index: include/parrot/interpreter.h
===
RCS file: /home/perlcvs/parrot/include/parrot/interpreter.h,v
retrieving revision 1.40
diff -u -r1.40 interpreter.h
--- include/parrot/interpreter.h 3 Apr 2002 04:01:41 -   1.40
+++ include/parrot/interpreter.h 24 Apr 2002 03:58:02 -
@@ -23,7 +23,8 @@
 PARROT_BOUNDS_FLAG   = 0x04,  /* We're tracking byte code bounds */
 PARROT_PROFILE_FLAG  = 0x08,  /* We're gathering profile information */
 PARROT_PREDEREF_FLAG = 0x10,  /* We're using the prederef runops */
-PARROT_JIT_FLAG  = 0x20   /* We're using the jit runops */
+PARROT_JIT_FLAG  = 0x20,  /* We're using the jit runops */
+PARROT_CGOTO_FLAG= 0x40   /* We're using the computed goto runops */
 } Interp_flags;
 
 #define Interp_flags_SET(interp, flag)   (/*@i1@*/ (interp)->flags |= (flag))
Index: include/parrot/runops_cores.h
===
RCS file: /home/perlcvs/parrot/include/parrot/runops_cores.h,v
retrieving revision 1.4
diff -u -r1.4 runops_cores.h
--- include/parrot/runops_cores.h   4 Mar 2002 03:17:21 -   1.4
+++ include/parrot/runops_cores.h   24 Apr 2002 03:58:03 -
@@ -20,6 +20,8 @@
 
 opcode_t *runops_fast_core(struct Parrot_Interp *, opcode_t *);
 
+opcode_t *runops_cgoto_core(struct Parrot_Interp *, opcode_t *);
+
 opcode_t *runops_slow_core(struct Parrot_Interp *, opcode_t *);
 
 #endif
Index: interpreter.c
===
RCS file: /home/perlcvs/parrot/interpreter.c,v
retrieving revision 1.84
diff -u -r1.84 interpreter.c
--- interpreter.c   15 Apr 2002 18:05:18 -  1.84
+++ interpreter.c   24 Apr 2002 03:58:04 -
@@ -420,7 +425,12 @@
 which |= (Interp_flags_TEST(interpreter, PARROT_PROFILE_FLAG)) ? 0x02 : 0x00;
 which |= (Interp_flags_TEST(interpreter, PARROT_TRACE_FLAG))   ? 0x04 : 0x00;
 
-core = which ? runops_slow_core : runops_fast_core;
+if (which)
+core = runops_slow_core;
+else if (Interp_flags_TEST(interpreter, PARROT_CGOTO_FLAG))
+core = runops_cgoto_core;
+else
+core = runops_fast_core;
 
 if (Interp_flags_TEST(interpreter, PARROT_PROFILE_FLAG)) {
 unsigned int i;
Index: runops_cores.c
===
RCS file: /home/perlcvs/parrot/runops_cores.c,v
retrieving revision 1.17
diff -u -r1.17 runops_cores.c
--- runops_cores.c  17 Apr 2002 03:50:25 -  1.17
+++ runops_cores.c  24 Apr 2002 03:58:04 -
@@ -30,12 +30,29 @@
 opcode_t *
 runops_fast_core(struct Parrot_Interp *interpreter, opcode_t *pc)
 {
-#ifdef HAVE_COMPUTED_GOTO
-pc = cg_core(pc, interpreter);
-#else
 while (pc) {
 DO_OP(pc, interpreter);
 }
+return pc;
+}
+
+/*=for api interpreter runops_cgoto_core
+ * run parrot operations until the program is complete, using the computed
+ * goto core (if available).
+ *
+ * No bounds checking.
+ * No profiling.
+ * No tracing.
+ */
+
+opcode_t *
+runops_cgoto_core(struct Parrot_Interp *interpreter, opcode_t *pc)
+{
+#ifdef HAVE_COMPUTED_GOTO
+pc = cg_core(pc, interpreter);
+#else
+fprintf(stderr, "Computed goto unavailable in this configuration.\n");
+exit(1);
 #endif
 return pc;
 }
Index: test_main.c
===
RCS file: /home/perlcvs/parrot/test_main.c,v
retrieving revision 1.50
diff -u -r1.50 test_main.c
--- test_main.c 26 Mar 2002 16:33:01 -  1.50
+++ test_main.c 24 Apr 2002 04:01:25 -
@@ -14,6 +14,7 @@
 #include 
 
 #define setopt(flag) Parrot_setflag(interpreter, flag, (*argv)[0]+2);
+#define unsetopt(flag) Parrot_setflag(interpreter, flag, 0)
 
 char *parseflags(Parrot interpreter, int *argc, char **argv[]);
 
@@ -62,6 +63,10 @@
 (*argc)--;
 (*argv)++;
 
+#ifdef HAVE_COMPUTED_GOTO
+setopt(PARROT_CGOTO_FLAG);
+#endif
+
 while ((*argc) && (*argv)[0][0] == '-') {
 switch ((*argv)[0][1]) {
 case 'b':
@@ -76,6 +81

Using closures for regex control

2002-04-23 Thread Me


Larry said:
> I haven't decided yet whether matches embedded in
> [a regex embedded] closure should automatically pick
> up where the outer match is, or whether there should
> be some explicit match op to mean that, much like \G
> only better. I'm thinking when the current topic is a
> match state, we automatically continue where we left
> off, and require explicit =~ to start an unrelated match.

So, this might DWIM:

# match pat1 _ pat2 _ pat3 and capture pat2 match:
/ pat1
  { ($foo) = / pat2 / }
  pat3 /

What is the meaning of a string returned by some code
inside a regex? Would this DWIM:

# match pat1 _ 'foo bar' _ pat2:
/ pat1 # white space is ignored
  { return 'foo bar' } # conserve whitespace
  pat2 /

What if there were methods on the match state to
achieve regex extensions:

s/ { .<; /c/ } ei / ie /; # weird look behind?

and so on:

/ pat1 { .>; /pat2/ } pat3 /
/ { .! and .<; /pat1/ } pat2 } /

--
ralph

Re: Regex and Matched Delimiters

2002-04-23 Thread Me


> : I'd expect . to match newlines by default. For a . that
> : didn't match newlines, I'd expect to need to use [^\n].
> 
> But . has never matched newlines by default, not even in grep.

Perhaps. But:

First, I would have thought you *can't* make . match newlines
in grep, period. If so, then when perl is handling a multi-line
string, it is handling a case grep never encounters.

Second, I think the perl 5 default is the wrong one from the
point of view of a typical newbie's guess.

Third, I was thinking that having perl 6 regexen have /s on
by default would be easy for perl 5 coders to understand;
not too hard to get used to; and have no negative effects
for existing coders beyond getting used to the change.

--
ralph

Re: Regex and Matched Delimiters

2002-04-23 Thread Me


> > : I'd expect . to match newlines by default.

I forgot, fourth, this simplifies the rule for . -- it
would become period matches any char, period.

Fifth, it makes the writing of "match anything but
newline" into an explicit [^\n], which I consider a
good thing.

Of course, all this is minor stuff. But I can't get
my head around parse trees and grammars, so
I'll continue to fiddle around spraying a bit of
graffiti here and there on the bikeshed.

--
ralph

Re: Regex and Matched Delimiters

2002-04-23 Thread Michael G Schwern

On Tue, Apr 23, 2002 at 11:11:28PM -0500, Me wrote:
> Third, I was thinking that having perl 6 regexen have /s on
> by default would be easy for perl 5 coders to understand;
> not too hard to get used to; and have no negative effects
> for existing coders beyond getting used to the change.

I'm jumping in the middle of a conversation here, but consider the
problem of .* matching newlines by default and greediness.

   /(foo.*)$/,  /(foo.*)$/m  and  /(foo.*)$/s

when matching against something like "foo\nwiffle\nbarfoo\n" One matches the
last line.  One matches the first line.  And one matches all three lines.
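
For anyone who wants to see the three cases side by side, here is a
small check using Python's re module as a stand-in, on the assumption
that its `$`, re.MULTILINE and re.DOTALL behave like Perl 5's `$`, /m
and /s for this particular pattern.

```python
import re

text = "foo\nwiffle\nbarfoo\n"

plain = re.search(r"(foo.*)$", text)                 # like /(foo.*)$/
multi = re.search(r"(foo.*)$", text, re.MULTILINE)   # like /(foo.*)$/m
dots = re.search(r"(foo.*)$", text, re.DOTALL)       # like /(foo.*)$/s

for name, m in (("plain", plain), ("m", multi), ("s", dots)):
    print(name, m.start(), repr(m.group(1)))

# prints:
# plain 14 'foo'
# m 0 'foo'
# s 0 'foo\nwiffle\nbarfoo\n'
```

The start offsets show which line each match lands on: 14 is the "foo"
inside "barfoo" on the last line, and 0 is the first line.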

-- 

Michael G. Schwern   <[EMAIL PROTECTED]>   http://www.pobox.com/~schwern/
Perl Quality Assurance  <[EMAIL PROTECTED]> Kwalitee Is Job One
Consistency?  I'm sorry, Sir, but you obviously chose the wrong door.
-- Jarkko Hietaniemi in <[EMAIL PROTECTED]>

Re: Regex and Matched Delimiters

2002-04-23 Thread Me


> when matching against something like "foo\nwiffle\nbarfoo\n"


>/(foo.*)$/ # matches the last line

/(foo[^\n]*)$/ # assuming perl 6 meaning of $, end of string


>/(foo.*)$/m # matches the first line

/(foo[^\n]*)$$/ # assuming perl 6 meaning of $$, end of line

or

/(foo.*?)$$/


>/(foo.*)$/s # matches all three lines

/(foo.*)$/


--
ralph

[CONFIGURE] New make.pl coming soon...

2002-04-23 Thread Jeff


In between attempting to get the new assembler up and running (currently
dealing with XS issues), Robert Spiers and I have come up with a new
make mechanism.

The syntax may change, and the build mechanism has a ways to go (It's
simply running one step at a time in order, no parallelism or multiple
processes), but the basic idea so far seems sound.

Our replacement for the somewhat non-portable make mechanism relies
entirely on perl, and so far, no external modules are required. The
current version makes allowances for Win32 issues, but has not been
tested on Win32 yet.

Make.pl is a simple perl script that builds a dependency graph and
satisfies a single target recursively. The Makefile it seeks to emulate
follows:

--cut here--
foo.o: foo.c
cc -c foo.c
bar.o: bar.c
cc -c bar.c
foo: foo.o bar.o
cc -o foo foo.o bar.o
--cut here--

This, of course, builds the binary 'foo' from the source files 'foo.c'
and 'bar.c'.

So far, with the exception of the need to explicitly declare a target,
it's a one-to-one translation of the makefile.

The perl translation of the Makefile above follows (with comments):

--cut here--

# Create a compile target for 'foo.o'. "Compile targets" are actually objects
# that can be queried to determine if they've been completed or not, and asked
# how they want to be built.

# Although the compile target is named 'CC' this, of course, doesn't mean that
# 'cc' is used. The compiler name and arguments are determined based on the
# current platform.

my $foo_o = CC(

  # The 'Object()' syntax here simply takes into account the fact that
  # platforms such as Win32 require different extensions than UNIX.
  # The target is currently declared explicitly, but given the fact that we
  # already have the input file name, it would be easy to determine the
  # output file name should we want to declare 'output' explicitly.

  output => Object( input => 'foo' ),

  # The input is given an explicit file extension to accommodate for the fact
  # that the user may want to submit a .C file or .ch file to the compiler.
  # Should we not need that flexibility, the extension can be removed.

  input => 'foo.c', # foo.o: foo.c

  # Dependencies are either file names or objects, and as we'll see later on
  # can be arrays of these as well.

  dependsOn => 'foo.c', #   cc -c foo.c
);

my $bar_o = CC(
  output => Object( input => 'bar' ),
  input => 'bar.c', # bar.o: bar.c
  dependsOn => 'bar.c', #   cc -c bar.c
);

# The link statement is also platform-sensitive, and introduces another
# platform-dependent directive, 'Executable'. This, of course, accounts for
# the difference between Win32's '.exe' and UNIX's '' extension for file
# names. In due course, the file name here should be stated without extension
# of any kind.
my $foo_exe = Link(
  output => Executable( input => 'foo' ),

  # Any directive can take an anonymous array of inputs, and they'll be
  # handled in the order declared. The Object declaration returns the same
  # name every time, so we could conceivably cache these in scalars as well,
  # but to be pedantic I'm declaring them each time.
  input => [ Object(input=>'foo'),
             Object(input=>'bar') ], # foo: foo.o bar.o

  # And this can be dependent upon one or more files as well.
  # Take special note here that we're depending upon two *objects*, not files.
  # When time comes to determine if the link target needs to be done, the
  # script looks through these dependencies to see if 'foo.o' needs to be
  # rebuilt as well as comparing timestamps with 'foo.exe' to see if the link
  # needs to be rebuilt.
  dependsOn => [$foo_o, $bar_o], #   cc -o foo foo.o bar.o
);

# This is simply a directive that explicitly declares that 'foo' is a target.
# Here, 'foo' is not an executable but a target name, that will be looked up
# when 'make.pl foo' is run.
$depends->{foo} = Target(
  input => 'foo',
  dependsOn => $foo_exe,
);

--cut here--

This project is currently very much in an alpha state, but on my system
it does handle the very limited set of dependencies you see here very
well. Deleting files and rerunning make.pl does the right thing, as well
as touch'ing files. I haven't expanded the makefile beyond what you see
here yet, although once I'm certain of the foundation the number of
available directives will grow, as will such incidentals as
documentation and more error detection. 

Unlike the UNIX make tool, this application executes the required
actions in linear order in a single process, letting each action
complete before the next one is allowed to begin. In other words, no
parallelism and no determining of which actions can be allowed to
proceed in parallel without worries of race conditions.
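
The overall shape of that algorithm can be sketched as below. This is a
Python illustration, not the make.pl code: Rule and satisfy are
hypothetical names, commands are recorded instead of executed, and
timestamps live in a plain dictionary so the example runs without
touching the filesystem.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Rule:
    output: str              # file this rule produces
    depends_on: List[str]    # source files or outputs of other rules
    command: str             # what would be run to build the output


def satisfy(target: str,
            rules: Dict[str, Rule],
            mtimes: Dict[str, float],
            ran: List[str],
            visited: Optional[Set[str]] = None) -> None:
    """Bring `target` up to date, depth first, one action at a time."""
    if visited is None:
        visited = set()
    if target in visited:
        return
    visited.add(target)

    rule = rules.get(target)
    if rule is None:
        # A plain source file: no rule, so it must already exist.
        if target not in mtimes:
            raise FileNotFoundError(target)
        return

    # First make sure every dependency is itself up to date.
    for dep in rule.depends_on:
        satisfy(dep, rules, mtimes, ran, visited)

    # Rebuild when the output is missing or older than any dependency.
    out_time = mtimes.get(target)
    stale = out_time is None or any(
        mtimes[dep] > out_time for dep in rule.depends_on)
    if stale:
        ran.append(rule.command)   # stand-in for running the command
        # Pretend the output was written just after its newest input.
        mtimes[target] = max(mtimes[dep] for dep in rule.depends_on) + 1
```

Each action finishes before the next is considered, which matches the
serial behaviour described above; running independent rules in parallel
would need an explicit notion of which subtrees do not share outputs.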

I'll commit this later in the week once I make two major alterations to
the structure. The code as it stands builds the list of actions to be
taken in parallel with checking the graph to determine whether a
particular action needs to be taken or not. T
