Arrays of PMCs
Does anyone have an idea of when we're going to see these? Or hashes of PMCs, I don't really care which... -- Piers "It is a truth universally acknowledged that a language in possession of a rich syntax must be in need of a rewrite." -- Jane Austen?
Re: Regex and Matched Delimiters
Larry Wall <[EMAIL PROTECTED]> writes: > /^pat$/m /^^pat$$/ $$ is no longer the current PID? Or will we have to call that '${$}' in a regex? -- Piers "It is a truth universally acknowledged that a language in possession of a rich syntax must be in need of a rewrite." -- Jane Austen?
Re: Regex and Matched Delimiters
"Brent Dax" <[EMAIL PROTECTED]> writes: > Larry Wall: > That's...odd. Is $$ (the variable) going away? > > # /./s// or /<.>/ ??? > > I think that . is too common a metacharacter to be relegated to > this. I think you failed to notice that '/s' on the regex. In general . will still mean . but if you want it to match *anything* including a new line, you have to call it <.>. Personally, I don't have a problem with that. > # space(or \h for "horizontal"?) > > Same thinking as '.'. The golfers aren't going to like it for sure. But most of the time when I'm doing production code I have /x turned on anyway, and in that context, if I want to match a space and only a space, I have to do [ ] anyway. It might be nice if we could have m:X// mean 'space and hash match themselves'. > # \t also > # \n also or (latter matching > logical newline) > # \r also > # \f also > # \a also > # \e also > > I can tell you right now that these are going to screw people up. > They'll try to use these in normal strings and be confused when it > doesn't work. And you probably won't be able to emit a warning, > considering how much CGI Perl munches. But assigning meaning to < and > is going to do that anyway. -- Piers "It is a truth universally acknowledged that a language in possession of a rich syntax must be in need of a rewrite." -- Jane Austen?
RE: Regex and Matched Delimiters
Piers Cawley: # "Brent Dax" <[EMAIL PROTECTED]> writes: # > Larry Wall: # > That's...odd. Is $$ (the variable) going away? # > # > # /./s // or /<.>/ ??? # > # > I think that . is too common a metacharacter to be # relegated to this. # # I think you failed to notice that '/s' on the regex. In # general . will still mean . but if you want it to match # *anything* including a new line, you have to call it <.>. # Personally, I don't have a problem with that. Ah, you're right. My bad. # > # space (or \h for "horizontal"?) # > # > Same thinking as '.'. # # The golfers aren't going to like it for sure. But most of the # time when I'm doing production code I have /x turned on # anyway, and in that context, if I want to match a space and # only a space, I have to do [ ] anyway. # # It might be nice if we could have m:X// mean 'space and hash # match themselves'. I was thinking that would replace \s. If that isn't the case, I have no real complaint (if you can turn off /x). # > # \talso # > # \nalso or (latter matching # > logical newline) # > # \ralso # > # \falso # > # \aalso # > # \ealso # > # > I can tell you right now that these are going to screw people up. # > They'll try to use these in normal strings and be confused when it # > doesn't work. And you probably won't be able to emit a warning, # > considering how much CGI Perl munches. # # But assigning meaning to < and > is going to do that anyway. Not if the things are meaningless outside of regexes. For example, lookahead sequences make absolutely no sense in a quoted string. --Brent Dax <[EMAIL PROTECTED]> @roles=map {"Parrot $_"} qw(embedding regexen Configure) #define private public --Spotted in a C++ program just before a #include
Re: Regex and Matched Delimiters
Larry Wall <[EMAIL PROTECTED]> writes: [...] > /pat/x/pat/ How do I do a "no /x"? I know that commented /x'ed regexps are easier reading (I even write them myself, I swear I do!), but having to escape whitespace is often very annoying. Will I really have to escape all spaces (or use , below)? This also marks a significant departure from UN*X-style regexps. One reason learning Perl's regexp language was so convenient (to me) was that that most of what I knew of UN*X regexps was applicable. Changing the behaviour of a rather useful character (like ASCII 32) is going to produce many references to the FAQ "Why doesn't /a word/ match 'a word'?". (Having to escape #s is not as bad, as they are less common). [...] -- Ariel Scolnicov|http://3w.compugen.co.il/~ariels Compugen Ltd. |[EMAIL PROTECTED] 72 Pinhas Rosen St.|Tel: +972-3-7658117 "fast, good, and cheap; Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555 pick any two!"
Re: Regex and Matched Delimiters
> /pat/i m:i/pat/ or // or even m ??? Why lose the modifier-following-final-delimiter syntax? Is this to avoid a parsing issue, or because it's linguistically odd to have a modifier at the end? > /^pat$/m /^^pat$$/ What's the mnemonic here? It feels the wrong way round -- like a single ^ or $ should match at newlines, double ^ or $ should only match at start/end string. Ah. The newline matches between the ^^ or $$. That works. Then there's the PID issue. Hmm. How to save $$ (it is nice for one liners)? Sorry if this is a dumb suggestion, but could you have just one assertion, say ^$, that alternates matching just before and just after a newline? > /./s // or /<.>/ ??? I'd expect . to match newlines by default. For a . that didn't match newlines, I'd expect to need to use [^\n]. > space (or \h for "horizontal"?) Can one quote a substring of a regex? In a later part you say that \Q...\E is going away, so it seems not. It would be nice to say something like: /foo bar baz 'qux waldo' emerson/ and have the space between qux and waldo be literal. Similar arguments apply more broadly so that one could escape the usual meaning of metacharacters etc. > \Lstring\E \L > \Ustring\E \U Maybe, if I wasn't too far off with the quote mark suggestion above, then \L'string' would be more natural. > (?#...) {"..."} :-) Will plain # comments work in p6 regexen? > (?:...) <:...> > (?=...) > (?!...) > (?<=...) > (? > (?>...) Hmm. So <> are clustering just like (). One difference is that () always capture whereas <> only do so sometimes. Oh, and {} can too. () are no longer used for clever stuff, <> are instead. And {}. Hmm. Time for bed. -- ralph
Re: Please rename 'but' to 'has'.
On Mon, 2002-04-22 at 19:22, Larry Wall wrote:
> Perl 6 will try to avoid synonyms but make it easy to declare them. At
> worst it would be something like:
>
>     my sub operator:now ($a,$b) is inline { $a but $b }

I see your point, and it makes sense, but how will precedence work? What would this do:

    $i now foo but bar and 2;

or this:

    $i but foo now bar and 2;

What if I want to define a synonym for and?

    sub operator:also ($a,$b) is inline { $a and $b }
    print $_ also die;

Scratching my head here in userville
Re: Regex and Matched Delimiters
On Mon, 2002-04-22 at 21:53, Larry Wall wrote: > * Parens always capture. > * Braces are always closures. > * Square brackets are always character classes. > * Angle brackets are always metasyntax (along with backslash). > > So a first whack at the differences might be: [...] > space (or \h for "horizontal"?) > {n,m} > > \talso I want to know how he does this!! We sit around scratching out heads looking for a syntax that fits and refines and he jumps in with something that redefines and simplifies. Larry is wasted on Perl. He needs to run for office ;-) > \Lstring\E\L > \Ustring\E\U This one boggles me. Wouldn't that be something like: or string # ;-) Seriously, it seems that "\L" would be confusing. > \Q$var\E $varalways assumed literal, so $1 is literal backref > $var <$var> assumed to be regex Very nice. I can get behind this, and a lot of people will thank you who have to maintain code. > =~ $re=~ /<$re>/ ouch? If $re is a regexp, wouldn't "$str =~ $re" turn into "$re.match($str)"? Perhaps "$re.m $str" which is no more typing and pretty clear to me. > Obviously the and syntaxes will be user extensible. > We have to be able to support full grammars. I consider it a feature > that looks like a non-terminal in standard BNF notation. I do > not consider it a misfeature that resembles an HTML or XML tag, > since most of those languages need to be matched with a fancy rule > named anyway. It's too bad that would be messy with standard Perl //-enclosed regexes, as it would be a nice way to pass parameters to user-defined tags. It would also allow XML-like propagation of results: xyz
RE: Regex and Matched Delimiters
> # =~ $re =~ /<$re>/ ouch? > > I don't see the win. Naturally =~ $re is a bit cleaner, but we can't do that because =~ is smart match, not regex match. > # (?=...) > # (?!...) > # (?<=...) > # (? > > Cute. (Wait a minute, aren't those reversed?) Hehe. I thought that was cool. /foobar/ / foobar/ You see, foobar before snafoo, which is what it is. After snafoo, foobar. It reads very nicely. Luke
Re: Regex and Matched Delimiters
* Larry Wall ([EMAIL PROTECTED]) [23 Apr 2002 11:56]: [...] > * Parens always capture. Maybe I missed something in the rest of the details, but is anything going to replace non-capturing parens? It's just that I do find them quite useful. -- iain.
Re: Regex and Matched Delimiters
On Wed, 24 Apr 2002, Iain Truskett wrote: > * Larry Wall ([EMAIL PROTECTED]) [23 Apr 2002 11:56]: > > [...] > > * Parens always capture. > > Maybe I missed something in the rest of the details, but is anything > going to replace non-capturing parens? It's just that I do find them > quite useful. Yes. /indeed <:this>+ wont capture/
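For comparison with what the proposed <:...> would replace, here is a small sketch of capturing versus non-capturing groups as they work today. It uses Python's `re` module as a stand-in, on the assumption that its `(?:...)` behaves like Perl 5's; the variable names are made up for illustration and nothing here models the Perl 6 syntax itself.

```python
import re

# Plain parentheses capture: both groups show up in the result.
capturing = re.match(r"(ab)+(c)", "ababc")

# (?:...) clusters for the "+" quantifier but does not capture,
# so only the second group is numbered.
non_capturing = re.match(r"(?:ab)+(c)", "ababc")

capturing_groups = capturing.groups()          # two captured groups
non_capturing_groups = non_capturing.groups()  # one captured group
```

Both patterns match the same text; the only difference is how many numbered captures come back.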
Re: Regex and Matched Delimiters
On Tue, 2002-04-23 at 04:32, Ariel Scolnicov wrote: > Larry Wall <[EMAIL PROTECTED]> writes: > > [...] > > > /pat/x /pat/ > > How do I do a "no /x"? I know that commented /x'ed regexps are easier > reading (I even write them myself, I swear I do!), but having to > escape whitespace is often very annoying. Will I really have to > escape all spaces (or use , below)? > I'm not sure that that's a bad thing. Regular expressions are the hairiest, ugliest thing in Perl. If they change in this way, I see them getting a tad more verbose, and a whole lot more readable and maintainable. Besides you can always do this: $str = "COPYING file for more information"; /$str/ since scalars will be interpolated as quoted by default.
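The difference between interpolating a variable as a literal string and interpolating it as a pattern can be sketched as follows. This is Python, not Perl: `re.escape` is used as a rough equivalent of Perl 5's \Q$var\E, and the sample strings are invented for the example.

```python
import re

needle = "a.b (c)"

# Treated as a literal string (roughly \Q$var\E in Perl 5):
# the dot, space and parentheses all match themselves.
literal = re.compile(re.escape(needle))

# Treated as a pattern (a bare $var in a Perl 5 regex):
# "." is a wildcard and "(c)" is a capturing group.
as_pattern = re.compile(needle)

literal_hits_exact = literal.search("xx a.b (c) yy") is not None
literal_hits_lookalike = literal.search("aXb c") is not None
pattern_hits_exact = as_pattern.search("a.b (c)") is not None
pattern_hits_lookalike = as_pattern.search("aXb c") is not None
```

Under the proposal, the first behaviour would be the default for a bare $var, and the second would need the explicit <$var> form.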
Re: Please rename 'but' to 'has'.
Aaron Sherman writes:
: On Mon, 2002-04-22 at 19:22, Larry Wall wrote:
:
: > Perl 6 will try to avoid synonyms but make it easy to declare them. At
: > worst it would be something like:
: >
: >     my sub operator:now ($a,$b) is inline { $a but $b }
:
: I see your point, and it makes sense, but how will precedence work? What
: would this do:
:
:     $i now foo but bar and 2;
:
: or this:
:
:     $i but foo now bar and 2;
:
: What if I want to define a synonym for and?
:
:     sub operator:also ($a,$b) is inline { $a and $b }
:     print $_ also die;
:
: Scratching my head here in userville

Precedence is set with the "like" property:

    my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
    sub operator:also ($a,$b) is like("and") is inline { $a and $b }

Larry
Re: Please rename 'but' to 'has'.
At 08:58 AM 04-23-2002 -0700, Larry Wall wrote:
>Precedence is set with the "like" property:
>
>    my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
>    sub operator:also ($a,$b) is like("and") is inline { $a and $b }

OK, but that limits you to the, um, 24 standard levels of precedence. What do you do if you don't think that that's enough? Let's say you want to define a "nand" operator:

    my sub operator:nand ($a, $b) is inline { not ($a and $b) }

but you want nand to have a precedence lower than the existing 'and' but higher than the existing 'or' (for some reason I can't imagine offhand). It isn't like() anything, since there isn't anything currently between 'and' and 'or'. Would that be something like:

    my sub operator:nand ($a, $b) is below("and") is inline { not ($a and $b) }
Re: Regex and Matched Delimiters
Brent Dax writes:
: # ?pat? // or even m ???
:
: Whoa, those are moving to the front?!?

The problem with options in general is that they can't easily modify parsing if they come in back. Now in the particular case of /f and /i, it probably doesn't matter. But I was trying to see if there was some way to do away with trailing options altogether. This might even extend to things like:

    qq:s"$interpolates @doesn't %doesn't"

And that's definitely a situation where it changes the parse. Hmm, if strings have options, they're probably additive, so to add scalar interpolation you'd want to base it on "q", not "qq":

    q:s"$interpolates @doesn't %doesn't"

On the other hand, that doesn't work for the other things like "qr", so maybe any of :s, :a, :h turn off default interpolations, so qr:a would only interpolate arrays, for instance.

: # /pat/x /pat/
: # /^pat$/m /^^pat$$/
:
: That's...odd. Is $$ (the variable) going away?

Maybe. It'd be $*PID if so, since it's truly global to the process. But if not, we could special case $$ inside regexes, just as we already special case $ itself.

: # \p{prop} <+prop> ???
: # \P{prop} <-prop> ???
:
: Intriguing.

Yeah, especially when you start stacking them. But maybe we're treading on [...] territory. It could be argued that <...> is just a generalized form of POSIX's [:...:] construct.

: # \t also
: # \n also or (latter matching
: logical newline)
: # \r also
: # \f also
: # \a also
: # \e also
:
: I can tell you right now that these are going to screw people up.
: They'll try to use these in normal strings and be confused when it
: doesn't work. And you probably won't be able to emit a warning,
: considering how much CGI Perl munches.

I can see pragmatic variants in which those *do* interpolate by default. And pragmatic variants where they don't.

: # \033 same
: # \x1B same
: # \x{263a} \x<263a> ???
:
: Why? Wouldn't we want the same thing to work in quoted strings? (Or
: are those changing syntaxes too?)
I'm just wondering how far I can drive the principle that {} is always a closure (even though it isn't). I admit that it's probably overkill here, which is why there are question marks.

: # \c[ same
: # \N{name}
: # \l same
: # \u same
: # \Lstring\E \L
: # \Ustring\E \U
:
: So that's changed from whenever you talked about \q{} ?

Possibly. Again, the question is whether {} more strongly imply something that's not true. But curlies were so overloaded in Perl 5 that I don't think people are going to necessarily expect them to do only one thing. Still, if <> are taking over the role of "unmarked metasyntactic delimiters", maybe they belong here too.

: # \E gone
: # [\040\t] \h plus any Unicode horizontal whitespace
: # [\r\n\ck] \v plus any Unicode vertical whitespace
: #
: # \b same
: # \B same
:
: # \A ^
: # \Z same?
: # \z $
:
: Are you sure that optimizes for the common case?

No, I'm not sure, but we have to clean up the \A...\z mess somehow.

: # \G , but assumed in nested patterns?
: #
: # \1 $1
: #
: # \Q$var\E $var always assumed literal, so $1 is literal
: backref
:
: So these are reinterpolated every time you backtrack? Are you *trying*
: to destroy regex performance? :^)

They're not interpolated. They're matched, as in string comparison, just as backrefs are matched right now.

: # $var <$var> assumed to be regex
:
: What if $var is a qr//ed object?

Then it's a pretty easy assumption that it's a regex. :-)

: # =~ $re =~ /<$re>/ ouch?
:
: I don't see the win.

No difference if $re is qr//, but if it's not, that is the syntax for forcing $re to be interpreted as a regex.

: # (??{$rule})
: # (?{ code }) { code } with failure semantics
: # (?#...) {"..."} :-)
: # (?:...) <:...>
: # (?=...)
: # (?!...)
: # (?<=...)
: # (?
:
: Cute. (Wait a minute, aren't those reversed?)
Nope, I realized they were ambiguous depending on whether you think of them as declarative or operational, but I settled on the declarative reading because it works with their being assertions. All the other options I could think of are either really clunky or similarly ambigu
Re: Regex and Matched Delimiters
Aaron Sherman writes:
: On Mon, 2002-04-22 at 21:53, Larry Wall wrote:
:
: > * Parens always capture.
: > * Braces are always closures.
: > * Square brackets are always character classes.
: > * Angle brackets are always metasyntax (along with backslash).
: >
: > So a first whack at the differences might be:
: [...]
: > space (or \h for "horizontal"?)
: > {n,m}
: >
: > \t also
:
: I want to know how he does this!!

Could have something to do with the fact that I've been banging my head against this for a couple of months already...

: We sit around scratching our heads
: looking for a syntax that fits and refines and he jumps in with
: something that redefines and simplifies. Larry is wasted on Perl. He
: needs to run for office ;-)

Agh, no! I'm okay at simplifying, but I'm terrible at oversimplifying.

: > \Lstring\E \L
: > \Ustring\E \U
:
: This one boggles me. Wouldn't that be something like:
:
: or string # ;-)

Well, makes sense only if <> works in ordinary double quotes.

: Seriously, it seems that "\L" would be confusing.

Potentially, except that you almost never use it on anything but variable interpolations. So \L<$foo> would be a better example. The confusing thing is that $foo would not be assumed to be a regular expression, whereas it would in bare <$foo> (at least in a regex).

: > \Q$var\E $var always assumed literal, so $1 is literal
: backref
: > $var <$var> assumed to be regex
:
: Very nice. I can get behind this, and a lot of people will thank you who
: have to maintain code.

Well, almost anything is an improvement over the current syntax.

: > =~ $re =~ /<$re>/ ouch?
:
: If $re is a regexp, wouldn't "$str =~ $re" turn into "$re.match($str)"?
: Perhaps "$re.m $str" which is no more typing and pretty clear to me.

Sure, but I was illustrating the situation of a non-qr string being forced to be a regex.

: > Obviously the and syntaxes will be user extensible.
: > We have to be able to support full grammars.
: > I consider it a feature
: > that looks like a non-terminal in standard BNF notation. I do
: > not consider it a misfeature that resembles an HTML or XML tag,
: > since most of those languages need to be matched with a fancy rule
: > named anyway.
:
: It's too bad that would be messy with standard Perl //-enclosed
: regexes, as it would be a nice way to pass parameters to user-defined
: tags. It would also allow XML-like propagation of results:
:
: xyz

Gee, maybe we could make a way for people to use alternate delimiters like they've always done with s///. :-)

Larry
RE: Regex and Matched Delimiters
Sorry to reply to the same message twice, but I just noticed something. Larry Wall: # {n,m} Isn't that the only use of angle brackets as a quantifier? That's going to make parsing more difficult... --Brent Dax <[EMAIL PROTECTED]> @roles=map {"Parrot $_"} qw(embedding regexen Configure) #define private public --Spotted in a C++ program just before a #include
Re: Please rename 'but' to 'has'.
In reply to Buddha Buck <[EMAIL PROTECTED]>:
> At 08:58 AM 04-23-2002 -0700, Larry Wall wrote:
> >Precedence is set with the "like' property:
> >
> >    my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
> >    sub operator:also ($a,$b) is like("and") is inline { $a and $b }
>
> OK, but that limits you to the, um, 24 standard levels of precedence.
> What do you do if you don't think that that's enough. Let's say you
> want to define a "nand" operator:
>
>     my sub operator:nand ($a, $b) is inline { not ($a and $b) }
>
> but you want nand to have a precedence lower than the existing 'and' but
> higher than the existing 'or' (for some reason I can't imagine
> offhand). It isn't like() anything, since there isn't anything
> currently between 'and' and 'or'. Would that be something like:
>
>     my sub operator:nand ($a, $b) is below("and") is inline {not ($a and $b) }

24 levels of precedence should be enough, else you can always resort to parens.

Guillaume
Re: Regex and Matched Delimiters
Me writes:
: > /pat/i m:i/pat/ or // or even m ???
:
: Why lose the modifier-following-final-delimiter
: syntax? Is this to avoid a parsing issue, or
: because it's linguistically odd to have a modifier
: at the end?

Haven't decided for sure to lose it, but it does have several problems. First is the parsing issue, but there's also what in natural language is called the "end weight" problem. We often rearrange our sentences in English so that the short things come first and the long things come last. That's why you choose indirect object syntax sometimes and not others. Try turning either of these to the other form:

    I gave him a big, smelly tuna-fish and cucumber sandwich.
    I gave the sandwich to a big, smelly tuna fisherman and his dog "Cucumber".

Now, options are always little, so it seems that they should come early.

: > /^pat$/m /^^pat$$/
:
: What's the mnemonic here? It feels the wrong
: way round -- like a single ^ or $ should match
: at newlines, double ^ or $ should only match
: at start/end string.

Well, I thought of it as ^^ or $$ matching potentially multiple places in the string.

: Ah. The newline matches between the ^^ or $$.
: That works.

Except that the newline doesn't match between the characters. You could say /$$\n^^/ for instance.

: Then there's the PID issue. Hmm. How to save $$
: (it is nice for one liners)?

$PID is only two chars worse. (The * of $*PID is optional.)

: Sorry if this is a dumb suggestion, but could you have
: just one assertion, say ^$, that alternates matching
: just before and just after a newline?

^$ matches a null string. That aside, I don't think stateful assertions would be unconfusing in the extreme.

: > /./s // or /<.>/ ???
:
: I'd expect . to match newlines by default. For a . that
: didn't match newlines, I'd expect to need to use [^\n].

But . has never matched newlines by default, not even in grep. Possibly some editors do it that way, but if so, it's non-standard.

: > space (or \h for "horizontal"?)
:
: Can one quote a substring of a regex? In a later part you
: say that \Q...\E is going away, so it seems not. It would be
: nice to say something like:
:
: /foo bar baz 'qux waldo' emerson/
:
: and have the space between qux and waldo be literal.
: Similar arguments apply more broadly so that one
: could escape the usual meaning of metacharacters etc.

Well, <"qux waldo"> could be made to mean that, I suppose. For that matter, so might \q{qux waldo}. Er, \q?

: > \Lstring\E \L
: > \Ustring\E \U
:
: Maybe, if I wasn't too far off with the quote mark
: suggestion above, then \L'string' would be more
: natural.

Maybe \L and \q are in the same class, in which case that would work.

: > (?#...) {"..."} :-)
:
: Will plain # comments work in p6 regexen?

Yes, just as in /x. And there's no ambiguity in the end delimiter any more because we parse in one pass.

: > (?:...) <:...>
: > (?=...)
: > (?!...)
: > (?<=...)
: > (?
: > (?>...)
:
: Hmm. So <> are clustering just like ().

Yes, and you can quantify them where it makes sense.

: One difference is that () always capture whereas <>
: only do so sometimes. Oh, and {} can too.

Eh? <> never capture. None of those constructs above capture. Nothing inside a {} can capture anything that influences the paren count outside the {}, because any inner regex has its own paren count.

: () are no longer used for clever stuff, <> are instead.
: And {}.

Basically, yes.

: Hmm. Time for bed.

Why? I just got up. :-)

Larry
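To check the existing behaviour behind this exchange (that . skips newlines unless /s is given, and that ^ and $ only match at line boundaries under /m), here is a short sketch. It is written in Python rather than Perl, on the assumption that `re.DOTALL` and `re.MULTILINE` mirror Perl 5's /s and /m; it says nothing about how the proposed ^^ and $$ would be implemented.

```python
import re

text = "first line\nsecond line\n"

# Default: "." stops at the first newline.
dot_default = re.match(r".*", text).group(0)
# DOTALL (Perl 5 /s): "." matches newlines too.
dot_all = re.match(r".*", text, re.DOTALL).group(0)

# Default: "^" matches only at the very start of the string.
starts_default = re.findall(r"^\w+", text)
# MULTILINE (Perl 5 /m): "^" also matches just after each newline,
# and "$" just before each newline.
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)
ends_multiline = re.findall(r"\w+$", text, re.MULTILINE)
```

The multi-line variants are the ones that "match potentially multiple places in the string", which is the reading behind the doubled ^^ and $$ spelling.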
Re: Please rename 'but' to 'has'.
At 01:12 PM 04-23-2002 -0400, [EMAIL PROTECTED] wrote: >24 levels of precedence should be enough, else you can always resort to >parens. I would have agreed, except that I would have also said that the 14 precedence levels of C should be enough as well -- yet we seem to have discovered uses for 10 more. >Guillaume
Re: Please rename 'but' to 'has'.
Buddha Buck writes:
: At 08:58 AM 04-23-2002 -0700, Larry Wall wrote:
: >Precedence is set with the "like" property:
: >
: >    my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
: >    sub operator:also ($a,$b) is like("and") is inline { $a and $b }
:
: OK, but that limits you to the, um, 24 standard levels of precedence. What
: do you do if you don't think that that's enough. Let's say you want to
: define a "nand" operator:
:
:     my sub operator:nand ($a, $b) is inline { not ($a and $b) }
:
: but you want nand to have a precedence lower than the existing 'and' but
: higher than the existing 'or' (for some reason I can't imagine
: offhand). It isn't like() anything, since there isn't anything currently
: between 'and' and 'or'. Would that be something like:
:
:     my sub operator:nand ($a, $b) is below("and") is inline {not ($a and $b) }

Yes, that's what I was thinking. And the dimensions shrink every time you do that, so if something is "above" your C, it doesn't go back to being the same as C.

Though since people can't seem to keep up and down straight on their precedence charts, I'd go for "tighter" and "looser" or some such. I think I'm even on the record somewhere about that.

Larry
Re: Please rename 'but' to 'has'.
At 12:36 PM -0400 4/23/02, Buddha Buck wrote:
>At 08:58 AM 04-23-2002 -0700, Larry Wall wrote:
>>Precedence is set with the "like' property:
>>
>>    my sub operator:now ($a,$b) is like("but") is inline { $a but $b }
>>    sub operator:also ($a,$b) is like("and") is inline { $a and $b }
>
>OK, but that limits you to the, um, 24 standard levels of
>precedence. What do you do if you don't think that that's enough

Internally precedence is going to be stored as a floating-point number. Dunno how it'll be exposed at the language level, but at least there'll be more than just 20 or so levels.

--
Dan

--"it's like this"---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
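One way to picture how floating-point precedence levels leave room between existing operators is the sketch below. Everything in it is an assumption made for illustration: the table, the starting numbers, and the helper names `is_like` and `is_looser` are hypothetical and are not taken from any Perl 6 or Parrot source.

```python
# Hypothetical precedence table: a larger number binds tighter.
precedence = {"or": 1.0, "and": 2.0, "not": 3.0}

def is_like(table, new_op, existing):
    """Give new_op exactly the same level as an existing operator."""
    table[new_op] = table[existing]

def is_looser(table, new_op, existing):
    """Give new_op a level strictly between `existing` and the next looser level."""
    level = table[existing]
    lower = [v for v in table.values() if v < level]
    floor = max(lower) if lower else level - 1.0
    table[new_op] = (floor + level) / 2.0

is_like(precedence, "also", "and")     # synonym: same level as "and"
is_looser(precedence, "nand", "and")   # lands between "or" and "and"

ordered = sorted(precedence, key=lambda op: (precedence[op], op))
```

Because the midpoint between two floats can be taken again and again (within floating-point limits), a scheme like this is not capped at a fixed number of levels.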
Re: Regex and Matched Delimiters
Brent Dax writes:
: Sorry to reply to the same message twice, but I just noticed something.
:
: Larry Wall:
: # {n,m}
:
: Isn't that the only use of angle brackets as a quantifier? That's going
: to make parsing more difficult...

How so? It's just a one-character lookahead to see if it's a digit. But we could actually use a more general syntax:

Larry
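The one-character lookahead can be sketched like this. The concrete notation is an assumption: the archive has lost the right-hand column of the table, so the example takes the quantifier to be written `<2,5>` and other metasyntax to be written `<name>`; the function name `classify_angle` is likewise invented.

```python
def classify_angle(pattern: str, pos: int) -> str:
    """Decide what a '<' at pattern[pos] introduces by peeking one character ahead."""
    if pattern[pos] != "<":
        raise ValueError("expected '<' at position %d" % pos)
    # Slicing yields an empty string at end of input instead of raising.
    next_char = pattern[pos + 1:pos + 2]
    return "quantifier" if next_char.isdigit() else "metasyntax"

# Assumed spellings: "<2,5>" for a quantifier, "<alpha>" for a named rule.
kinds = [classify_angle(p, p.index("<")) for p in ("a<2,5>", "<alpha>+", "x<")]
```

The parser never needs to scan to the closing bracket to make the decision, which is the point being made in the reply.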
[PATCH] Re: Arrays of PMCs
On Mon, Apr 22, 2002 at 05:40:09PM +0100, Piers Cawley wrote:
> Does anyone have an idea of when we're going to see these? Or hashes
> of PMCs, I don't really care which...

Well, we don't have hashes of anything.

We already have arrays of PMCs. You just can't get the PMCs out, only their integer or numeric values. :-)

True arrays of PMCs should probably be blocked on the whole keyed thing. Keyed access is partially implemented, but there's way too much manual code repetition at the moment, and the assembly syntax is wrong:

    EVENTUAL            CURRENT
    set I0, P0[7]       get_keyed I0, P0, 7
    set P0[7], I0       set_keyed P0, 7, I0
    set P0[0], P1[1]    not possible
    set I0, P0[P1]      not possible -- I'm not even sure what this will do
    set P1, P0[7]       get_keyed P1, P0, 7 (requires the recently committed patch)
    set P0[7], P1       set_keyed P0, 7, P1 (requires the recently committed patch)

So far, I've just kind of thrown in more and more [sg]et_keyed variants as they were needed. To continue in this grand tradition, I've just committed a patch to allow getting and setting of the PMCs in arrays.

However, I'm not really sure how 'set P0[7], P1' is supposed to behave. I just overwrite the whole P0[7] array slot, discarding the previously held PMC. I don't remember the whole 'set P0, P1' discussion well enough to venture an opinion on whether the previous occupant gets to have a say in what happens.

This code now works:

    # P0 is initialized to an array containing the command-line arguments
    new P1, PerlArray
    set_keyed P1, 0, P0     # set P1[0], P0
    get_keyed P31, P1, 0    # set P31, P1[0]
    get_keyed S0, P31, 0    # set S0, P31[0]
    print "Command name: "
    print S0
    print "\n"
    end
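The overwrite behaviour described for 'set P0[7], P1' can be modelled in a few lines. This is only a toy in Python, not Parrot code: the class `PerlArraySketch`, its methods, and the sample arguments are hypothetical stand-ins for the array PMC and the set_keyed/get_keyed ops.

```python
class PerlArraySketch:
    """Toy stand-in for an array PMC whose slots can hold any value."""

    def __init__(self):
        self.slots = []

    def set_keyed(self, index, value):
        # Grow the array as needed, then overwrite the slot outright;
        # whatever was stored there before is simply discarded.
        while len(self.slots) <= index:
            self.slots.append(None)
        self.slots[index] = value

    def get_keyed(self, index):
        return self.slots[index]

# Hypothetical command-line arguments standing in for P0.
args = PerlArraySketch()
args.set_keyed(0, "parrot")
args.set_keyed(1, "foo.pbc")

outer = PerlArraySketch()          # plays the role of P1
outer.set_keyed(0, args)           # set P1[0], P0
inner = outer.get_keyed(0)         # set P31, P1[0]
command_name = inner.get_keyed(0)  # set S0, P31[0]

outer.set_keyed(0, "replaced")     # the previous occupant gets no say
```

In this model the stored array is handed back by reference, so `inner` and `args` are the same object; whether the real ops should copy or share is exactly the open question in the message.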
Re: [PATCH] Re: Arrays of PMCs
Oops, forgot to change the subject line. No patch. Patch already committed.
Re: [netlabs #522] BASIC hangs and crashes, Win32 MSVC++, 0.0.5
At 12:25 PM +0200 4/19/02, Peter Gibbs wrote:
>Mike Lambert wrote:
>> Undoing the patch in resources.c seems to fix the problem.
>>
>> Changing:
>>   ((Buffer *)buffer)->buflen = req_size;
>> to:
>>   ((Buffer *)buffer)->buflen = size;
>> makes it work again.
>
>Just for interest, the problem here is that the rounding is always up to the
>next multiple of 16. So, for example, a zero-length string would have buflen
>set to 16 (actually it is set back to zero in string_make, but that just
>slows the process down slightly); string_copy would ask for a buffer of 16
>and get back a buffer of 32, etc, so every time a string is copied, it grows
>by 16 bytes.

That's not true. Since we're copying and allocating based on the original length, we're not going to grow. However, the point is well-taken--having a version that allocates and returns the real length in the buffer's useful for strings. I'm going to add one in a minute here.

>This effect is exacerbated by the fact that "set S1, S2" does a
>string_copy - I am still not sure what is supposed to happen here; I believe
>that the pure set opcode should just be doing a register copy?? There is a
>clone opcode which also does a string_copy, which seems reasonable.

set S0, S1 is broken. I'm fixing that now.

--
Dan

--"it's like this"---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
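The arithmetic Peter describes can be worked through with a small sketch. It is written in Python and is not the code in resources.c: the two rounding functions and the copy loop are assumptions chosen to reproduce the numbers in his message (0 becomes 16, 16 becomes 32), set against a rounding rule that leaves already-aligned sizes alone.

```python
ALIGN = 16

def round_to_next_multiple(size):
    # Always moves to the *next* multiple, even when size is already aligned.
    return (size // ALIGN + 1) * ALIGN

def round_up_if_unaligned(size):
    # Leaves aligned sizes alone; only rounds up when there is a remainder.
    return (size + ALIGN - 1) // ALIGN * ALIGN

def repeated_copy_sizes(round_fn, copies, start=0):
    """Buffer lengths seen if each copy requests the previous buffer's length."""
    sizes = []
    buflen = round_fn(start)
    for _ in range(copies):
        sizes.append(buflen)
        buflen = round_fn(buflen)
    return sizes

growing = repeated_copy_sizes(round_to_next_multiple, 3)
stable = repeated_copy_sizes(round_up_if_unaligned, 3, start=5)
```

The growth only appears when a copy requests the rounded buffer length; if copies are sized from the original string length, as the reply says, the rounding is applied once and the size stays put.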
Re: Using Parrot
At 6:19 PM +0100 4/19/02, Alberto Manuel Brandão Simões wrote:
>But, this e-mail is not to say this, but to request some kind of help.
>I'm used to check-out, compile and test parrot, looked at the language
>(well, a long time ago) and I'm needing to look to it again. The
>question is, what documents do you think I should read to start quickly
>using Parrot? PDD's, any pod from Parrot cvs tree... any other thing?

Did you get sufficient information? If not, where have we left gaps?

FWIW, I'd love to be kept updated on the state of your project. A link to it from the parrotcode.org website would also be in order, I think. (I need to pass on info on Cardinal, the Ruby on Parrot project, soon)

--
Dan

--"it's like this"---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
Re: goto ADDRESS()
At 7:42 PM +0200 4/19/02, Marco Baringer wrote: >Dan Sugalski <[EMAIL PROTECTED]> writes: > >> Ah, this is incorrect. goto ADDRESS should go to an absolute address, >> period. It's for use in those times when you *have* an absolute >> address--for example when you've just fetched the address of a >> subroutine from a symbol table. > >but what do i put in the symbol table? The absolute address. >every address i have in the >symbol is, unless my understanding is severly flawed, determined at >compile time by the assembler and is relative to the start of the byte >code. The addresses are always absolute, and are determined at load or runtime. Branches are relative, and can be determined at compile time. >if i have code like (perl5 syntax to avoid confusion) > >my $f = sub { print "hello" }; >$f->(); > >i imagine that will become more or less: (in pseudo pasm) > >closure_000: > print "hello" > ret > >main: > set_sym P0, '$f', [closure_000] > fetch_sym I0, P0, '$f' > jsr I0 Something like that. But in this case, the set_sym (which'll probably be 'make_closure' or something) will store the absolute address into the symbol table or PMC in P0. > > Jumping from the start of the >> bytecode segment is an interesting idea, but since it's only valid >> when used to transfer control from within the current segment, you >> might as well just use goto OFFSET instead. > >sorry, but i don't understand. There's no real functional difference between "offset from current spot" and "offset from beginning of segment", so I'm unwilling to have two separate relative addressing modes. > > -- > >all i want to be able to do is: > >set I0, [whatever] > >jsr I0 > >or > >set I0, [wherever] > >jump I0 > >and as far as i know i can't currently do this. Right. There's currently no way to get the absolute address of a label. That should be fixed--in fact, it will be as soon as the current run of tests I've got going are done and checked in. 
(Okay, after they're done since they're for other things, but...) it really looks like you're trying to work around a deficiency in the current scheme. Better to fix the deficiency. :)

>post scriptum - did my patch to lib/Parrot/Assembler get lost in the haze or
>was there something wrong with it?

If it's not in, it got lost in the haze. Which assembler was it against?

>post post scriptum - i noticed a mention of #parrot in some email,
>which network is that on?

irc.rhizomatic.net.

-- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
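The distinction drawn in this thread (branches are relative and known at compile time, addresses are absolute and only known at load or run time) can be sketched in C. This is a rough illustration, not Parrot's code: the helper names `branch_relative`, `resolve_label` and `jump_absolute` are hypothetical, and `opcode_t` is assumed here to be a plain `long`.

```c
#include <stddef.h>

/* Assumption: an opcode slot is a long. Parrot's real typedef may differ. */
typedef long opcode_t;

/* Relative branch: the target is the current position plus an offset.
 * The assembler can compute the offset at compile time. */
static opcode_t *branch_relative(opcode_t *pc, ptrdiff_t offset)
{
    return pc + offset;
}

/* Turn a label's offset from the start of the segment into an absolute
 * address. This can only be done once the segment is loaded, because
 * only then is segment_start known. */
static opcode_t *resolve_label(opcode_t *segment_start, ptrdiff_t label_offset)
{
    return segment_start + label_offset;
}

/* Absolute jump: the operand already is the address to continue at,
 * for example one fetched from a symbol table. */
static opcode_t *jump_absolute(opcode_t *address)
{
    return address;
}
```

In this picture, what goes into the symbol table is the result of `resolve_label`, computed once at load time, and a later `jsr` or `jump` on a register holding that value behaves like `jump_absolute`.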
Re: [PATCH] Revised TODO list, again
At 1:10 PM -0700 4/19/02, Steve Fink wrote: >This one got dropped too, and maybe this isn't the right place for >this anymore. Applied. Sorry for the wait. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [netlabs #522] BASIC hangs and crashes, Win32 MSVC++, 0.0.5
On Tue, 23 Apr 2002, Dan Sugalski wrote: > At 12:25 PM +0200 4/19/02, Peter Gibbs wrote: > >Mike Lambert wrote: > > >This effect is exacerbated by the fact that "set S1, S2" does a > >string_copy - I am still not sure what is supposed to happen here; I believe > >that the pure set opcode should just be doing a register copy?? There is a > >clone opcode which also does a string_copy, which seems reasonable. > > set S0, S1 is broken. I'm fixing that now. And here's a test for that. (By the way, is there any way to test it more directly?) Simon --- t/op/string.t.old Tue Apr 23 15:42:36 2002 +++ t/op/string.t Tue Apr 23 15:49:40 2002 @@ -1,6 +1,6 @@ #! perl -w -use Parrot::Test tests => 76; +use Parrot::Test tests => 77; output_is( <<'CODE', <
Mutable vs immutable strings
Okay folks, time to hash this out once and for all. Should strings in parrot be mutable or immutable? Right now we've a mix, and that's untenable. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RE: Mutable vs immutable strings
Dan Sugalski: # Okay folks, time to hash this out once and for all. # # Should strings in parrot be mutable or immutable? Right now we've a # mix, and that's untenable. Three questions: 1. Which'll be faster? 2. Which'll be simpler? 3. Which is more important? --Brent Dax <[EMAIL PROTECTED]> @roles=map {"Parrot $_"} qw(embedding regexen Configure) #define private public --Spotted in a C++ program just before a #include
Re: [netlabs #522] BASIC hangs and crashes, Win32 MSVC++, 0.0.5[APPLIED]
At 3:55 PM -0400 4/23/02, Simon Glover wrote: >On Tue, 23 Apr 2002, Dan Sugalski wrote: > >> At 12:25 PM +0200 4/19/02, Peter Gibbs wrote: >> >Mike Lambert wrote: >> >> >This effect is exacerbated by the fact that "set S1, S2" does a >> >string_copy - I am still not sure what is supposed to happen >>here; I believe >> >that the pure set opcode should just be doing a register copy?? There is a >> >clone opcode which also does a string_copy, which seems reasonable. >> >> set S0, S1 is broken. I'm fixing that now. > > And here's a test for that. (By the way, is there any way to test it more > directly?) Applied, thanks. (And no, not at the moment) -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
[PATCH] Fix Read with new allocate_about
This should hopefully fix a problem Clint noticed with his LOAD bug, assuming he is using this op. The code was assuming that a string_make's passed len==buflen, which is no longer the case. Mike Lambert Index: core.ops === RCS file: /cvs/public/parrot/core.ops,v retrieving revision 1.126 diff -r1.126 core.ops 370c370 < s->bufused = s->buflen; --- > s->bufused = len;
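The bug fixed by this patch comes down to two different lengths: how many bytes were allocated and how many are in use. A minimal sketch in C, assuming an allocator that rounds requests up to a 16-byte granule. Only the field names `buflen` and `bufused` come from the patch; `sketch_string`, `round_up` and `sketch_string_make` are hypothetical stand-ins, not Parrot's API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string header; only buflen/bufused mirror the patch. */
typedef struct {
    char  *bufstart;
    size_t buflen;   /* bytes allocated, possibly more than requested */
    size_t bufused;  /* bytes that actually hold string data */
} sketch_string;

/* Assumed allocation policy: round every request up to 16 bytes. */
static size_t round_up(size_t len)
{
    return (len + 15) & ~(size_t)15;
}

static sketch_string *sketch_string_make(const char *data, size_t len)
{
    sketch_string *s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->buflen = round_up(len ? len : 1);
    s->bufstart = malloc(s->buflen);
    if (!s->bufstart) {
        free(s);
        return NULL;
    }
    if (data)
        memcpy(s->bufstart, data, len);
    /* The point of the fix: use the requested length, not the
     * allocated one, since the two no longer have to be equal. */
    s->bufused = len;
    return s;
}

static void sketch_string_free(sketch_string *s)
{
    if (s) {
        free(s->bufstart);
        free(s);
    }
}
```

With this policy a 5-byte request yields `buflen == 16` and `bufused == 5`, so code that copies `buflen` into `bufused` would report 11 bytes of garbage as string content.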
Re: [PATCH] Fix Read with new allocate_about [APPLIED]
At 4:54 PM -0400 4/23/02, Mike Lambert wrote: >This should hopefully fix a problem Clint noticed with his LOAD bug, >assuming he is using this op. The code was assuming that a string_make's >passed len==buflen, which is no longer the case. Applied, thanks. (BTW, could you use either -p or -u for patches? Makes patch happier) -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [PATCH] Make obscure.ops work
At 7:51 PM -0700 4/20/02, Chip Salzenberg wrote: >I realize that obscure.ops isn't a big deal, but why not make >it work? Thus this patch. This patch eliminates the versions >of the ops that accept integers, under the assumption that trig >on integers is extraordinarily silly. Applied, thanks. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Regex and Matched Delimiters
On Tue, 2002-04-23 at 12:48, Larry Wall wrote:
> Brent Dax writes:
> : # \t also
> : # \n also or (latter matching
> : logical newline)
> : # \r also
> : # \f also
> : # \a also
> : # \e also
> :
> : I can tell you right now that these are going to screw people up.
> : They'll try to use these in normal strings and be confused when it
> : doesn't work. And you probably won't be able to emit a warning,
> : considering how much CGI Perl munches.
>
> I can see pragmatic variants in which those *do* interpolate by default.
> And pragmatic variants where they don't.

If you put them in one, put them in the other. HOWEVER, there's a strong pragmatic reason for neither that I can see:

HTML/XML/SGML

I hate to say it, but if <> interpolates in everything cleanly with no overloading, the *ML camps will thank you deeply. How often I've written:

    qq{$content}

I cannot tell you, but it's large.

Why not use {} for this and add an {eval:code}?

> I'm just wondering how far I can drive the principle that {} is always
> a closure (even though it isn't). I admit that it's probably overkill
> here, which is why there are question marks.

I like the idea, but I don't think it fits. On the other hand, if inside all interpolating operators {} is the special thing that gets interpolated (and NOTHING else), I could see liking the new look:

    qq{a${x}b}              => qq{a{$x}b}
    qr{a\Q${x}\Eb$}         => qr{a{q:$x}b$}
    qr{a${x}b$}             => qr{a{$x}b$}
    q{a}.eval($x).q{b}      => qq{a{e:$x}b} or qq{a{{$x}}b}
    "ajs\@ajs.com"          => qq{[EMAIL PROTECTED]}
    "ajs". @{ajs} .".com"   => qq{ajs{@ajs}.com}

I know it's a departure from your original idea, but it certainly unifies the syntax nicely:

    qq{Hello, World!{nl}}
    qr{Hello, World!{nl}}

> With respect to Perl 5, I'm trying to unhijack curlies as much as possible.

Ooops :-)
Re: Mutable vs immutable strings
G'day all. On Tue, Apr 23, 2002 at 01:18:23PM -0700, Brent Dax wrote: > Three questions: > > 1. Which'll be faster? It depends on the application, but my money is on mutable strings built on top of an immutable buffer. That's based on looking at my own string-based Perl code, a lot of which is substring extraction (usually by regular expression). It may pay off if a string and its substrings can share implementation. > 2. Which'll be simpler? Immutable strings are definitely simpler, when you have garbage collection. > 3. Which is more important? At the risk of stating the obvious, it is more important for the interface to be complete. Cheers, Andrew Bromage
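The "mutable strings built on top of an immutable buffer" idea, where a string and its substrings share one implementation, can be sketched in C as copy-on-write views. This is only an illustration under stated assumptions: all names (`shared_buf`, `cow_string`, `cow_substr`, and so on) are hypothetical, and a reference count stands in for the garbage collector the thread has in mind.

```c
#include <stdlib.h>
#include <string.h>

/* Buffer contents are never modified while more than one view exists. */
typedef struct {
    size_t refcount;
    size_t len;
    char   bytes[];          /* C99 flexible array member */
} shared_buf;

/* A string is a window (offset, len) onto a shared buffer. */
typedef struct {
    shared_buf *buf;
    size_t      offset;
    size_t      len;
} cow_string;

static cow_string cow_new(const char *data, size_t len)
{
    cow_string s = { NULL, 0, 0 };
    shared_buf *b = malloc(sizeof *b + len);
    if (!b)
        return s;            /* empty string signals failure */
    b->refcount = 1;
    b->len = len;
    memcpy(b->bytes, data, len);
    s.buf = b;
    s.len = len;
    return s;
}

/* Substring extraction is O(1): no bytes are copied. */
static cow_string cow_substr(cow_string s, size_t start, size_t len)
{
    cow_string sub = { NULL, 0, 0 };
    if (!s.buf || start > s.len || len > s.len - start)
        return sub;
    s.buf->refcount++;
    sub.buf = s.buf;
    sub.offset = s.offset + start;
    sub.len = len;
    return sub;
}

static void cow_release(cow_string *s)
{
    if (s->buf && --s->buf->refcount == 0)
        free(s->buf);
    s->buf = NULL;
    s->offset = 0;
    s->len = 0;
}

/* Mutation copies first if anyone else can see the buffer.
 * Returns 0 on success, -1 on a bad index or allocation failure. */
static int cow_set_byte(cow_string *s, size_t index, char c)
{
    if (index >= s->len)
        return -1;
    if (s->buf->refcount > 1) {
        cow_string copy = cow_new(s->buf->bytes + s->offset, s->len);
        if (!copy.buf)
            return -1;
        s->buf->refcount--;
        *s = copy;
    }
    s->buf->bytes[s->offset + index] = c;
    return 0;
}
```

The trade-off matches the one raised in the thread: substring-heavy code (regex captures, for instance) gets cheap extraction, at the price of a copy on the first write to a shared string.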
RE: Mutable vs immutable strings
Andrew J Bromage: # On Tue, Apr 23, 2002 at 01:18:23PM -0700, Brent Dax wrote: # # > Three questions: # > # > 1. Which'll be faster? # # It depends on the application, but my money is on mutable # strings built on top of an immutable buffer. That's based on # looking at my own string-based Perl code, a lot of which is # substring extraction (usually by regular expression). It may # pay off if a string and its substrings can share implementation. That's what I thought. # > 2. Which'll be simpler? # # Immutable strings are definitely simpler, when you have # garbage collection. # # > 3. Which is more important? # # At the risk of stating the obvious, it is more important for # the interface to be complete. The interface can be complete either way. It's how fast the code behind the interface is that'll vary. --Brent Dax <[EMAIL PROTECTED]> @roles=map {"Parrot $_"} qw(embedding regexen Configure) #define private public --Spotted in a C++ program just before a #include
[PATCH] Remove prederef's reliance on shared libraries
This is a rather clumsy patch to make prederef mode work without needing to be compiled as a shared library. In fact, it prevents it from being used as a shared library (but it's trivial to revert to the former behavior; see the patch.) Anyone who wishes is welcome to figure out exactly what's going on with shared oplibs. I believe prederef mode contains a valiant start at them, but at the moment the inability to compare all three different modes of operation with a single binary is just an annoyance. (And that 3 should really be 4; the computed goto should just be another option IMHO.) Index: config_h.in === RCS file: /home/perlcvs/parrot/config_h.in,v retrieving revision 1.26 diff -u -r1.26 config_h.in --- config_h.in 22 Mar 2002 18:06:46 - 1.26 +++ config_h.in 24 Apr 2002 03:32:37 - @@ -55,6 +55,7 @@ #define PARROT_CORE_OPLIB_NAME "core" #define PARROT_CORE_OPLIB_INIT Parrot_DynOp_core_${MAJOR}_${MINOR}_${PATCH} +#define PARROT_CORE_PREDEREF_OPLIB_INIT +Parrot_DynOp_core_prederef_${MAJOR}_${MINOR}_${PATCH} #define INTVAL_FMT "${intvalfmt}" #define FLOATVAL_FMT "${floatvalfmt}" Index: interpreter.c === RCS file: /home/perlcvs/parrot/interpreter.c,v retrieving revision 1.84 diff -u -r1.84 interpreter.c --- interpreter.c 15 Apr 2002 18:05:18 - 1.84 +++ interpreter.c 24 Apr 2002 03:32:38 - @@ -104,7 +104,7 @@ static op_func_t *prederef_op_func = NULL; static void -init_prederef(struct Parrot_Interp *interpreter) +init_prederef(struct Parrot_Interp *interpreter, BOOLVAL dynamic) { char file_name[50]; char func_name[50]; @@ -122,9 +122,9 @@ * Get a handle to the library file: */ -prederef_oplib_handle = Parrot_dlopen(file_name); +if (dynamic) prederef_oplib_handle = Parrot_dlopen(file_name); -if (!prederef_oplib_handle) { +if (dynamic && !prederef_oplib_handle) { internal_exception(PREDEREF_LOAD_ERROR, "Unable to dynamically load oplib file '%s' for oplib '%s_prederef' version %s!\n", file_name, PARROT_CORE_OPLIB_NAME, PARROT_VERSION); @@ -134,9 +134,14 @@ * Look up 
the init function: */ -prederef_oplib_init = -(oplib_init_f)(ptrcast_t)Parrot_dlsym(prederef_oplib_handle, - func_name); +if (dynamic) { +prederef_oplib_init = +(oplib_init_f)(ptrcast_t)Parrot_dlsym(prederef_oplib_handle, + func_name); +} else { +extern op_lib_t * PARROT_CORE_PREDEREF_OPLIB_INIT(void); +prederef_oplib_init = PARROT_CORE_PREDEREF_OPLIB_INIT; +} if (!prederef_oplib_init) { internal_exception(PREDEREF_LOAD_ERROR, @@ -202,13 +207,13 @@ */ static void -stop_prederef(void) +stop_prederef(BOOLVAL dynamic) { prederef_op_func = NULL; prederef_op_info = NULL; prederef_op_count = 0; -Parrot_dlclose(prederef_oplib_handle); +if (dynamic) Parrot_dlclose(prederef_oplib_handle); prederef_oplib = NULL; prederef_oplib_init = (oplib_init_f)NULLfunc; @@ -371,7 +376,7 @@ code_start_prederef = pc_prederef; -init_prederef(interpreter); +init_prederef(interpreter, 0); while (pc_prederef) { pc_prederef = @@ -379,7 +384,7 @@ interpreter); } -stop_prederef(); +stop_prederef(0); if (pc_prederef == 0) { pc = 0; Index: docs/running.pod === RCS file: /home/perlcvs/parrot/docs/running.pod,v retrieving revision 1.7 diff -u -r1.7 running.pod --- docs/running.pod25 Mar 2002 18:41:51 - 1.7 +++ docs/running.pod24 Apr 2002 03:32:38 - @@ -27,8 +27,12 @@ That's because we use fixed address for registers, this problem will be solved soon. -Prederef mode only works as a shared library. For example, on most -Unix platforms: +Prederef mode should work for all programs. + +It previously only worked as a shared library. To revert to that +state, find the calls to init_prederef and stop_prederef in +interpreter.c, and pass a true value instead of zero as the sole +argument. Then, on most Unix platforms: make clean make shared
Another [PATCH]: allow deactivating computed goto
On Tue, Apr 23, 2002 at 08:54:56PM -0700, Steve Fink wrote: > (And that 3 should really be 4; the computed goto should > just be > another option IMHO.) Maybe not so humble: here's a patch to disable the default computed goto core, so you can compare all four cores (assuming the previous patch is applied.) One weirdness I encountered: #define setopt(flag) Parrot_setflag(interpreter, flag, (*argv)[0]+2); What the heck does this do? Parrot_setflag uses its 3rd argument only as a boolean value. Where this is used, argv[0] always contains the current command-line argument. So this is equivalent to argv[0][0]+2 or in the example of "-p", that would be the character '-' + 2. Now, to make that do something, you'd need the first character of the option to be -2, and that's some weird hi-bit character. Huh? Index: include/parrot/interpreter.h === RCS file: /home/perlcvs/parrot/include/parrot/interpreter.h,v retrieving revision 1.40 diff -u -r1.40 interpreter.h --- include/parrot/interpreter.h3 Apr 2002 04:01:41 - 1.40 +++ include/parrot/interpreter.h24 Apr 2002 03:58:02 - @@ -23,7 +23,8 @@ PARROT_BOUNDS_FLAG = 0x04, /* We're tracking byte code bounds */ PARROT_PROFILE_FLAG = 0x08, /* We're gathering profile information */ PARROT_PREDEREF_FLAG = 0x10, /* We're using the prederef runops */ -PARROT_JIT_FLAG = 0x20 /* We're using the jit runops */ +PARROT_JIT_FLAG = 0x20, /* We're using the jit runops */ +PARROT_CGOTO_FLAG= 0x40 /* We're using the computed goto runops */ } Interp_flags; #define Interp_flags_SET(interp, flag) (/*@i1@*/ (interp)->flags |= (flag)) Index: include/parrot/runops_cores.h === RCS file: /home/perlcvs/parrot/include/parrot/runops_cores.h,v retrieving revision 1.4 diff -u -r1.4 runops_cores.h --- include/parrot/runops_cores.h 4 Mar 2002 03:17:21 - 1.4 +++ include/parrot/runops_cores.h 24 Apr 2002 03:58:03 - @@ -20,6 +20,8 @@ opcode_t *runops_fast_core(struct Parrot_Interp *, opcode_t *); +opcode_t *runops_cgoto_core(struct Parrot_Interp *, opcode_t *); + 
opcode_t *runops_slow_core(struct Parrot_Interp *, opcode_t *); #endif Index: interpreter.c === RCS file: /home/perlcvs/parrot/interpreter.c,v retrieving revision 1.84 diff -u -r1.84 interpreter.c --- interpreter.c 15 Apr 2002 18:05:18 - 1.84 +++ interpreter.c 24 Apr 2002 03:58:04 - @@ -420,7 +425,12 @@ which |= (Interp_flags_TEST(interpreter, PARROT_PROFILE_FLAG)) ? 0x02 : 0x00; which |= (Interp_flags_TEST(interpreter, PARROT_TRACE_FLAG)) ? 0x04 : 0x00; -core = which ? runops_slow_core : runops_fast_core; +if (which) +core = runops_slow_core; +else if (Interp_flags_TEST(interpreter, PARROT_CGOTO_FLAG)) +core = runops_cgoto_core; +else +core = runops_fast_core; if (Interp_flags_TEST(interpreter, PARROT_PROFILE_FLAG)) { unsigned int i; Index: runops_cores.c === RCS file: /home/perlcvs/parrot/runops_cores.c,v retrieving revision 1.17 diff -u -r1.17 runops_cores.c --- runops_cores.c 17 Apr 2002 03:50:25 - 1.17 +++ runops_cores.c 24 Apr 2002 03:58:04 - @@ -30,12 +30,29 @@ opcode_t * runops_fast_core(struct Parrot_Interp *interpreter, opcode_t *pc) { -#ifdef HAVE_COMPUTED_GOTO -pc = cg_core(pc, interpreter); -#else while (pc) { DO_OP(pc, interpreter); } +return pc; +} + +/*=for api interpreter runops_cgoto_core + * run parrot operations until the program is complete, using the computed + * goto core (if available). + * + * No bounds checking. + * No profiling. + * No tracing. 
+ */ + +opcode_t * +runops_cgoto_core(struct Parrot_Interp *interpreter, opcode_t *pc) +{ +#ifdef HAVE_COMPUTED_GOTO +pc = cg_core(pc, interpreter); +#else +fprintf(stderr, "Computed goto unavailable in this configuration.\n"); +exit(1); #endif return pc; } Index: test_main.c === RCS file: /home/perlcvs/parrot/test_main.c,v retrieving revision 1.50 diff -u -r1.50 test_main.c --- test_main.c 26 Mar 2002 16:33:01 - 1.50 +++ test_main.c 24 Apr 2002 04:01:25 - @@ -14,6 +14,7 @@ #include #define setopt(flag) Parrot_setflag(interpreter, flag, (*argv)[0]+2); +#define unsetopt(flag) Parrot_setflag(interpreter, flag, 0) char *parseflags(Parrot interpreter, int *argc, char **argv[]); @@ -62,6 +63,10 @@ (*argc)--; (*argv)++; +#ifdef HAVE_COMPUTED_GOTO +setopt(PARROT_CGOTO_FLAG); +#endif + while ((*argc) && (*argv)[0][0] == '-') { switch ((*argv)[0][1]) { case 'b': @@ -76,6 +81
Using closures for regex control
Larry said:
> I haven't decided yet whether matches embedded in
> [a regex embedded] closure should automatically pick
> up where the outer match is, or whether there should
> be some explicit match op to mean that, much like \G
> only better. I'm thinking when the current topic is a
> match state, we automatically continue where we left
> off, and require explicit =~ to start an unrelated match.

So, this might DWIM:

    # match pat1 _ pat2 _ pat3 and capture pat2 match:
    / pat1 { ($foo) = / pat2 / } pat3 /

What is the meaning of a string returned by some code inside a regex? Would this DWIM:

    # match pat1 _ 'foo bar' _ pat2:
    / pat1                    # white space is ignored
      { return 'foo bar' }    # conserve whitespace
      pat2 /

What if there were methods on the match state to achieve regex extensions:

    s/ { .<; /c/ } ei / ie /; # weird look behind?

and so on:

    / pat1 { .>; /pat2/ } pat3 /
    / { .! and .<; /pat1/ } pat2 } /

-- ralph
Re: Regex and Matched Delimiters
> : I'd expect . to match newlines by default. For a . that > : didn't match newlines, I'd expect to need to use [^\n]. > > But . has never matched newlines by default, not even in grep. Perhaps. But: First, I would have thought you *can't* make . match newlines in grep, period. If so, then when perl is handling a multi-line string, it is handling a case grep never encounters. Second, I think the perl 5 default is the wrong one from the point of view of a typical newbie's guess. Third, I was thinking that having perl 6 regexen have /s on by default would be easy for perl 5 coders to understand; not too hard to get used to; and have no negative effects for existing coders beyond getting used to the change. -- ralph
Re: Regex and Matched Delimiters
> > : I'd expect . to match newlines by default. I forgot, fourth, this simplifies the rule for . -- it would become period matches any char, period. Fifth, it makes the writing of "match anything but newline" into an explicit [^\n], which I consider a good thing. Of course, all this is minor stuff. But I can't get my head around parse trees and grammars, so I'll continue to fiddle around spraying a bit of graffiti here and there on the bikeshed. -- ralph
Re: Regex and Matched Delimiters
On Tue, Apr 23, 2002 at 11:11:28PM -0500, Me wrote: > Third, I was thinking that having perl 6 regexen have /s on > by default would be easy for perl 5 coders to understand; > not too hard to get used to; and have no negative effects > for existing coders beyond getting used to the change. I'm jumping in the middle of a conversation here, but consider the problem of .* matching newlines by default and greediness. /(foo.*)$/, /(foo.*)$/m and /(foo.*)$/s when matching against something like "foo\nwiffle\nbarfoo\n" One matches the last line. One matches the first line. And one matches all three lines. -- Michael G. Schwern <[EMAIL PROTECTED]>http://www.pobox.com/~schwern/ Perl Quality Assurance <[EMAIL PROTECTED]> Kwalitee Is Job One Consistency? I'm sorry, Sir, but you obviously chose the wrong door. -- Jarkko Hietaniemi in <[EMAIL PROTECTED]>
Re: Regex and Matched Delimiters
> when matching against something like "foo\nwiffle\nbarfoo\n"

>/(foo.*)$/ # matches the last line

/(foo[^\n]*)$/ # assuming perl 6 meaning of $, end of string

>/(foo.*)$/m # matches the first line

/(foo[^\n]*)$$/ # assuming perl 6 meaning of $$, end of line

or

/(foo.*?)$$/

>/(foo.*)$/s # matches all three lines

/(foo.*)$/

-- ralph
[CONFIGURE] New make.pl coming soon...
In between attempting to get the new assembler up and running (currently dealing with XS issues), Robert Spiers and I have come up with a new make mechanism. The syntax may change, and the build mechanism has a ways to go (it's simply running one step at a time in order, no parallelism or multiple processes), but the basic idea so far seems sound.

Our replacement for the somewhat non-portable make mechanism relies entirely on perl, and so far, no external modules are required. The current version makes allowances for Win32 issues, but has not been tested on Win32 yet.

Make.pl is a simple perl script that builds a dependency graph and satisfies a single target recursively. The Makefile it seeks to emulate follows:

--cut here--
foo.o: foo.c
	cc -c foo.c
bar.o: bar.c
	cc -c bar.c
foo: foo.o bar.o
	cc -o foo foo.o bar.o
--cut here--

This, of course, builds the binary 'foo' from the source files 'foo.c' and 'bar.c'. So far, with the exception of the need to explicitly declare a target, it's a one-to-one translation of the makefile. The perl translation of the Makefile above follows (with comments):

--cut here--
# Create a compile target for 'foo.o'. "Compile targets" are actually objects
# that can be queried to determine if they've been completed or not, and asked
# how they want to be built.
# Although the compile target is named 'CC' this, of course, doesn't mean that
# 'cc' is used. The compiler name and arguments are determined based on the
# current platform.
my $foo_o = CC(
  # The 'Object()' syntax here simply takes into account the fact that
  # platforms such as Win32 require different extensions than UNIX.
  # The target is currently declared explicitly, but given the fact that we
  # already have the input file name, it would be easy to determine the
  # output file name should we not want to declare 'output' explicitly.
  output => Object( input => 'foo' ),
  # The input is given an explicit file extension to accommodate the fact
  # that the user may want to submit a .C file or .ch file to the compiler.
  # Should we not need that flexibility, the extension can be removed.
  input => 'foo.c', # foo.o: foo.c
  # Dependencies are either file names or objects, and as we'll see later on
  # can be arrays of these as well.
  dependsOn => 'foo.c', # cc -c foo.c
);

my $bar_o = CC(
  output => Object( input => 'bar' ),
  input => 'bar.c', # bar.o: bar.c
  dependsOn => 'bar.c', # cc -c bar.c
);

# The link statement is also platform-sensitive, and introduces another
# platform-dependent directive, 'Executable'. This, of course, accounts for
# the difference between Win32's '.exe' and UNIX's '' extension for file
# names. In due course, the file name here should be stated without extension
# of any kind.
my $foo_exe = Link(
  output => Executable( input => 'foo' ),
  # Any directive can take an anonymous array of inputs, and they'll be
  # handled in the order declared. The Object declaration returns the same
  # name every time, so we could conceivably cache these in scalars as well,
  # but to be pedantic I'm declaring them each time.
  input => [ Object(input=>'foo'), Object(input=>'bar') ], # foo: foo.o bar.o
  # And this can be dependent upon one or more files as well.
  # Take special note here that we're depending upon two *objects*, not files.
  # When time comes to determine if the link target needs to be done, the
  # script looks through these dependencies to see if 'foo.o' needs to be
  # rebuilt as well as comparing timestamps with 'foo.exe' to see if the link
  # needs to be rebuilt.
  dependsOn => [$foo_o, $bar_o], # cc -o foo foo.o bar.o
);

# This is simply a directive that explicitly declares that 'foo' is a target.
# Here, 'foo' is not an executable but a target name, that will be looked up
# when 'make.pl foo' is run.
$depends->{foo} = Target(
  input => 'foo',
  dependsOn => $foo_exe,
);
--cut here--

This project is currently very much in an alpha state, but on my system it does handle the very limited set of dependencies you see here very well. Deleting files and rerunning make.pl does the right thing, as well as touch'ing files. I haven't expanded the makefile beyond what you see here yet, although once I'm certain of the foundation the number of available directives will grow, as will such incidentals as documentation and more error detection.

Unlike the UNIX make tool, this application executes the required actions in linear order in a single process, letting each action complete before the next one is allowed to begin. In other words, no parallelism and no determining of which actions can be allowed to proceed in parallel without worries of race conditions.

I'll commit this later in the week once I make two major alterations to the structure. The code as it stands builds the list of actions to be taken in parallel with checking the graph to determine whether a particular action needs to be taken or not. T