Re: slaughter of the LTM metatokens
On Thu, Jan 29, 2009 at 10:58:18PM -0800, Mark Lentczner wrote:
>> [STD, S03] slaughter of the LTM metatokens
>
> This cleans up the metaop scene quite a bit. Bravo!
>
> I went through STD.pm with a fine tooth comb again, to extract what I'd
> say about which operators were allowed to be meta'd by each given
> metaop:
>
> (The notation "foo --> bar" means, takes an operator of type foo and
> makes an operator of type bar)
>
> op= infix --> infix
>     op can't be :assoc
>     op must be :assign
>
> ** The first test is excessive: There are no non-associative operators
> that have the :assign property. Hence, testing for :assign should be
> enough. Besides, I'm not sure what about this metaop should require
> non-associativity from its internal operator.

The purpose of the 'non' test is to give the user a better error
message than "You can't do that". The reason that non-associatives
don't work for assignment ops is that non-associatives typically
return a different type than they take as arguments, and so an
assignment op based on a non-associative is going to be changing the
type of the target strangely, which seems more likely to be a braino
than intended.

> !op infix --> infix
>     op must not start with '!'
>     op must return 'bool', these are the chaining ops and %
>     or op can be '='
>
> ** Declaring that infix:sym<%> is :returns just so this works
> seems a bit ugly to me. Why not simply define infix:sym?

Well, we may open that up a bit more, as discussed on irc, and not
tie it to official Bool return in any case, since in Perl anything
may be used as a boolean. The problem with defining an override
including the metaop is that, 1), you don't get the automatic benefit
of whatever autogen the metaop invokes, and 2), it doesn't work right
if you have, say, a user-defined %% operator.

> ** Why is the test for .text eq '=' in there? infix:sym is already
> defined.

Yes, but the != rule currently requires whitespace after it, which is
suboptimal. If we don't require the whitespace, then it misparses
!== and !===, because the LTM will call != before it calls !. So most
of these special forms need to move into the metaoperator and check
the result after trying for a compile infix, as the '=' is doing.
I had to work around problems with !~ and +< and ~< as well, which
caused ambiguity with !~~, >>+<<, and >>~<<. There's a penalty for
not running all metaops through LTM, which is that all the strange
forms need to be recognized by the metaop itself. Haven't quite got
that all straight yet in STD.

> Rop infix --> infix
>     op must not start with '='
>
> ** The restriction seems unneeded and has odd side-effects. It is true
> that there is no reason to allow R== R=:= or R===, since they are the same
> as their non-reversed selves. But the restriction means that:
>     Rp5=> is an operator, but R=> is not
>     R:= is an operator, but R= is not
> In the first case, I'd imagine I'd want them both to be. In the second,
> I imagine I'd want neither -- but not sure I care all that much.

Ah, that was a fossil from when we were disallowing -= and - was
allowed only on comparisons. It makes no sense since - changed to R
and was generalized to argument reversal.

> [op] infix --> prefix
>     op can't be :assoc('non')
>     op can't have same precedence as %conditional
>
> ** The second test should probably be can't be :assoc('chain')
> ** Really, the restriction on the operator is that it has to return a
> type that is compatible with being on its left side for a given type on
> its right. There are things that are currently allowed here, like [:=]
> or [=>] that may not make much sense. Perhaps there should be :reduce
> like there is :assign to indicate which are

Or the veto forms of that, as we discussed on irc.

Anyway, not really trying to restrict the freedom of people to shoot
themselves in the foot so much as just hoping to be able to warn them
why their foot is about to go missing with a reasonable error message
when something is obviously misthought.

And I obviously ought to go to bed before I write much more
stream-of-unconsciousness prose...

Larry
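
The !== misparse described above can be sketched with a toy longest-token matcher. This is Python rather than anything from STD.pm; the token lists and the tokenize helper are assumptions made purely for illustration, and real LTM in a Perl 6 grammar is considerably more involved.

```python
# Toy longest-token matcher. The token tables below are illustrative
# assumptions, not STD.pm's actual rules.

def longest_token(text, pos, tokens):
    """Return the longest token in `tokens` that matches `text` at `pos`."""
    matches = [t for t in tokens if text.startswith(t, pos)]
    return max(matches, key=len) if matches else None


def tokenize(text, tokens):
    """Split `text` greedily, always taking the longest matching token."""
    out, pos = [], 0
    while pos < len(text):
        tok = longest_token(text, pos, tokens)
        if tok is None:
            raise ValueError(f"no token matches at position {pos}")
        out.append(tok)
        pos += len(tok)
    return out


# With "!=" as a token of its own, the longest match wins and strands "=".
with_ne = tokenize("!==", ["!", "!=", "==", "===", "="])

# Without it, "!" is taken as the metaop prefix and "==" as the infix it negates.
without_ne = tokenize("!==", ["!", "==", "===", "="])

print(with_ne)     # ['!=', '=']
print(without_ne)  # ['!', '==']
```

The second call is the behaviour the message argues for: the metaoperator sees the whole infix and handles the special '=' form itself instead of competing with a separate != token.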
Re: r25102 - docs/Perl6/Spec
Mark (>), Moritz (>>), Larry via commit bot (>>>):
>>> +PERL# Lexical symbols in the standard "perlude"
>>
>> Did you mean "prelude" instead?
>
> I took the quotation marks to indicate an intentional
> misspelling/coinage: "perl" + "prelude" = "perlude".

At which point one might ask oneself whether it is more important that
the synopses be amusing and punny, or that they clearly specify what
is expected of a conforming Perl 6 implementation.

Now, just so you don't think I'm all cranky and humour-impaired: I got
the pun, I smiled a bit at it -- but I already know what a "standard
prelude" is. Those who don't are going to be confused in two ways when
they read the above, making the explanatory comment essentially
useless.

// Carl
Re: r25122 - docs/Perl6/Spec
pugs-comm...@feather.perl6.nl wrote:
> In the abstract, Perl is written in Unicode, and has consistent Unicode
> -semantics regardless of the underlying text representations.
> +semantics regardless of the underlying text representations. By default
> +Perl presents Unicode in "NFG" formation, where each grapheme counts as
> +one character. A grapheme is what the novice user would think of as a
> +character in their normal everyday life, including any diacritics.

What's with this NFG / Normal Form G that you refer to? I don't see any
mention of that in http://unicode.org/reports/tr15/ ... did you mean NFC?

For that matter, is it possible for all realistic combinations of
diacritics and base letters to be represented by a single Unicode
codepoint, including all language-dependent graphemes? I thought NFC
sort of did one codepoint per grapheme but there were a few exceptions
... I could be wrong on that point.

-- Darren Duncan
Re: r25122 - docs/Perl6/Spec
On Fri, Jan 30, 2009 at 6:30 AM, Darren Duncan wrote:
> pugs-comm...@feather.perl6.nl wrote:
>>
>> By default Perl presents Unicode in "NFG" formation, where each grapheme
>> counts as
>> one character. A grapheme is what the novice user would think of as a
>> character in their normal everyday life, including any diacritics.
>
> What's with this NFG / Normal Form G that you refer to? I don't see any
> mention of that in http://unicode.org/reports/tr15/ ... did you mean NFC?

As far as I can tell, NFG isn't an official Unicode Normalization Form;
it's an HLL thing, and it has nothing to do with code points. When you
ask Perl 6 for one "character", what you get back (by default) is one
"grapheme" - presumably as defined by UAX #29 - which may be one or more
code points, and who knows how many bytes it winds up encoded as in
memory. AppleScript 2.0 takes this approach as well.

So are there any non-opaque, non-string grapheme representations? Does
ord() work on them? In AS, the equivalent function is allowed to return
a list of numbers instead of just a single number; in either case, the
value can be passed to the chr() equivalent to get the same grapheme
back.

> For that matter, is it possible for all realistic combinations of diacritics
> and base letters to be represented by a single Unicode codepoint, including
> all language-dependent graphemes?

Absolutely not. Again, nobody said anything about "code points". We're
talking about Perl 6's idea of "characters".

-- Mark J. Reed
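
The distinction between code points and graphemes drawn above can be checked with a short Python sketch using the standard unicodedata module. The nfc_length helper and the two sample strings are chosen here for illustration only; neither comes from the thread.

```python
import unicodedata


def nfc_length(s):
    """Number of code points in `s` after NFC normalization."""
    return len(unicodedata.normalize("NFC", s))


e_acute = "e\u0301"      # e + COMBINING ACUTE ACCENT
q_diaeresis = "q\u0308"  # q + COMBINING DIAERESIS

# NFC composes the first pair into the single code point U+00E9.
print(nfc_length(e_acute))      # 1

# Unicode has no precomposed q-with-diaeresis, so NFC leaves two code points
# even though a reader sees a single character.
print(nfc_length(q_diaeresis))  # 2
```

Both strings are one grapheme to a human reader, which is why a per-grapheme notion of "character" cannot be reduced to NFC alone.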
Re: r25102 - docs/Perl6/Spec
On Fri, Jan 30, 2009 at 10:49:13AM +0100, Carl Mäsak wrote:
: Mark (>), Moritz (>>), Larry via commit bot (>>>):
: >>> +PERL# Lexical symbols in the standard "perlude"
: >>
: >> Did you mean "prelude" instead?
: >
: > I took the quotation marks to indicate an intentional
: > misspelling/coinage: "perl" + "prelude" = "perlude".
:
: At which point one might ask oneself whether it is more important that
: the synopses be amusing and punny, or that they clearly specify what
: is expected of a conforming Perl 6 implementation.
:
: Now, just so you don't think I'm all cranky and humour-impaired: I got
: the pun, I smiled a bit at it -- but I already know what a "standard
: prelude" is. Those who don't are going to be confused in two ways when
: they read the above, making the explanatory comment essentially
: useless.

You must understand that part of the reason I wrote that is to remind
folks that we're *not* talking about a standard prelude here. The
prelude metaphor says that it's something that comes before your
program, but that's not what we want. We want something that comes
outside your program, that is, a lexical scope that *surrounds* the
file scope. We don't have a good word for that: circumlude? ambilude?
So that's why I said "perlude". Well, that, and it was a pun. :)

The concept here is that any lexical scope can parse a token that
says "snapshot me here at this depth", and then there's a mechanism
for inserting the new main program in that lexical scope at startup.
It not only gives us the standard outerlude, but allows us to start
up the parser in any language we care to specify by snapshot name.
Special cases might even have their own switches, which is why S19
talks about implementing -n and -p by substituting a different
prelude. But then it's not just a prelude, because it's supplying
an implicit loop around the main code as part of the definition of
the language you're using.

So I'm open to suggestions for what we ought to call that envelope
if we don't call it the prelude or the perlude. Locale is bad,
environs is bad, context is bad...the wrapper? But we have dynamic
wrappers already, so that's bad. Maybe the setting, like a jewel?
That has a nice static feeling about it at least, as well as a feeling
of surrounding.

Or we could go with a more linguistic contextual metaphor. Argot,
lingo, whatever...

So anyway, just because other languages call it a prelude doesn't
mean that we have to. Perl is the tail that's always trying to
wag the dog...

What is the sound of one tail wagging?

Larry
Re: r25102 - docs/Perl6/Spec
On Fri, Jan 30, 2009 at 11:30 AM, Larry Wall wrote:
> We want something that comes
> outside your program, that is, a lexical scope that *surrounds* the
> file scope. We don't have a good word for that: circumlude? ambilude?
>[...]
> Or we could go with a more linguistic contextual metaphor. Argot,
> lingo, whatever...

If we're being all linguistical, how about "circumlect"?

-- Mark J. Reed
Re: r25102 - docs/Perl6/Spec
Larry Wall wrote:
> So I'm open to suggestions for what we ought to call that envelope
> if we don't call it the prelude or the perlude. Locale is bad,
> environs is bad, context is bad...the wrapper? But we have dynamic
> wrappers already, so that's bad. Maybe the setting, like a jewel?
> That has a nice static feeling about it at least, as well as a feeling
> of surrounding.
>
> Or we could go with a more linguistic contextual metaphor. Argot,
> lingo, whatever...
>
> So anyway, just because other languages call it a prelude doesn't
> mean that we have to. Perl is the tail that's always trying to
> wag the dog...
>
> What is the sound of one tail wagging?

whoosh, whoosh.

I tend to like "setting", because it makes me think of the setting of a
play, in which the actors (i.e., objects) perform their assigned roles
in following the script.

-- Jonathan "Dataweaver" Lang
Re: r25102 - docs/Perl6/Spec
On Fri, Jan 30, 2009 at 08:30:25AM -0800, Larry Wall wrote:
> So anyway, just because other languages call it a prelude doesn't
> mean that we have to. Perl is the tail that's always trying to
> wag the dog...
>
> What is the sound of one tail wagging?

For my dog Sally, the sound of one tail wagging is regularly used to
indicate that she believes I'm in desperate need of taking her on a
walk.

Pm
Re: r25122 - docs/Perl6/Spec
On Fri, Jan 30, 2009 at 03:30:02AM -0800, Darren Duncan wrote:
> pugs-comm...@feather.perl6.nl wrote:
>> In the abstract, Perl is written in Unicode, and has consistent Unicode
>> -semantics regardless of the underlying text representations.
>> +semantics regardless of the underlying text representations. By default
>> +Perl presents Unicode in "NFG" formation, where each grapheme counts as
>> +one character. A grapheme is what the novice user would think of as a
>> +character in their normal everyday life, including any diacritics.
>
> What's with this NFG / Normal Form G that you refer to? I don't see any
> mention of that in http://unicode.org/reports/tr15/ ... did you mean NFC?

Nope, this is a Perl/Parrot idea. It started out with a notion of
mine a year ago. Search for 'grapheme' in

    http://use.perl.org/~chromatic/journal/35461

We named it NFG about the time Simon Cozens wrote a PDD for it for
parrot. At the moment it's much better specced in Parrotland than
in P6land. See

    http://www.parrotcode.org/docs/pdd/pdd28_strings.html

NFG stands for Normalization Form G, where the G is short for
"grapheme". And before anyone asks, yes, we were aware of the other
gloss for NFG when we picked it. :)

> For that matter, is it possible for all realistic combinations of
> diacritics and base letters to be represented by a single Unicode
> codepoint, including all language-dependent graphemes?

No, that is the vision of NFC, but there are potentially an infinite
number of graphemes that can be composed in Unicode. NFG aims to
represent each of those locally as a single integer, and translate
back out to a more standard normalization form on output.

> I thought NFC sort of did one codepoint per grapheme but there were a few
> exceptions ... I could be wrong on that point.

You are correct, NFC doesn't do all that we want.

By the way, we could use someone to write the Perl 6 Unicode synopsis,
based on PDD 28.

Larry
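
The idea of representing each grapheme locally as a single integer and translating back out on output can be sketched in Python as follows. This is an illustration under stated assumptions, not PDD 28's design: the GraphemeTable class, the use of negative numbers for synthetic ids, and the "base character plus following combining marks" segmentation are all simplifications chosen here (real grapheme boundaries follow UAX #29).

```python
import unicodedata


class GraphemeTable:
    """Hypothetical NFG-style table: one integer per grapheme."""

    def __init__(self):
        self._ids = {}       # tuple of code points -> synthetic (negative) id
        self._clusters = []  # index n holds the cluster for id -(n + 1)

    def encode(self, text):
        """Return one integer per grapheme of `text`."""
        clusters = []
        for ch in unicodedata.normalize("NFC", text):
            # Simplified segmentation: a combining mark joins the previous cluster.
            if clusters and unicodedata.combining(ch):
                clusters[-1].append(ord(ch))
            else:
                clusters.append([ord(ch)])
        return [self._intern(tuple(c)) for c in clusters]

    def _intern(self, cps):
        """Single code points stand for themselves; longer clusters get an id."""
        if len(cps) == 1:
            return cps[0]
        if cps not in self._ids:
            self._clusters.append(cps)
            self._ids[cps] = -len(self._clusters)
        return self._ids[cps]

    def decode(self, ids):
        """Translate grapheme integers back out to an NFC string."""
        out = []
        for i in ids:
            cps = self._clusters[-i - 1] if i < 0 else (i,)
            out.extend(chr(cp) for cp in cps)
        return unicodedata.normalize("NFC", "".join(out))


table = GraphemeTable()
ids = table.encode("e\u0301q\u0308!")
print(ids)       # [233, -1, 33]
print(len(ids))  # 3 graphemes, although the NFC string has 4 code points
```

Because the synthetic ids are handed out in the order clusters are first seen, they are only meaningful together with the table that produced them.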
Re: r25122 - docs/Perl6/Spec
On Fri, 2009-01-30 at 08:12 +0100, pugs-comm...@feather.perl6.nl wrote:
> @@ -103,7 +106,7 @@
> =item *
>
> POD sections may be used reliably as multiline comments in Perl 6.
> -Unlike in Perl 5, POD syntax now requires that C<=begin comment>
> +Unlike in Perl 5, POD syntax now lets you use C<=begin comment>
> and C<=end comment> delimit a POD block correctly without the need
> for C<=cut>. (In fact, C<=cut> is now gone.) The format name does
> not have to be C -- any unrecognized format name will do

I believe that with this change in wording the next line needs to use
'to delimit' rather than just 'delimit'.

-'f
Re: r25122 - docs/Perl6/Spec
On Fri, Jan 30, 2009 at 10:28:43AM -0800, Geoffrey Broadwell wrote:
: On Fri, 2009-01-30 at 08:12 +0100, pugs-comm...@feather.perl6.nl wrote:
: > @@ -103,7 +106,7 @@
: > =item *
: >
: > POD sections may be used reliably as multiline comments in Perl 6.
: > -Unlike in Perl 5, POD syntax now requires that C<=begin comment>
: > +Unlike in Perl 5, POD syntax now lets you use C<=begin comment>
: > and C<=end comment> delimit a POD block correctly without the need
: > for C<=cut>. (In fact, C<=cut> is now gone.) The format name does
: > not have to be C -- any unrecognized format name will do
:
: I believe that with this change in wording the next line needs to use
: 'to delimit' rather than just 'delimit'.

You've got a commit bit, I believe. :)

Larry
Re: r25122 - docs/Perl6/Spec
Larry Wall wrote:
> On Fri, Jan 30, 2009 at 03:30:02AM -0800, Darren Duncan wrote:
>> What's with this NFG / Normal Form G that you refer to? I don't see any
>> mention of that in http://unicode.org/reports/tr15/ ... did you mean NFC?
>
> Nope, this is a Perl/Parrot idea. It started out with a notion of
> mine a year ago. Search for 'grapheme' in
>
>     http://use.perl.org/~chromatic/journal/35461
>
> We named it NFG about the time Simon Cozens wrote a PDD for it for
> parrot. At the moment it's much better specced in Parrotland than
> in P6land. See
>
>     http://www.parrotcode.org/docs/pdd/pdd28_strings.html

Okay, I understand now. NFG is designed just as a temporary in-process
normal form where the same representation of a character as a number
can't reliably be consistent over the long term, unlike
NFC/D/KC/KD/etc.

It does occur to me, though, that as long as we include the generated
lookup table (not required for NFC/etc), NFG can be serialized as is and
be unambiguously understood by NFG-savvy programs over the long term.
Much like how LZW (name?) compression works, which includes its own
lookup table.

So as long as this nature of NFG is understood, and if necessary any
serialized forms will include a spec version num / etc as protection in
the face of upgrades, this could also stand to be a standard beyond
Perl/Parrot/etc.

I wonder if the Unicode consortium would be interested in adopting an
NFG-alike, or whether that would be beyond their scope?

> By the way, we could use someone to write the Perl 6 Unicode synopsis,
> based on PDD 28.

Well, if someone else doesn't do it first, I don't think it would be too
difficult for me to do this, at least the initial based-on-PDD-28 cut;
however it would likely be a few weeks before I get around to it, partly
since I don't have a Pugs repo checkout in place ... maybe when I port
the new Set::Relation to Perl 6, requiring such a checkout, I may do
that too ... but don't wait for me.

By the way, in the meantime, someone should update that reference to
NFG in S02 to include a link to that PDD28, so other people encountering
it don't have to ask the same question I did.

-- Darren Duncan
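
The suggestion of shipping the lookup table and a version number alongside the grapheme ids might look roughly like the Python sketch below. Everything in it is hypothetical: the JSON layout, the field names, the dump_nfg and load_nfg helpers, and the convention that negative ids denote multi-code-point graphemes are assumptions for illustration, not anything specified by PDD 28 or S02.

```python
import json


def dump_nfg(ids, clusters, version=1):
    """Serialize grapheme ids together with the synthetic entries they rely on."""
    used = sorted({i for i in ids if i < 0}, reverse=True)
    table = {str(i): list(clusters[i]) for i in used}
    return json.dumps({"version": version, "table": table, "ids": ids})


def load_nfg(blob):
    """Rebuild the text from a blob produced by dump_nfg."""
    doc = json.loads(blob)
    if doc["version"] != 1:
        raise ValueError("unsupported format version")
    table = {int(k): v for k, v in doc["table"].items()}
    chars = []
    for i in doc["ids"]:
        # Negative ids are looked up in the embedded table; others are code points.
        cps = table[i] if i < 0 else [i]
        chars.extend(chr(cp) for cp in cps)
    return "".join(chars)


# Hypothetical in-process state: id -1 stands for q + COMBINING DIAERESIS.
clusters = {-1: (0x71, 0x308)}
ids = [0xE9, -1, 0x21]

blob = dump_nfg(ids, clusters)
text = load_nfg(blob)
print(text == "\u00e9q\u0308!")  # True
```

A reader that has never seen the writer's in-process table can still decode the blob, which is the property the message is after; the version field is there so a later layout change can be detected rather than misread.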