Re: [svn:perl6-synopsis] r14432 - doc/trunk/design/syn
[EMAIL PROTECTED] writes:
> +Placeholder names may only be lowercase, not because we're mean, but
> +because it helps us catch references to obsolete Perl 5 variables such as $^O.

That seems unnecessarily restrictive. How about "may not consist solely of uppercase letters" instead? That would still permit things like $^Item, or caseless letters (Han characters, Japanese kana, Hangul, Devanagari, Thai, Hebrew, Arabic, etc.).

Maybe even "may not consist solely of uppercase Latin-script letters"; that would permit uppercase Greek and Cyrillic and so on.

-- 
Aaron Crane
[svn:perl6-synopsis] r14434 - doc/trunk/design/syn
Author: larry
Date: Sat Aug 4 09:06:15 2007
New Revision: 14434

Modified:
   doc/trunk/design/syn/S06.pod

Log:
relaxed restriction on placeholders as suggested by Aaron Crane++

Modified: doc/trunk/design/syn/S06.pod
==
--- doc/trunk/design/syn/S06.pod (original)
+++ doc/trunk/design/syn/S06.pod Sat Aug 4 09:06:15 2007
@@ -13,7 +13,7 @@

   Maintainer: Larry Wall <[EMAIL PROTECTED]>
   Date: 21 Mar 2003
-  Last Modified: 3 Aug 2007
+  Last Modified: 4 Aug 2007
   Number: 6
   Version: 89

@@ -1397,8 +1397,9 @@
 Note that placeholder variables syntactically cannot have type constraints.
 Also, it is illegal to use placeholder variables in a block that already
 has a signature, because the autogenerated signature would conflict with that.
-Placeholder names may only be lowercase, not because we're mean, but
-because it helps us catch references to obsolete Perl 5 variables such as $^O.
+Placeholder names consisting of a single uppercase letter are disallowed,
+not because we're mean, but because it helps us catch references to
+obsolete Perl 5 variables such as $^O.

 =head1 Properties and traits
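A minimal Python sketch of the relaxed rule as worded in this commit, for illustration only. The function name `placeholder_name_allowed` is hypothetical, and Python's `str.isupper()` is assumed to be an acceptable stand-in for "uppercase letter"; the spec text does not say how uppercase is to be determined.

```python
def placeholder_name_allowed(name):
    """Return False only for a name that is a single uppercase letter.

    `name` is the part after the `$^` twigil, e.g. "O" for `$^O`.
    Assumption: str.isupper() approximates "uppercase letter".
    """
    return not (len(name) == 1 and name.isupper())


# $^O looks like the obsolete Perl 5 variable, so it is rejected.
print(placeholder_name_allowed("O"))
# $^Item and lowercase or caseless names pass under the relaxed wording.
print(placeholder_name_allowed("Item"))
print(placeholder_name_allowed("x"))
```

Under the earlier "lowercase only" wording, `$^Item` would have been rejected as well, which is the case Aaron Crane's suggestion was meant to rescue.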
Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn
On Thu, Aug 02, 2007 at 04:19:18PM -0700, [EMAIL PROTECTED] wrote:
> Increment of a C<Str> (in a suitable container) works similarly to
> Perl 5, but is generalized slightly. First, the string is examined
> to see if it could be the string representation of a number in
> any common representation, including floating point and radix
> notation. (Surrounding whitespace is also allowed around such a
> number.) If it appears to be a number, it is converted to a number
> and incremented as a number.

Just for verification: an increment of "0xff" will therefore result in 256 and not "0xfg". Correct?

> final alphanumeric sequence in the string. Unlike in Perl 5, this
> alphanumeric sequence need not be anchored to the beginning of the
> string, nor does it need to begin with an alphabetic character; the
> final sequence in the string matching C<\w+> is incremented regardless
> of what comes before it.

...does the \w+ include non-ASCII alphanumerics and underscore? Or should the spec limit itself to [A-Za-z0-9]+ here? If we include non-ASCII alphanumerics, then incrementing something like "résumé" produces "résumf"?

Pm
Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn
On Sat, Aug 04, 2007 at 12:55:58PM -0500, Patrick R. Michaud wrote:
: On Thu, Aug 02, 2007 at 04:19:18PM -0700, [EMAIL PROTECTED] wrote:
: > Increment of a C<Str> (in a suitable container) works similarly to
: > Perl 5, but is generalized slightly. First, the string is examined
: > to see if it could be the string representation of a number in
: > any common representation, including floating point and radix
: > notation. (Surrounding whitespace is also allowed around such a
: > number.) If it appears to be a number, it is converted to a number
: > and incremented as a number.
:
: Just for verification: an increment of "0xff" will therefore
: result in 256 and not "0xfg". Correct?

Correct. Likewise ":16<ff>". I'm only wondering whether we should also include complex number representations here. :)

I suppose one could argue that "0xff" should increment to "0x100"...

: > final alphanumeric sequence in the string. Unlike in Perl 5, this
: > alphanumeric sequence need not be anchored to the beginning of the
: > string, nor does it need to begin with an alphabetic character; the
: > final sequence in the string matching C<\w+> is incremented regardless
: > of what comes before it.
:
: ...does the \w+ include non-ASCII alphanumerics and underscore?
: Or should the spec limit itself to [A-Za-z0-9]+ here? If we
: include non-ASCII alphanumerics, then incrementing something like
: "résumé" produces "résumf" ?

Hmm, good point. Could probably limit alphas to ASCII if we wanted to be culturally insensitive, though we could easily include all the contiguous Unicode digit ranges that go from 0 to 9. Which, oddly, doesn't include the numeric dingbats, which tend to start at 1, and if there's a corresponding 0, it's not the codepoint before the 1. I can see an argument for allowing such characters to increment though:

    for '❶' .. '❿' { .say }

But it's not clear what to do if you try to increment ❿. Probably just return a failure.

Or we could stick with \w+, which makes sense for various alphabets like Greek and Hebrew, and just let "résumé" turn into, not "résumf", but rather "résumê", since the decrement should be the reverse of the increment. Except it's not really right for Greek, since the basic letters run into other precomposed letters after omega. Basically we'd need to identify all wrappable alphabet ranges, which probably leaves out all accented characters, which means that "résumé" would turn into "résuné" presumably.

Which basically means we'd need to define our own character class for wrappable alphanumerics. Possibly we could define it algorithmically based on current Unicode data, but that would tend to include the entire CJK area as one alphabet, which is not going to make much sense to anyone, especially since most legacy Asian fonts don't provide all the characters.

For now I'm just going to hardcode the ranges in the spec, I think. We'll also maybe have to hardcode which ranges wrap and which ones don't, if we want to allow incrementing numeric dingbats. Which I think would be way cool, actually:

    for @points Z '⒈' .. * -> $p, $n { say "$n\t$p" }

or for roman numerals, since the numeral is distinguished as a separate character from the latin letter:

    for @points Z 'ⅰ' .. * -> $p, $n { say "$n.\t$p" }

A sufficiently motivated person could make roman numerals work right up to the limits of the notation, assuming we allow varying numbers of characters. But then it's not clear whether ⅹ should increment to ⅺ or to ⅹⅰ. (And yes, those are different.) Of course, we could also treat ⅲ as a precomposed ⅰⅰⅰ. Not sure which way that argues; it'd be kinda strange to use precomposed forms just for this one purpose. Also, there are only precomposed characters in the digits range; there's no precomposed ⅽⅹⅹⅹ form, for instance. They probably did precomposed up to twelve just for clocks, and maybe because ⅰⅰⅰ looks too spread out in a monospace font.

If we make roman numerals increment, then I think that also argues for making "0xff" stay a string too. Basically a "0x" on the front would pick 0..9a..f as the "alphabetic" range for the rest of it. Arguably this could all be handled by a function that takes a random string and converts it to a typed string with the appropriate .succ and .pred methods. Maybe an appropriate set of multis would be most extensible. Or a multi-token:

    multi token numrange:<0b> (--> StrBinary) { '0b' <[0..1]>+ }
    multi token numrange:<0o> (--> StrOctal)  { '0o' <[0..7]>+ }
    multi token numrange:<0d> (--> StrDec)    { '0d' <[0..9]>+ }
    multi token numrange:<0x> (--> StrHex)    { '0x' <[0..9a..fA..F]>+ }
    multi token numrange:<roman> (--> StrRoman) { <[ Ⅰ .. ↂ ]> }

etc. Maybe these are all just mixins of various Incremental roles.

That's probably more than enough speculation for now...

Larry
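The idea of hardcoding the ranges, along with a flag for which ones wrap, can be sketched roughly as follows. This is a Python illustration under stated assumptions, not anything from the spec: the table `RANGES` and the function `char_succ` are hypothetical names, the table holds only the ranges mentioned in this thread, and returning `None` stands in for "return a failure".

```python
# (first, last, wraps): an illustrative table, not the spec's actual list.
RANGES = [
    ("a", "z", True),
    ("A", "Z", True),
    ("0", "9", True),
    ("❶", "❿", False),  # numeric dingbats start at 1, so there is no zero to wrap to
]


def char_succ(c):
    """Return (successor, carry) for one character, or None on failure.

    None covers both cases left open in the thread: a character outside
    every known range, and the last character of a non-wrapping range.
    """
    for first, last, wraps in RANGES:
        if first <= c <= last:
            if c != last:
                return chr(ord(c) + 1), False
            if wraps:
                return first, True  # wrapped around; caller must carry
            return None
    return None


print(char_succ("z"))   # wraps to the start of its range and signals a carry
print(char_succ("❿"))   # end of a non-wrapping range
print(char_succ("é"))   # accented letters are outside the hardcoded ranges
```

With a table like this, "résumé" has no incrementable final character at all, which is a third possible answer besides "résumf" and "résumê"; skipping over the accented letter to reach "résuné" would need extra logic not shown here.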
Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn
On Sat, Aug 04, 2007 at 12:56:06PM -0700, Larry Wall wrote:
: multi token numrange:<0x> (--> StrHex){ '0x' <[0..9a..fA..F]>+ }

Though sanity would probably force us to use numerics internally anyway as the canonical comparison form, or we'd have trouble getting '0x00' .. '0x0Ff' to terminate. That probably goes double for roman numerals.

Larry
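The termination problem can be seen in a small Python sketch, assuming a hex string keeps its "0x" prefix and digit width when incremented. The names `hex_succ` and `hex_range` are hypothetical. The successor of a string is always produced in lowercase here, so a loop that stopped on string equality with '0x0Ff' would never see its endpoint; comparing numeric values avoids that.

```python
def hex_succ(s):
    """Increment a '0x...' string, keeping at least the original digit width."""
    digits = s[2:]
    n = int(digits, 16) + 1
    return "0x" + format(n, "x").zfill(len(digits))


def hex_range(first, last):
    """Yield hex strings from first through last, comparing by numeric value."""
    end = int(last, 16)  # int() with base 16 accepts the '0x' prefix
    cur = first
    while int(cur, 16) <= end:
        yield cur
        cur = hex_succ(cur)


values = list(hex_range("0x00", "0x0Ff"))
print(len(values), values[0], values[-1])
```

The endpoint string '0x0Ff' itself never appears in the output, since the generated values are two-digit lowercase strings; only its numeric value is used as the stopping condition.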
Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn
On Sat, Aug 04, 2007 at 12:56:06PM -0700, Larry Wall wrote:
> for '❶' .. '❿' { .say }
>
> But it's not clear what to do if you try to increment ❿ though.
> Probably just return a failure.

Assuming that '❶' .. '❿' is a range similar to '0'..'9', then consistency with the other ranges would seem to indicate that incrementing 'a❿' produces 'b❶', and incrementing '❿' on its own would produce '❶❶'. (Unless, of course, '❿' is treated as a "number in any common representation", in which case incrementing it produces 11.)

I'm not saying that anything involving the dingbats makes good sense -- just that this is what I would tend to expect to happen based on how the other ranges autoincrement. Feel free to insert pithy quotes about consistency and hobgoblins here. :-)

And so we don't get bogged down in (relatively unimportant) details, I'll refrain from shouting "look at the ugly corner cases!" for now and leave it to others to decide how/when to push this. The changes to S03 and clarifications give me enough to proceed for now -- namely:

Strings that look like numbers or that don't end in \w+ are numified and then incremented, whereas strings ending with \w+ are incremented according to individual character ranges. The exact set of ranges is still under discussion, but the ranges A-Z, a-z, and 0-9 have the "expected" semantics.

Others can continue on the discussion if wanted, but as an implementor I'm happy with this outcome for now. :-)

Thanks!

Pm
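The working rule summarized above can be sketched in Python roughly as follows. This is an illustration under several assumptions, not the S03 algorithm: `increment`, `numify` and `_succ_alnum` are hypothetical names, "looks like a number" is approximated by a small regex covering integers, floats and 0x/0o/0b radix notation, the final sequence is restricted to ASCII [A-Za-z0-9] (the thread leaves underscore and non-ASCII characters undecided), and a non-numeric string is assumed to numify to 0.

```python
import re

# Assumed approximation of "a number in any common representation".
_NUMERIC = re.compile(r"""
    \s*[+-]?(?:
        0x[0-9a-fA-F]+ | 0o[0-7]+ | 0b[01]+       # radix notation
      | (?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?      # integer or floating point
    )\s*\Z""", re.VERBOSE)

# Final alphanumeric sequence, ASCII only (an assumption, see lead-in).
_TAIL = re.compile(r"[A-Za-z0-9]+\Z")


def numify(s):
    """Convert a numeric-looking string to a number; anything else gives 0."""
    if not _NUMERIC.match(s):
        return 0
    t = s.strip()
    for convert in (lambda x: int(x, 0), int, float):
        try:
            return convert(t)
        except ValueError:
            continue
    return 0


def _succ_alnum(seq):
    """Increment an ASCII alphanumeric run with carry, Perl 5 style."""
    chars = list(seq)
    wrap = {"z": "a", "Z": "A", "9": "0"}
    for i in range(len(chars) - 1, -1, -1):
        c = chars[i]
        if c in wrap:
            chars[i] = wrap[c]  # wrap and keep carrying leftward
        else:
            chars[i] = chr(ord(c) + 1)
            return "".join(chars)
    # Every position wrapped: extend on the left, based on the old first char.
    lead = {"z": "a", "Z": "A", "9": "1"}[seq[0]]
    return lead + "".join(chars)


def increment(s):
    if _NUMERIC.match(s):
        return numify(s) + 1
    m = _TAIL.search(s)
    if m:
        return s[:m.start()] + _succ_alnum(m.group())
    return numify(s) + 1  # does not end in an alphanumeric run


for s in ["0xff", " 41 ", "az", "zz", "a9", "file-Zz", "???"]:
    print(repr(s), "->", repr(increment(s)))
```

Note that this sketch follows the "0xff increments to 256" reading confirmed earlier in the thread, not the alternative of keeping "0xff" as a typed string that increments to "0x100".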