Re: [svn:perl6-synopsis] r14432 - doc/trunk/design/syn

2007-08-04 Thread Aaron Crane
[EMAIL PROTECTED] writes:
> +Placeholder names may only be lowercase, not because we're mean, but
> +because it helps us catch references to obsolete Perl 5 variables such as $^O.

That seems unnecessarily restrictive.  How about "may not consist
solely of uppercase letters" instead?  That would still permit things
like $^Item, or caseless letters (Han characters, Japanese kana,
Hangul, Devanagari, Thai, Hebrew, Arabic, etc).

Maybe even "may not consist solely of uppercase Latin-script letters";
that would permit uppercase Greek and Cyrillic and so on.
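
For comparison, here is a rough Python model of the two candidate rules
(not Perl 6, and not anything from the spec). The function names are
made up for illustration, and checking for a "LATIN" prefix in the
Unicode character name is only a stand-in for a real script-property
lookup, which Python's standard library does not provide.

```python
import unicodedata


def all_uppercase(name: str) -> bool:
    """True if every character of the name is an uppercase letter."""
    return bool(name) and all(ch.isupper() for ch in name)


def all_uppercase_latin(name: str) -> bool:
    """True if every character is an uppercase Latin-script letter.

    Assumption: a character counts as Latin when its Unicode name
    starts with "LATIN". This is an approximation for illustration.
    """
    return all_uppercase(name) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in name
    )


def allowed_broad(name: str) -> bool:
    """Rule 1: may not consist solely of uppercase letters."""
    return not all_uppercase(name)


def allowed_narrow(name: str) -> bool:
    """Rule 2: may not consist solely of uppercase Latin-script letters."""
    return not all_uppercase_latin(name)
```

Under both rules "O" is rejected while "Item" and caseless names are
accepted; only the narrower rule also accepts an uppercase Greek name.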

-- 
Aaron Crane


[svn:perl6-synopsis] r14434 - doc/trunk/design/syn

2007-08-04 Thread larry
Author: larry
Date: Sat Aug  4 09:06:15 2007
New Revision: 14434

Modified:
   doc/trunk/design/syn/S06.pod

Log:
relaxed restriction on placeholders as suggested by Aaron Crane++


Modified: doc/trunk/design/syn/S06.pod
===================================================================
--- doc/trunk/design/syn/S06.pod (original)
+++ doc/trunk/design/syn/S06.pod Sat Aug  4 09:06:15 2007
@@ -13,7 +13,7 @@
 
   Maintainer: Larry Wall <[EMAIL PROTECTED]>
   Date: 21 Mar 2003
-  Last Modified: 3 Aug 2007
+  Last Modified: 4 Aug 2007
   Number: 6
   Version: 89
 
@@ -1397,8 +1397,9 @@
 Note that placeholder variables syntactically cannot have type constraints.
 Also, it is illegal to use placeholder variables in a block that already
 has a signature, because the autogenerated signature would conflict with that.
-Placeholder names may only be lowercase, not because we're mean, but
-because it helps us catch references to obsolete Perl 5 variables such as $^O.
+Placeholder names consisting of a single uppercase letter are disallowed,
+not because we're mean, but because it helps us catch references to
+obsolete Perl 5 variables such as $^O.
 
 =head1 Properties and traits
 


Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn

2007-08-04 Thread Patrick R. Michaud
On Thu, Aug 02, 2007 at 04:19:18PM -0700, [EMAIL PROTECTED] wrote:
>  Increment of a C<Str> (in a suitable container) works similarly to
>  Perl 5, but is generalized slightly.  First, the string is examined
>  to see if it could be the string representation of a number in
>  any common representation, including floating point and radix
>  notation. (Surrounding whitespace is also allowed around such a
>  number.)  If it appears to be a number, it is converted to a number
>  and incremented as a number.  

Just for verification:  an increment of "0xff" will therefore
result in 256 and not "0xfg".  Correct?

>  final alphanumeric sequence in the string.  Unlike in Perl 5, this
>  alphanumeric sequence need not be anchored to the beginning of the
>  string, nor does it need to begin with an alphabetic character; the
>  final sequence in the string matching C<\w+> is incremented regardless
>  of what comes before it.  

...does the \w+ include non-ASCII alphanumerics and underscore?  
Or should the spec limit itself to [A-Za-z0-9]+ here?  If we
include non-ASCII alphanumerics, then incrementing something like
"résumé" produces "résumf" ?

Pm


Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn

2007-08-04 Thread Larry Wall
On Sat, Aug 04, 2007 at 12:55:58PM -0500, Patrick R. Michaud wrote:
: On Thu, Aug 02, 2007 at 04:19:18PM -0700, [EMAIL PROTECTED] wrote:
: >  Increment of a C<Str> (in a suitable container) works similarly to
: >  Perl 5, but is generalized slightly.  First, the string is examined
: >  to see if it could be the string representation of a number in
: >  any common representation, including floating point and radix
: >  notation. (Surrounding whitespace is also allowed around such a
: >  number.)  If it appears to be a number, it is converted to a number
: >  and incremented as a number.  
: 
: Just for verification:  an increment of "0xff" will therefore
: result in 256 and not "0xfg".  Correct?

Correct.  Likewise ":16<ff>".  I'm only wondering whether we should
also include complex number representations here.  :)

I suppose one could argue that "0xff" should increment to "0x100"...

: >  final alphanumeric sequence in the string.  Unlike in Perl 5, this
: >  alphanumeric sequence need not be anchored to the beginning of the
: >  string, nor does it need to begin with an alphabetic character; the
: >  final sequence in the string matching C<\w+> is incremented regardless
: >  of what comes before it.  
: 
: ...does the \w+ include non-ASCII alphanumerics and underscore?  
: Or should the spec limit itself to [A-Za-z0-9]+ here?  If we
: include non-ASCII alphanumerics, then incrementing something like
: "résumé" produces "résumf" ?

Hmm, good point.  Could probably limit alphas to ASCII if we wanted
to be culturally insensitive, though we could easily include all the
contiguous Unicode digit ranges that go from 0 to 9.  Which, oddly,
doesn't include the numeric dingbats, which tend to start at 1, and if
there's a corresponding 0, it's not the codepoint before the 1.  I can
see an argument for allowing such characters to increment though:

for '❶' .. '❿' { .say }

But it's not clear what to do if you try to increment ❿.
Probably just return a failure.
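
A small Python sketch of that non-wrapping behaviour, assuming the ten
negative circled digits U+2776 through U+277F form one contiguous range.
The name dingbat_succ is hypothetical, and returning None merely stands
in for "return a failure".

```python
# The range '❶' .. '❿' is ten contiguous code points, U+2776 to U+277F.
DINGBATS = [chr(cp) for cp in range(0x2776, 0x2780)]


def dingbat_succ(ch: str):
    """Return the next dingbat, or None (a stand-in for failure) after the last."""
    i = DINGBATS.index(ch)
    if i + 1 == len(DINGBATS):
        return None  # no wraparound: there is nothing after the last one
    return DINGBATS[i + 1]


def dingbat_range(first: str, last: str):
    """Yield every dingbat from first to last inclusive."""
    lo, hi = DINGBATS.index(first), DINGBATS.index(last)
    yield from DINGBATS[lo:hi + 1]
```

Iterating the full range yields ten characters, while asking for the
successor of the last one gives the failure value instead of wrapping.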

Or we could stick with \w+, which makes sense for various alphabets
like Greek and Hebrew, and just let "résumé" turn into, not "résumf",
but rather "résumê", since the decrement should be the reverse of
the increment.

Except it's not really right for Greek, since the basic letters run
into other precomposed letters after omega.  Basically we'd need to
identify all wrappable alphabet ranges, which probably leaves out
all accented characters, which means that "résumé" would turn into
"résuné" presumably.  Which basically means we'd need to define our
own character class for wrappable alphanumerics.  Possibly we could
define it algorithmically based on current Unicode data, but that
would tend to include the entire CJK area as one alphabet, which is
not going to make much sense to anyone, especially since most legacy
Asian fonts don't provide all the characters.  For now I'm just going
to hardcode the ranges in the spec, I think.  We'll also maybe have
to hardcode which ranges wrap and which ones don't, if we want to
allow incrementing numeric dingbats.

Which I think would be way cool, actually:

for @points Z '⒈'  .. * -> $p, $n { say "$n\t$p" }

or for roman numerals, since the numeral is distinguished as a separate
character from the latin letter:

for @points Z 'ⅰ'  .. * -> $p, $n { say "$n.\t$p" }

A sufficiently motivated person could make roman numerals work right
up to the limits of the notation, assuming we allow varying numbers
of characters.  But then it's not clear whether ⅹ should increment
to ⅺ or to ⅹⅰ.  (And yes, those are different.)  Of course, we
could also treat ⅲ as a precomposed ⅰⅰⅰ.  Not sure which way
that argues; it'd be kinda strange to use precomposed forms just for
this one purpose.  Also, there are only precomposed characters in the
digits range; there's no precomposed ⅽⅹⅹⅹ form, for instance.
They probably did precomposed up to twelve just for clocks, and
maybe because ⅰⅰⅰ looks too spread out in a monospace font.

If we make roman numerals increment, then I think that also argues
for making "0xff" stay a string too.  Basically a "0x" on the front
would pick 0..9a..f as the "alphabetic" range for the rest of it.

Arguably this could all be handled by a function that takes a random
string and converts it to a typed string with the appropriate .succ
and .pred methods.  Maybe an appropriate set of multis would be most
extensible.  Or a multi-token:

multi token numrange:<0b>  (--> StrBinary) { '0b' <[0..1]>+ }
multi token numrange:<0o>  (--> StrOctal)  { '0o' <[0..7]>+ }
multi token numrange:<0d>  (--> StrDec)    { '0d' <[0..9]>+ }
multi token numrange:<0x>  (--> StrHex)    { '0x' <[0..9a..fA..F]>+ }
multi token numrange:roman (--> StrRoman)  { <[ Ⅰ .. ↂ  ]> }
etc.

Maybe these are all just mixins of various Incremental roles.

That's probably more than enough speculation for now...

Larry


Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn

2007-08-04 Thread Larry Wall
On Sat, Aug 04, 2007 at 12:56:06PM -0700, Larry Wall wrote:
: multi token numrange:<0x>  (--> StrHex)    { '0x' <[0..9a..fA..F]>+ }

Though sanity would probably force us to use numerics internally anyway
as the canonical comparison form, or we'd have trouble getting

'0x00' .. '0x0Ff'

to terminate.  That probably goes double for roman numerals.
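
To illustrate the termination problem, here is a Python sketch (not
Perl 6) of a hex string that keeps its "0x" prefix and digit width when
incremented, with the range endpoint compared numerically. The names
hex_succ and hex_range are hypothetical, and emitting lowercase digits
is an assumption made for the example.

```python
def hex_succ(s: str) -> str:
    """Increment a '0x...' string, keeping the prefix and at least the same width."""
    digits = s[2:]
    n = int(digits, 16) + 1
    # Assumption: lowercase output, zero-padded to the original width.
    return "0x" + format(n, "x").zfill(len(digits))


def hex_range(first: str, last: str):
    """Yield hex strings from first to last, comparing by numeric value.

    Comparing the strings themselves would never terminate here, because
    no successor of '0x00' is ever spelled '0x0Ff'.
    """
    end = int(last, 16)
    cur = first
    while int(cur, 16) <= end:
        yield cur
        cur = hex_succ(cur)
```

With numeric comparison, hex_range('0x00', '0x0Ff') stops after 256
values even though the final string produced is '0xff' and not '0x0Ff'.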

Larry


Re: [svn:perl6-synopsis] r14431 - doc/trunk/design/syn

2007-08-04 Thread Patrick R. Michaud
On Sat, Aug 04, 2007 at 12:56:06PM -0700, Larry Wall wrote:
> for '❶' .. '❿' { .say }
> 
> But it's not clear what to do if you try to increment ❿.
> Probably just return a failure.

Assuming that '❶' .. '❿' is a range similar to '0'..'9', then
consistency with the other ranges would seem to indicate that
incrementing 'a❿'  produces 'b❶', and incrementing '❿' on its
own would produce '❶❶'.  (Unless, of course, '❿' is treated as
a "number in any common representation", in which case incrementing
it produces 11.)

I'm not saying that anything involving the dingbats makes good
sense -- just that this is what I would tend to expect to happen
based on how the other ranges autoincrement.  Feel free to insert
pithy quotes about consistency and hobgoblins here.  :-)

And so we don't get bogged down in (relatively unimportant) details, 
I'll refrain from shouting "look at the ugly corner cases!" for now 
and leave it to others to decide how/when to push this.  The changes to 
S03 and clarifications give me enough to proceed for now -- namely:

Strings that look like numbers or that don't end in \w+ are numified 
and then incremented, whereas strings ending with \w+ are incremented 
according to individual character ranges.  The exact set of ranges is 
still under discussion, but the ranges A-Z, a-z, and 0-9 have the 
"expected" semantics.  
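
As a sanity check, that summary can be modelled in a few lines of Python
(not Perl 6). This is only a sketch under stated assumptions: the names
numify, succ_run and increment are made up, only the ASCII ranges a-z,
A-Z and 0-9 are handled, only the 0x/0o/0b radix prefixes are recognized,
and a string with no alphanumerics is assumed to numify to 0.

```python
import re
import string

# The three ranges whose semantics are not in dispute.
_RANGES = (string.ascii_lowercase, string.ascii_uppercase, string.digits)

_NUMERIC = re.compile(
    r"[+-]?(?:0x[0-9a-fA-F]+|0o[0-7]+|0b[01]+|\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
)


def numify(s):
    """Return the number the string looks like, or None if it is not numeric."""
    t = s.strip()  # surrounding whitespace is allowed
    if not _NUMERIC.fullmatch(t):
        return None
    if t.lstrip("+-")[:2] in ("0x", "0o", "0b"):
        return int(t, 0)  # radix notation
    try:
        return int(t)
    except ValueError:
        return float(t)


def succ_run(run):
    """Increment an ASCII alphanumeric run with carry, each character in its own range."""
    chars = list(run)
    for i in range(len(chars) - 1, -1, -1):
        rng = next(r for r in _RANGES if chars[i] in r)
        pos = rng.index(chars[i])
        if pos + 1 < len(rng):
            chars[i] = rng[pos + 1]
            return "".join(chars)
        chars[i] = rng[0]  # wrap this position and carry to the left
    # Carried out of the leftmost position: grow the run by one character.
    prefix = "1" if chars[0] == "0" else chars[0]
    return prefix + "".join(chars)


def increment(s):
    """Numeric-looking strings increment as numbers; others by final alphanumeric run."""
    n = numify(s)
    if n is not None:
        return n + 1
    runs = list(re.finditer(r"[A-Za-z0-9]+", s))
    if not runs:
        return 1  # assumption: such a string numifies to 0
    m = runs[-1]
    return s[:m.start()] + succ_run(m.group()) + s[m.end():]
```

Under these assumptions increment("0xff") gives 256, increment("Zz")
gives "AAa", and increment("résumé") gives "résuné", because the accented
letters fall outside the ASCII ranges and are left alone.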

Others can continue on the discussion if wanted, but as an implementor
I'm happy with this outcome for now.  :-)

Thanks!

Pm