Spec

pugs-commits Mon, 21 Sep 2009 13:57:33 -0700

Author: lwall
Date: 2009-09-21 22:57:15 +0200 (Mon, 21 Sep 2009)
New Revision: 28344


Modified:
   docs/Perl6/Spec/S03-operators.pod
   docs/Perl6/Spec/S09-data.pod
Log:
[S03,S09]
    Range objects are now primarily intervals in C<cmp>
    Extend dwimminess of series to handle steps and limits readably
    :by is deemed Too Ugly and is now dead, David Green++
    Use series operator to replace :by semantics more readably
    Range objects used as lists now simply mutate .. to ...
        (taking into account ^ though)
    Alpha ranges must now match endpoint using !after semantics on non-eqv
    Simplify range semantics when used as subscripts
    Kill unshifty negative subscript lvalues as too error prone
    Spec way to declare modular subscripts
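
Illustration only, not part of the commit: a minimal Python sketch of the series limit rule that the S03 hunk spells out. A series stops once a generated value would be after the limit, and the limit itself appears only on an exact match. The names series and step are hypothetical, Fraction stands in for Perl 6 Rat, and the sketch assumes an increasing series, so it does not cover countdowns such as 100 ... *-1, 0.

```python
from fractions import Fraction
from itertools import islice


def series(start, step, limit=None):
    """Yield start, step(start), ... under the "!after" limit rule.

    A value is emitted only if it is not after (greater than) the limit,
    and generation stops once the limit has been matched exactly.
    With limit=None the series is infinite. Increasing series only.
    """
    value = start
    while limit is None or value <= limit:
        yield value
        if limit is not None and value == limit:
            return
        value = step(value)


# 1 ... *+1, 9  : exact match, so 9 is included
ones = list(series(1, lambda n: n + 1, 9))        # [1, 2, ..., 9]

# 1 ... *+2, 8  : 8 is never hit exactly, so the series ends at 7
odds = list(series(1, lambda n: n + 2, 8))        # [1, 3, 5, 7]

# 1.0 ... *+0.2, 2.0 in exact rational arithmetic ends exactly on 2
fifth = Fraction(1, 5)
rats = list(series(Fraction(1), lambda n: n + fifth, Fraction(2)))

# 0,2,4 ... *+2 with no limit is infinite; take the first five lazily
evens = list(islice(series(0, lambda n: n + 2), 5))  # [0, 2, 4, 6, 8]
```

Exact rationals keep every step exact, which is why the log's Rat example can terminate on its limit.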


Modified: docs/Perl6/Spec/S03-operators.pod
===================================================================
--- docs/Perl6/Spec/S03-operators.pod   2009-09-21 20:44:52 UTC (rev 28343)
+++ docs/Perl6/Spec/S03-operators.pod   2009-09-21 20:57:15 UTC (rev 28344)
@@ -14,8 +14,8 @@
 
     Created: 8 Mar 2004
 
-    Last Modified: 2 Sep 2009
-    Version: 172
+    Last Modified: 21 Sep 2009
+    Version: 173
 
 =head1 Overview
 
@@ -270,7 +270,7 @@
 
 Pair composers
 
-    :by(2)
+    :limit(5)
     :!verbose
 
 =item *
@@ -1413,7 +1413,7 @@
 
 Adverbs will generally attach the way you want when you say things like
 
-    1 .. $x+2 :by(2)
+    1 op $x+2 :mod($x)
 
 The proposed internal testing syntax makes use of these precedence rules:
 
@@ -1751,10 +1751,11 @@
 More typically the function is unary, in which case any extra values
 in the list may be construed as human-readable documentation:
 
-    0,2,4 ... { $_ + 2 }    # same as 1..*:by(2)
+    0,2,4 ... { $_ + 2 }    # all the evens
+    0,2,4 ... *+2           # same thing
     <a b c> ... { .succ }   # same as 'a'..*
 
-The function need not be monotonic, of course:
+The function need not be monotonically increasing, of course:
 
     1 ... { -$_ }          # 1, -1, 1, -1, 1, -1...
     False ... &prefix:<!>  # False, True, False...
@@ -1763,7 +1764,7 @@
 
     () ... { rand }   # list of random numbers
 
-The function may also be slurpy (*-ary), in which case all the
+The function may also be slurpy (n-ary), in which case all the
 preceding values are passed in (which means they must all be cached
 by the operator, so performance may suffer).
 
@@ -1773,26 +1774,24 @@
     1,1 ... { $^a + 1, $^b * 2 }   # 1,1,2,2,3,4,4,8,5,16,6,32...
 
 If the right operand is C<*> (Whatever) and the sequence is obviously
-arithmetic or geometric, the appropriate function is deduced:
+arithmetic or geometric (from examining its I<last> 3 values), the appropriate function is deduced:
 
     1, 3, 5 ... *   # odd numbers
     1, 2, 4 ... *   # powers of 2
 
-Conjecture: other such patterns may be recognized in the future,
-depending on which unrealistic benchmarks we want to run faster.  C<:)>
+If there are only two values so far, C<*> assumes an arithmetic
+progression.  If there is only one value (or if the final values do
+not support the requisite arithmetic), C<*> assumes incrementation
+via C<.succ>.  Hence these come out the same:
 
-Note: the yada operator is recognized only where a term is expected.
-This operator may only be used where an infix is expected.  If you
-put a comma before the C<...> it will be taken as a yada list operator
-expressing the desire to fail when the list reaches that point:
+    1..*
+    1...*
+    1,2,3...*
 
-    1..20, ... "I only know up to 20 so far mister"
+If the list on the left is C<Nil>, C<*> will return a single C<Nil>.
 
-If the yada operator finds a closure for its argument at compile time,
-it should probably whine about the fact that it's difficult to turn
-a closure into an error message.  Alternately, we could treat
-an ellipsis as special when it follows a comma to better support
-traditional math notation.
+Conjecture: other such patterns may be recognized in the future,
+depending on which unrealistic benchmarks we want to run faster.  C<:)>
 
 The function may choose to terminate its list by returning ().
 Since this operator is list associative, an inner function may be
@@ -1809,10 +1808,83 @@
     10,20,30,40,50,60,70,80,90,
     100,200,300,400,500,600,700,800,900
 
+If the right operand is a list and the first element of the list is
+a function or C<*>, the second element of the list imposes a limit
+on the prior sequence.  (The limit is inclusive on an exact match,
+and in general is compared using C<!after> semantics, so an inexact
+match is *not* included.)  Hence the preceding example may be rewritten
+
+    1   ... * + 1, 9
+    10  ... * + 10, 90
+    100 ... * + 100, 1000
+
+or as
+
+    1, 2, 3 ... *,
+    10, 20, 30 ... *,
+    100, 200, 300 ... *, 1000
+
+In the latter case the preceding 3 elements are used to deduce
+the correct arithmetic progression, so the 3, 30, and 300
+terms are necessary.
+
+If the first element of the list is numeric, a C<*> is assumed
+before it, and the first element is again taken as the limit.
+So the preceding example reduces to:
+
+    1, 2, 3 ...
+    10, 20, 30 ...
+    100, 200, 300 ... 1000
+
+These rules may seem complicated, but they're essentially just replicating
+what a human does naturally when you say "and so on".
+
+Note that the sequence
+
+    1.0 ... *+0.2, 2.0
+
+is calculated in C<Rat> arithmetic, not C<Num>, so the C<2.0> matches
+exactly and terminates the sequence.
+
+Note: the yada operator is recognized only where a term is expected.
+This operator may only be used where an infix is expected.  If you
+put a comma before the C<...> it will be taken as a yada list operator
+expressing the desire to fail when the list reaches that point:
+
+    1..20, ... "I only know up to 20 so far mister"
+
+If the yada operator finds a closure for its argument at compile time,
+it should probably whine about the fact that it's difficult to turn
+a closure into an error message.  Alternately, we could treat
+an ellipsis as special when it follows a comma to better support
+traditional math notation.
+
 In slice context the function's return value is appended as a capture
 rather than as a flattened list of values, and the argument to each
 function call is the previous capture in the list.
 
+If a series is generated using a non-monotonic C<.succ> function, it is
+possible for it never to reach the endpoint.  The following matches:
+
+    'A' ... 'Z'
+
+but since 'Z' increments to 'AA', none of these ever terminate:
+
+    'A' ... 'z'
+    'A' ... '_'
+    'A' ... '~'
+
+The compiler is allowed to complain if it notices these, since if you
+really want the infinite list you can always write:
+
+    'A' ... *
+
+To preserve Perl 5 semantics, you'd need something like:
+
+    'A' ... { my $new = $_.succ; $_ ne $endpoint and $new.chars <= 1 ?? $new !! () }
+
+But since lists are lazy in Perl 6, we don't try to protect the user this way.
+
 =back
 
 Many of these operators return a list of C<Capture>s, which depending on
@@ -2937,32 +3009,40 @@
 
 The C<..> range operator has variants with C<^> on either end to
 indicate exclusion of that endpoint from the range.  It always
-produces a C<Range> object.  Range objects are immutable (but can
-spawn mutable C<RangeIterator> objects, and a C<RangeIterator> can
-be interrogated for its current C<.from> and C<.to> values,
-which change as they are iterated).  The C<.minmax> method returns
-both as a two-element list representing the interval.  Ranges are not
-autoreversing: C<2..1> is always a null range.  Likewise, C<1^..^2>
-produces no values when iterated, but does represent the interval from
-1 to 2 excluding the endpoints when used as a pattern.  To specify
-a range in reverse use:
+produces a C<Range> object.  Range objects are immutable, and primarily
+used for matching intervals.  C<1..2> is the interval from 1 to 2
+inclusive of the endpoints, whereas 1^..^2 excludes the endpoints
+but matches any real number in between.
 
-    2..1:by(-1)
+Range objects support C<.min> and C<.max> methods representing
+their left and right arguments.  The C<.minmax> method returns both
+values as a two-element list representing the interval.  Ranges are
+not autoreversing: C<2..1> is always a null range.
+
+If used in a list context, a C<Range> object returns an iterator that
+produces a series of values starting at the min and ending at the max.
+Either endpoint may be excluded using C<^>.  Hence C<1..2> produces
+C<(1,2)> but C<1^..^2> is equivalent to C<2..1> and produces no values (Nil).
+To specify a series that counts down, use a reverse:
+
     reverse 1..2
+    reverse 'a'..'z'
 
-(The C<reverse> is preferred because it works for alphabetic ranges
-as well.)  Note that, while C<.minmax> normally returns C<(.from,.to)>,
-a negative C<:by> causes the C<.minmax> method returns C<(.to,.from)>
-instead.  You may also use C<.min> and C<.max> to produce the individual
-values of the C<.minmax> pair, but again note that they are reversed
-from C<.from> and C<.to> when the step is negative.  Since a reversed
-C<Range> changes its direction, it swaps its C<.from> and C<.to> but
-not its C<.min> and C<.max>.
+Alternately, for numeric sequences, you can use the series operator instead
+of the range operator:
 
-Because C<Range> objects are lazy, they do not automatically generate
-a list.  They only do so when iterated.
-One result of this is that a reversed C<Range> object is still lazy.
-Another is that smart matching against a C<Range> object smartmatches the
+    100,99,98 ... 0
+    100 ... *-1, 0      # same thing
+
+In other words, any C<Range> used as a list assumes C<.succ> semantics,
+never C<.pred> semantics.  No other increment is allowed; if you wish
+to increment a numeric sequence by some number other than 1, you must
+use the C<...> series operator.  (The C<Range> operator's C<:by> adverb
+is hereby deprecated.)
+
+    0 ... *+0.1, 100    # 0, 0.1, 0.2, 0.3 ... 100
+
+Smart matching against a C<Range> object smartmatches the
 endpoints in the domain of the object being matched, so fractional
 numbers are C<not> truncated before comparison to integer ranges:
 
@@ -2974,43 +3054,21 @@
 typespace the range is operating, as inferred from the left operand.
 A C<*> on the left means "negative infinity" for types that support
 negative values, and the first value in the typespace otherwise as
-inferred from the right operand.  (For signed infinities the signs
-reverse for a negative step.)  A star on both sides prevents any type
-from being inferred other than the C<Ordered> role.
+inferred from the right operand.  (A star on both sides is not allowed.)
 
     0..*        # 0 .. +Inf
-    'a'..*      # 'a' .. 'zzzzzzzzzzzzzzzzzzzzzzzzzzzzz...'
+    'a'..*      # 'a' le $_
     *..0        # -Inf .. 0
-    *..*        # "-Inf .. +Inf", really Ordered
+    *..*        # Illegal
     1.2.3..*    # Any version higher than 1.2.3.
     May..*      # May through December
 
-Note: infinite lists are constructed lazily.  And even though C<*..*>
-can't be constructed at all, it's still useful as a selector object.
-
-For any kind of zip or dwimmy hyper operator, any list ending with C<*>
-is assumed to be infinitely extensible by taking its final element
-and replicating it:
-
-    @array, *
-
-is short for something like:
-
-    @array[0...@array], @array[*-1] xx *
-
 An empty range cannot be iterated; it returns a C<Nil> instead.  An empty
 range still has a defined min and max, but the min is greater than the max.
 
-If a range is generated using a magical autoincrement, it stops if the magical
-increment would "carry" and make the next value longer (in graphemes) than the "to" value, on the
-assumption that the sequence can never match the final value exactly.  Hence,
-all of these produce 'A' .. 'Z':
+Ranges that are iterated transmute into the corresponding series operator,
+and hence use C<!after> semantics to determine an end to the sequence.
 
-    'A' .. 'Z'
-    'A' .. 'z'
-    'A' .. '_'
-    'A' .. '~'
-
 =item *
 
 The unary C<^> operator generates a range from C<0> up to
@@ -3789,6 +3847,16 @@
 will apply the hyper operator to just the values but return a new
 hash value with the same set of keys as the original hash.
 
+For any kind of zip or dwimmy hyper operator, any list ending with C<*>
+is assumed to be infinitely extensible by taking its final element
+and replicating it:
+
+    @array, *
+
+is short for something like:
+
+    @array[0...@array], @array[*-1] xx *
+
 =head2 Reduction operators
 
 Any infix operator (except for non-associating operators)

Modified: docs/Perl6/Spec/S09-data.pod
===================================================================
--- docs/Perl6/Spec/S09-data.pod        2009-09-21 20:44:52 UTC (rev 28343)
+++ docs/Perl6/Spec/S09-data.pod        2009-09-21 20:57:15 UTC (rev 28344)
@@ -13,8 +13,8 @@
 
     Created: 13 Sep 2004
 
-    Last Modified: 17 Jun 2009
-    Version: 34
+    Last Modified: 21 Sep 2009
+    Version: 35
 
 =head1 Overview
 
@@ -203,11 +203,53 @@
 
     @dwarves[7] = 'Sneaky';   # Fails with "invalid index" exception
 
+However, it is legal for a C<Range> object to extend beyond the end
+of an array as long as its min value is a valid subscript; the range
+is truncated as necessary to map only valid locations.
+
 It's also possible to explicitly specify a normal autoextending array:
 
     my @vices[*];             # Length is: "whatever"
                               # Valid indices are 0..*
 
+For subscripts containing ranges extending beyond the end of
+autoextending arrays, the range is truncated to the actual current
+size of the array rather than the declared size of that dimension.
+It is allowed for such a range to start one after the end, so that
+
+    @array[0..*]
+
+merely returns Nil if C<@array> happens to be empty.  However,
+
+    @array[1..*]
+
+would fail because the range's min is too big.
+
+Going the other way, it is allowed for a range to start with a negative
+number as long as the endpoint is at least -1; in this case the
+front of the range is truncated.
+
+Note that these rules mean it doesn't matter whether you say
+
+    @array[*]
+    @array[0 .. *]
+    @array[0 .. *-1]
+    @array[-Inf .. *-1 ]
+
+because they all end up meaning the same thing.
+
+As a special form, numeric subscripts may be declared as cyclical
+using an initial C<%>:
+
+    my @seasons[%4];
+
+In this case, all numeric values are taken modulo 4, and no range truncation can
+ever happen.  If you say
+
+    @seasons[-4..7] = 'a' .. 'l';
+
+then each element is written three times and the array ends up with C<['i','j','k','l']>.
+
 =head1 Typed arrays
 
 The type of value stored in each element of the array (normally C<Object>)
@@ -492,7 +534,6 @@
 but not:
 
     my @virtue{*..6};
-    my @koalas{*..*};
     my @celebs{*};
 
 These last three are not allowed because there is no first index, and
@@ -633,48 +674,15 @@
     @array[*+1]       # Second element after the end of the array
 
     @array[*-3..*-1]  # Slice from third-last element to last element
+    @array[*-3..*]    # (Same thing via range truncation)
 
 (Note that, if a particular array dimension has fixed indices, any
-attempt to index elements after the last defined index will fail.)
+attempt to index elements after the last defined index will fail,
+except in the case of range truncation described earlier.)
 
-Using a standard index less than zero prepends the corresponding number
-of elements to the start of the array and then maps the negative index
-back to zero:
+Negative subscripts are never allowed for standard subscripts unless
+the subscript is declared modular.
 
-    @results[-1] = 42;      # Same as: @results.unshift(42)
-
-    @dwarves[-2..-1]        # Same as: @dwarves.unshift(<Groovy Sneaky>)
-        = <Groovy Sneaky>;
-
-Note that, as with a normal C<unshift>, the new elements are
-actually stored starting at standard index zero, after pre-existing
-elements have been bumped to the right. Hence after the assignments
-in the preceding example:
-
-    say @results[0];        # 42
-    say @dwarves[0];        # Groovy
-
-Using a negative index on an array of fixed size will fail if the
-resulting number of elements exceeds the defined size.
-
-Note that the behaviour of negative indices in Perl 6 is
-different to that in Perl 5:
-
-    # Perl 5...
-    ............_____________________________..................
-         :     |     |     |     |     |     |     :     :
-    .....:.....|_____|_____|_____|_____|_____|.....:.....:.....
-                 [0]   [1]   [2]   [3]   [4]   [5]   [6]   [7]
-     [-7]  [-6]  [-5]  [-4]  [-3]  [-2]  [-1]
-
-
-    # Perl 6...
-    ............_____________________________..................
-         :     |     |     |     |     |     |     :     :
-    .....:.....|_____|_____|_____|_____|_____|.....:.....:.....
-     [-2]  [-1]  [0]   [1]   [2]   [3]   [4]   [5]   [6]   [7]
-    [*-7] [*-6] [*-5] [*-4] [*-3] [*-2] [*-1] [*+0] [*+1] [*+2]
-
 The Perl 6 semantics avoids indexing discontinuities (a source of subtle
 runtime errors), and provides ordinal access in both directions at both
 ends of the array.
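
Illustration only, not part of the commit: a small Python model of the two S09 subscript rules added above, range truncation and modular subscripts. The helper names truncate_range and assign_modular are hypothetical. hi=None stands for an open-ended * endpoint, and the truncation helper follows the autoextending-array rule, where the range's min may be at most one past the end.

```python
def truncate_range(lo, hi, size):
    """Clip the inclusive index range lo..hi to the subscripts 0..size-1.

    Fails if the min lies more than one past the end, or if the
    endpoint is below -1; otherwise both ends are silently truncated.
    """
    if lo > size:
        raise IndexError("range min is too big")
    if hi is not None and hi < -1:
        raise IndexError("range endpoint is below -1")
    first = max(lo, 0)
    last = size - 1 if hi is None else min(hi, size - 1)
    return list(range(first, last + 1))


def assign_modular(size, indices, values):
    """Assign values to a cyclic array: every index is taken modulo size."""
    slots = [None] * size
    for index, value in zip(indices, values):
        slots[index % size] = value  # Python's % is non-negative here
    return slots


# @array[0..*] on an empty array yields nothing rather than failing
empty_slice = truncate_range(0, None, 0)          # []

# A range running past the end of a 5-element array is cut back
tail = truncate_range(2, 10, 5)                   # [2, 3, 4]

# A negative start is truncated at the front
front = truncate_range(float("-inf"), 1, 5)       # [0, 1]

# my @seasons[%4]; @seasons[-4..7] = 'a' .. 'l';
seasons = assign_modular(4, range(-4, 8), "abcdefghijkl")
```

In the modular case each of the four slots is written three times, and the last pass leaves 'i', 'j', 'k', 'l', matching the result stated in the hunk.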

r28344 - docs/Perl6/Spec
