This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Arrays: New operator ';' for creating array slices

=head1 VERSION

  Maintainer: Jeremy Howard <[EMAIL PROTECTED]>
  Date: 8 Sep 2000
  Last Modified: 21 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 205
  Version: 2
  Status: Frozen

=head1 DISCUSSION

The semantics discussed here were accepted by all on the list. The use of
C<;> outside of a list index did not reach consensus, due to concerns that
its implementation may be too inefficient.

RFC 231 is presented as an alternative to RFCs 204 and 205, but since it
provides a suggested implementation of a subset of the interfaces proposed
in RFCs 204 and 205, the proposals need not be mutually exclusive. This is
discussed further in the implementation section.

=head1 CHANGES

=head2 Since v1

=over 4

=item *

Added discussion of efficiency and RFC 231 in implementation section

=item *

Added comparison to use of slices without C<;>

=back

=head1 ABSTRACT

RFC 204 described an extension of standard list indexing which allows
indexing directly into a multidimensional array by using a list of lists
of coordinates. This RFC describes a new operator C<;> that can create
lists of coordinates corresponding to slices and blocks of
multidimensional arrays. The C<;> operator creates the cartesian product
of its operands as a list of lists. It can only operate within a list
constructor.

=head1 DESCRIPTION

It is proposed that a new operator C<;> be introduced that operates within
a list constructor to create the cartesian product of its operands:

  @lol = ( (1,2) ; (3,4) );   # ([1,3], [1,4], [2,3], [2,4])

The resulting list is ordered by taking each element of the left-hand
operand in turn, from left to right, and pairing it with every element of
the right-hand operand before moving on to the next left-hand element.
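
The ordering rule can be illustrated in Perl 5, which has no C<;> list
operator. The following is only a sketch: C<cross2> is a hypothetical
helper name, not part of this proposal, and it handles just two flat
operands passed as array references.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the RFC): cartesian product of two flat
# lists, returned as a list of array references. The right-hand operand
# varies fastest, as described above.
sub cross2 {
    my ($left, $right) = @_;
    my @pairs;
    for my $l (@$left) {
        for my $r (@$right) {
            push @pairs, [$l, $r];
        }
    }
    return @pairs;
}

# Emulates: @lol = ( (1,2) ; (3,4) );
my @lol = cross2([1, 2], [3, 4]);
print join(', ', map { '[' . join(',', @$_) . ']' } @lol), "\n";
# prints: [1,3], [1,4], [2,3], [2,4]
```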

If an operand is a list of lists, the C<;> operator creates the cartesian
product of the component lists:

  @lol = ( ([1,3],[1,4]);(5,6) ) # ([1,3,5], [1,3,6], [1,4,5], [1,4,6])

which is equivalent to:

  @lol = ( (1 ; (3,4)); (5,6) ) # ([1,3,5], [1,3,6], [1,4,5], [1,4,6])

which, because C<;> is associative, is equivalent to:

  @lol = ( 1 ; (3,4); (5,6) ) # ([1,3,5], [1,3,6], [1,4,5], [1,4,6])

and, because C<;> evaluates its arguments in a list context and has a
lower precedence than C<,>, is equivalent to:

  @lol = ( 1 ; 3,4 ; 5,6 ) # ([1,3,5], [1,3,6], [1,4,5], [1,4,6])
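
The flattening and associativity described above can be checked with a
Perl 5 emulation. This is a sketch under stated assumptions: C<cross> is
a hypothetical helper, each operand is passed as an array reference, and
an element that is itself an array reference is spliced into the
resulting tuple rather than nested.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the RFC): n-ary cartesian product.
# Each operand is an array ref; elements that are themselves array refs
# are flattened into the tuple, so lists of lists combine as described.
sub cross {
    my @operands = @_;
    my @result = ([]);                 # start from a single empty tuple
    for my $operand (@operands) {
        my @next;
        for my $prefix (@result) {
            for my $item (@$operand) {
                my @tail = ref($item) eq 'ARRAY' ? @$item : ($item);
                push @next, [@$prefix, @tail];
            }
        }
        @result = @next;
    }
    return @result;
}

my @nested  = cross([[1, 3], [1, 4]], [5, 6]);     # list-of-lists operand
my @grouped = cross([cross([1], [3, 4])], [5, 6]); # explicit grouping
my @flat    = cross([1], [3, 4], [5, 6]);          # no grouping

for my $lol (\@nested, \@grouped, \@flat) {
    print join(', ', map { '[' . join(',', @$_) . ']' } @$lol), "\n";
}
# each line prints: [1,3,5], [1,3,6], [1,4,5], [1,4,6]
```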

C<;> is particularly useful for creating slices of multidimensional
arrays:

  my int @array = ([1,2,3],
                   [4,5,6],
                   [7,8,9]);
  @col2 = @array[0..2; 1];   # @array[[0,1],[1,1],[2,1]] == (2,5,8)
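
For comparison, the same column can be extracted in Perl 5 by walking
each coordinate by hand. The sketch below assumes nested array
references stand in for the multidimensional array; C<md_fetch> is a
hypothetical helper and the coordinate list is built with C<map> in
place of C<(0..2 ; 1)>.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the RFC): fetch one element per
# coordinate, emulating list-of-lists indexing on nested Perl 5 arrays.
sub md_fetch {
    my ($array, @coords) = @_;
    my @values;
    for my $coord (@coords) {
        my $node = $array;
        $node = $node->[$_] for @$coord;   # one index per dimension
        push @values, $node;
    }
    return @values;
}

my @array = ([1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]);

my @coords = map { [$_, 1] } 0 .. 2;       # what (0..2 ; 1) would generate
my @col2   = md_fetch(\@array, @coords);
print "@col2\n";                           # prints: 2 5 8
```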
  
Allowing C<;> in contexts other than a list index keeps multidimensional
slicing consistent with, and as convenient as, list slicing in Perl 5:

  # Perl 5 behaviour
  @indices = (1,3);
  @list = (3,4,5,6);
  @list[@indices] = (1,2);   # (3,1,5,2)

  # Multidim extension
  @2d_indices = ([0,0],[1,1]);
  @2d_arr = ([3,4,5],[6,7,8]);
  @2d_arr[@2d_indices] = (1,2);   # ([1,4,5],[6,2,8])

  # Slice syntax extension
  @2d_slice = (0..1 ; 0..1);       # ([0,0],[0,1],[1,0],[1,1])
  @2d_arr = ([3,4,5],[6,7,8]);
  @2d_arr[@2d_slice] = ([0,1],[0,1]);   # ([0,1,5],[0,1,8])

Large matrices can be flexibly manipulated using infinite lists (from RFC
24) and list generation functions (from RFC 81):

  my int @matrix = get_big_file();
  my @first_5_odd_cols = ( 0.. ; 1..9:2 ); # ([0,1],[0,3],[0,5],...)
  my @matrix_slice = @matrix[@first_5_odd_cols];

Since RFC 24 has now been retracted, the second line of this would actually
have to be:

  my @first_5_odd_cols = ( 0..10000 ; 1..9:2 ); # ([0,1],[0,3],[0,5],...)

@matrix_slice now contains the whole of columns 1,3,5,7,9 of @matrix.
Furthermore, @first_5_odd_cols can be used to slice another matrix later,
which may be of a different size.

Because this whole-dimension slicing is so common, any argument to C<;>
may be omitted. Omitted operands default to (0..):

  ( ;1..9:2 ) == ( 0.. ; 1..9:2 );

Because of the retraction of RFC 24, it is necessary to limit the use of
whole-dimension slicing syntax to within a list index, since in that
case a finite sized slice can be generated (since the bounds of the list
are known).
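
One way to picture this resolution step is a function that fills in an
omitted operand from the known size of each dimension. The following
Perl 5 sketch is illustrative only: C<expand_slice> is a hypothetical
helper, an omitted operand is represented by C<undef>, and the dimension
sizes are passed in explicitly rather than read from the array.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the RFC): build the finite coordinate
# list for a slice. $sizes holds the number of elements in each dimension;
# each operand is an array ref of indices, or undef if it was omitted.
sub expand_slice {
    my ($sizes, @spec) = @_;
    my @coords = ([]);
    for my $dim (0 .. $#spec) {
        # An omitted operand covers the whole dimension.
        my @indices = defined($spec[$dim])
                    ? @{ $spec[$dim] }
                    : (0 .. $sizes->[$dim] - 1);
        my @next;
        for my $prefix (@coords) {
            push @next, [@$prefix, $_] for @indices;
        }
        @coords = @next;
    }
    return @coords;
}

# Emulates @matrix[ ; 1..9:2 ] for an assumed 3-row, 10-column matrix.
my @odd_cols = expand_slice([3, 10], undef, [1, 3, 5, 7, 9]);
print scalar(@odd_cols), "\n";             # prints: 15
print join(',', @{ $odd_cols[0] }), "\n";  # prints: 0,1
```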

Furthermore, in order to create generic slices that return 'all the nth
elements' regardless of the number of dimensions of the array, the
left-most or right-most operand to C<;> may be '*', which expands to (0..)
for every missing dimension of the sliced array:

  my int @b :bounds(1,1,1) = get_some_matrix();
  my @first_elems = @b[0;*];   # @b[[0,0,0],[0,0,1],[0,1,0],[0,1,1]]

The '*' operand may only be used in an array slicing context.
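
The expansion of '*' can be sketched in Perl 5 as follows. The helper
name C<expand_star>, the use of the string C<'*'> as a marker, and the
passing of upper bounds as an array reference are all assumptions made
for illustration; the sketch also assumes at most one '*' is given.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the RFC). $upper holds the upper bound
# of each dimension, as in :bounds(1,1,1). Operands are array refs of
# indices, or the string '*' in the left-most or right-most position.
sub expand_star {
    my ($upper, @spec) = @_;
    my $missing = @$upper - (@spec - 1);   # dimensions that '*' stands for
    my @full;
    for my $i (0 .. $#spec) {
        my $operand = $spec[$i];
        if (!ref($operand) && $operand eq '*') {
            die "'*' must be the left-most or right-most operand\n"
                unless $i == 0 || $i == $#spec;
            push @full, (undef) x $missing;   # one whole-dimension slot each
        }
        else {
            push @full, $operand;
        }
    }
    my @coords = ([]);
    for my $dim (0 .. $#full) {
        my @indices = defined($full[$dim])
                    ? @{ $full[$dim] }
                    : (0 .. $upper->[$dim]);
        my @next;
        for my $prefix (@coords) {
            push @next, [@$prefix, $_] for @indices;
        }
        @coords = @next;
    }
    return @coords;
}

# Emulates the coordinates generated by @b[0;*] with :bounds(1,1,1).
my @first = expand_star([1, 1, 1], [0], '*');
print join(' ', map { '[' . join(',', @$_) . ']' } @first), "\n";
# prints: [0,0,0] [0,0,1] [0,1,0] [0,1,1]
```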

=head1 IMPLEMENTATION

A naive implementation of C<;> when used as a list index would be
extremely inefficient. Although it should have the semantics proposed
here, vital optimisations would mean that often no actual list would be
created. These could include:

=over 4

=item *

Create a lazily generated list, as outlined in the implementation section
of L<RFC 81>

=item *

Create a compact array (see L<RFC 203>) rather than a standard list of
lists

=back

The actual optimisations would depend on context. If the cartesian product
is not being stored, but is only being used as an array index, generation
of a simple stream of tokens may suffice, such as described in L<RFC 231>.
This approach is discussed in the implementation section of L<RFC 81>. Use
of these optimisations need in no way limit the use of C<;> where such
optimisations cannot be used.
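
As a rough illustration of lazy generation, the Perl 5 sketch below
yields one coordinate per call instead of building the whole cartesian
product. It is not the mechanism of RFC 81 or RFC 231, only a
closure-based stand-in; C<lazy_cross> is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the RFC): returns a closure that yields
# one coordinate (an array ref) per call and undef once exhausted. Only the
# current position is stored, never the full product.
sub lazy_cross {
    my @operands = @_;                       # array refs of indices
    my @pos  = (0) x @operands;              # odometer, one digit per operand
    my $done = grep { !@$_ } @operands;      # an empty operand: empty product
    return sub {
        return if $done;
        my $coord = [ map { $operands[$_][ $pos[$_] ] } 0 .. $#operands ];
        # Advance the odometer; the right-most operand varies fastest.
        my $i = $#operands;
        while ($i >= 0 && ++$pos[$i] == @{ $operands[$i] }) {
            $pos[$i--] = 0;
        }
        $done = 1 if $i < 0;
        return $coord;
    };
}

my @matrix = ([1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]);

# Emulates @matrix[0..2 ; 1] without materialising the index list.
my $next = lazy_cross([0 .. 2], [1]);
my @column;
while (my $c = $next->()) {
    push @column, $matrix[ $c->[0] ][ $c->[1] ];
}
print "@column\n";                           # prints: 2 5 8
```

The same closure could feed an assignment loop equally well, which is
why storing the product is only needed when the coordinate list itself
is kept for later reuse.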

=head1 REFERENCES

RFC 231: Data: Multi-dimensional arrays/hashes and slices

RFC 202: Overview of multidimensional array RFCs

RFC 81: Lazily evaluated list generation functions

RFC 24: Semi-finite (lazy) lists

=head1 ACKNOWLEDGEMENTS

Buddha Buck: Original suggestion of C<;> for multidimensional array access
