Matrix, array, or tensor? (was Re: n-dim matrices)

Jeremy Howard Mon, 04 Sep 2000 01:59:30 -0700
Christian Soeller wrote:
> Buddha Buck wrote:
>
> >   Tensor or Matrix
> >  Multidimensional list
> > what should we call it?
> >
> > I'd vote for matrix myself.  It's short and sweet
>
> Fine ;) Just have a section in the elusive overview RFC that defines
> what we mean by matrix, e.g. not only 2D objects of linear algebra.
>
In the RFCs I just wrote I've taken to calling them multidimensional arrays
(which is probably the most 'correct' term), and included a fair bit in the
overview RFC connecting up this term with the contents of the RFCs. I'm a
bit concerned that 'array' has come to mean something different in Perl,
which is why I've tried in the overview RFC to define it reasonably
carefully. I've attached the draft RFC--see what you think.

> With all these RFCs it would be nice to prefix their titles with one
> common term (as Damian Conway has done with his recent Objects RFCs):
>
>  Matrices: Proposed syntax for matrix element access and slicing
>  Matrices: Matrix Index Iterators
>
That's a good idea.
----

This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Overview of multidimensional array RFCs (<RFCB> through <RFCH>)

=head1 VERSION

  Maintainer: Jeremy Howard <[EMAIL PROTECTED]>
  Created: 3 September 2000
  Status: Developing
  Last modified: 3 September 2000
  Version: 1
  Mailing List: [EMAIL PROTECTED]
  Number: ?

=head1 ABSTRACT

Adding multidimensional array syntax to Perl 6 requires a large number of
separate but highly connected language and internals changes. Each of these
changes has its own RFC. This RFC describes how these changes fit together,
and provides a 'reading guide' through the multidimensional array RFCs.

=head1 DESCRIPTION

Arrays are data structures that store a series of elements all of the same
type in a contiguous area of memory. The elements of an array can most
simply be indexed by their position, counted from the start of the array.
This style of indexing results in a one dimensional array, also called a
I<vector>. A more sophisticated approach allows indexing into a two
dimensional plane of elements, where the plane is 'flattened' by laying the
rows or columns end on end in order to find the correct offset into the area
of memory. Two dimensional arrays are called I<matrices>. Arrays of more
than two dimensions follow the same logic, but use coordinate systems of
three or more coordinates for their indexing. These arrays mirror the
mathematical structures known as I<tensors>.
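
As a rough illustration of the flattening described above, the offset of an
element in a row-major layout can be computed from its coordinates and the
size of each dimension. This is a Perl 5 sketch only, not syntax proposed by
these RFCs, and the helper name C<flat_offset> is hypothetical.

  use strict;
  use warnings;

  # Hypothetical helper: row-major offset of an element, given the
  # size of each dimension and the element's coordinates.
  sub flat_offset {
      my ($dims, $coords) = @_;
      my $offset = 0;
      for my $d (0 .. $#$dims) {
          # Scale the running offset by this dimension's size,
          # then add this dimension's coordinate.
          $offset = $offset * $dims->[$d] + $coords->[$d];
      }
      return $offset;
  }

  # A 3 x 4 matrix stored row by row in one flat list.
  my @flat = (0 .. 11);
  my $i    = flat_offset([3, 4], [2, 1]);   # 2 * 4 + 1
  print "offset $i holds $flat[$i]\n";

The same loop handles any number of dimensions, which is why arrays of three
or more dimensions need no new storage scheme, only more coordinates.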

Perl 5 does not strictly provide a syntax for defining arrays, since the
closest equivalent in Perl 5, the I<list>, can contain different types of
element within one structure. Using a list in Perl 5 to mirror a one
dimensional array leads to a loss of efficiency, because the elements of a
list may be of different sizes, and can therefore not be jumped to directly.
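
For comparison, Perl 5 can already approximate a contiguous, single-type
array by packing values into a string with the builtin pack(), at the cost
of awkward element access. The sketch below only illustrates the storage
idea; it is not the mechanism proposed in the RFCs, and the variable names
are illustrative.

  use strict;
  use warnings;

  # Pack five doubles into one contiguous string buffer.
  my $buf  = pack 'd*', 1.5, 2.5, 3.5, 4.5, 5.5;
  my $size = length pack 'd', 0;    # bytes per element

  # Every element has the same size, so element $i starts at a
  # fixed offset of $i * $size bytes.
  my $i    = 3;
  my $elem = unpack 'd', substr($buf, $i * $size, $size);
  print "element $i is $elem\n";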

Perl 5 does not strictly provide a syntax for indexing arrays of greater
than one dimension; however, the use of a I<list of lists> (or I<LOL>) allows
an approximation, as described in L<perllol> in the Perl 5 documentation.
The LOL structure does not guarantee that sub-lists are of equal size,
which, with the lack of guarantee that list elements are of equal size,
results in a loss of efficiency. Furthermore, the syntax for indexing LOLs:

  $scalar = $lol[$i][$j][$k];

does not allow multiple elements to be accessed in a way that takes
advantage of the coordinate system (such as taking all elements that are one
plane of a three dimensional array).
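
To make the limitation concrete, the following Perl 5 sketch extracts the
plane of all elements sharing a fixed third coordinate from a three
dimensional LOL. The variable names are illustrative only. Each level of
nesting has to be walked explicitly:

  use strict;
  use warnings;

  # Build a 2 x 3 x 4 list of lists; each value encodes its indices.
  my @lol;
  for my $i (0 .. 1) {
      for my $j (0 .. 2) {
          for my $k (0 .. 3) {
              $lol[$i][$j][$k] = "$i$j$k";
          }
      }
  }

  # All elements with third index 2: a 2 x 3 plane, built with nested maps.
  my $k = 2;
  my @plane = map { my $row = $_; [ map { $_->[$k] } @$row ] } @lol;

  print join(' ', @$_), "\n" for @plane;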

The multidimensional array RFCs describe a set of language and internals
changes that together provide the two key foundations of arrays:

=over 4

=item *

Declaration of a data structure that contains elements of the same type
stored contiguously in memory

=item *

Ability to index arrays using a multidimensional coordinate syntax

=back

In addition, the RFCs describe syntax that allows the more rigid structure
of an array to be utilised to create more efficient programs.

The following RFCs describe the proposals:

 <RFCB>- Notation for declaring and creating arrays
 <RFCC>- Notation for indexing arrays with an LOL as an index
 <RFCD>- New operator ';' for creating array slices
 <RFCE>- @#arr for getting the dimensions of an array
 <RFCG>- Efficient array loops
 <RFCH>- Extension of component-wise list operations (RFC 82) to
         multidimensional arrays

<RFCB> describes the notation to create data structures that contain
elements of the same type stored contiguously in memory. <RFCC> describes
the notation to index arrays in multiple dimensions, and <RFCD> describes
how to utilise the coordinate nature of indices to index multiple elements
easily. These three RFCs provide the core foundation of arrays in Perl 6.

<RFCE> provides the syntax to query arrays to find their structure at
runtime.
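
For comparison, finding the shape of a LOL in Perl 5 requires walking the
structure by hand. The following sketch uses a hypothetical helper named
C<lol_dims>; it assumes the LOL is rectangular and inspects only the first
element at each level, so it is not a general solution.

  use strict;
  use warnings;

  # Hypothetical helper: size of each dimension of a rectangular LOL.
  sub lol_dims {
      my ($node) = @_;
      my @dims;
      while (ref($node) eq 'ARRAY') {
          push @dims, scalar @$node;
          last unless @$node;      # empty level: nothing to descend into
          $node = $node->[0];      # assume all siblings have the same shape
      }
      return @dims;
  }

  my @lol = ([[1, 2], [3, 4], [5, 6]]);    # 1 x 3 x 2
  print join(' x ', lol_dims(\@lol)), "\n";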

Finally, <RFCG> and <RFCH> provide the means to operate efficiently on
multidimensional arrays, bypassing Perl's more flexible but slower generic
looping approaches.

The multidimensional array RFCs rely on the lazy list generation syntax
provided by RFC 81 for creating slices, and on the reduce() builtin
provided by RFC 76 for reducing arrays.
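
Perl 5's List::Util module already offers a reduce() function similar in
spirit to the builtin proposed in RFC 76. As a hedged sketch of what
reducing an array means, the following sums each row of a matrix and then
the whole matrix. It uses a LOL, since the proposed array type does not
exist in Perl 5, and the variable names are illustrative.

  use strict;
  use warnings;
  use List::Util qw(reduce);

  my @matrix = ([1, 2, 3], [4, 5, 6]);

  # Reduce each row to its sum, then reduce the row sums to a total.
  my @row_sums = map { my $row = $_; reduce { $a + $b } @$row } @matrix;
  my $total    = reduce { $a + $b } @row_sums;

  print "row sums: @row_sums; total: $total\n";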

=head1 REFERENCES

L<perllol> in the Perl 5 documentation

Arrays in Numeric Python: http://starship.python.net/~da/numtut/array.html

Arrays in Haskell:
http://haskell.cs.yale.edu/haskell-report/newlib/Array.html

Arrays in Perl Data Language:
http://pdl.sourceforge.net/PDLdocs/Impatient.html#Perl_Datatypes_and_how_PDL_exten

Arrays in Blitz++ (efficient C++ library):
  http://oonumerics.org/blitz/manual/blitz02.html#l26