This post attempts to consolidate and summarise my proposals, which I hope
provide a platform for n-dim matrices/tensors. I have repeated parts of
earlier posts to avoid confusion--sorry for the redundancy. I've also fixed
some areas from earlier posts--hopefully the examples here are now correct
and consistent.

I propose that lists of list refs (of list refs...), which I call List Ref
Trees, or LRTs, support an additional syntax for accessing an element:

  @a = ([0,1,2],
        [3,4,5]);

  $b = $a[[0,1]];          # $b == 1
  @b = @a[[0,1], [1,2]];   # @b == (1, 5)
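
(For anyone who wants to play with the idea today, here is a rough Perl 5
emulation of what the new subscript would compute; the lrt_get() helper
below is only an illustration, not proposed syntax.)

  use strict;
  use warnings;

  # Walk a nested structure of list refs, one index per level.
  # lrt_get(\@a, [0,1]) spells out what the proposed $a[[0,1]] would do.
  sub lrt_get {
      my ($lrt, $coords) = @_;
      my $node = $lrt;
      $node = $node->[$_] for @$coords;
      return $node;
  }

  my @a = ([0,1,2],
           [3,4,5]);

  my $b = lrt_get(\@a, [0,1]);                    # $b == 1
  my @b = map { lrt_get(\@a, $_) } [0,1], [1,2];  # @b == (1, 5)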

This syntax is incompatible with <RFC 177 (v1) A Natural Syntax Extension
For Chained References>, which is something we will have to sort out.

Please note that I have specified nothing about the implementation of LRTs
yet. This is merely a syntactic choice--this multi-dimensional operator can
be used to access elements of an LRT. This does not imply that the LRT is
actually stored in the same way as standard references currently are.

Note that the syntax actually uses a list of list refs to index the list.
This allows very flexible ways of generating slices:

  @c = ( [ [0,1] , [1,2] , [3,4] ] ,
         [ [1,2] , [3,4] , [5,6] ] ,
         [ [2,3] , [4,5] , [6,7] ]
       );                               # A 3d LRT/tensor
  @diag_slice_3d =
    partition(3, zip(0.., 0.., 0..));   # ([0,0,0],[1,1,1],[2,2,2],...)
  @c_diag = @c[@diag_slice_3d];
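
(Again as a plain Perl 5 sketch: the diagonal coordinates are built by
hand since the lazy '0..' lists and zip()/partition() don't exist yet, and
the slice stops at two elements because the innermost dimension only has
size 2.)

  use strict;
  use warnings;

  my @c = ( [ [0,1] , [1,2] , [3,4] ] ,
            [ [1,2] , [3,4] , [5,6] ] ,
            [ [2,3] , [4,5] , [6,7] ] );

  # Hand-built stand-in for partition(3, zip(0.., 0.., 0..)),
  # truncated to the smallest dimension.
  my @diag_slice = map { [$_, $_, $_] } 0 .. 1;   # ([0,0,0],[1,1,1])

  # Fetch each coordinate tuple by walking the nested refs.
  my @c_diag = map {
      my $coords = $_;
      my $node   = \@c;
      $node = $node->[$_] for @$coords;
      $node;
  } @diag_slice;                                  # (0, 4)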

My second proposal is that LRTs can be declared with a single type:

  my int @a = ([0,1,2],
               [3,4,5]);

My third proposal is that LRTs declared with a single type actually be
stored in a contiguous block of memory (perhaps a ':compact' attribute
would also be required to make this happen). This allows any element to be
accessed directly by multiplying out the indexes and the data type size.
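
(To make the address arithmetic concrete--this is only an illustration of
the layout, not a proposed implementation--here is a 2x3 int matrix packed
into a single buffer in Perl 5:)

  use strict;
  use warnings;

  # A 2x3 matrix of 32-bit ints stored row-major in a single packed buffer.
  my @dims = (2, 3);
  my $buf  = pack 'l*', 0, 1, 2,
                        3, 4, 5;

  # Element ($i,$j) sits at flat offset $i * $cols + $j; substr/unpack
  # stand in for the pointer arithmetic a real :compact array would use.
  sub get_int {
      my ($buf_ref, $dims, $i, $j) = @_;
      my $offset = $i * $dims->[1] + $j;
      return unpack 'l', substr($$buf_ref, $offset * 4, 4);
  }

  print get_int(\$buf, \@dims, 1, 2), "\n";   # prints 5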

My fourth proposal is that standard list ref indexing be extended to allow
lists to be used as indexes:

  @foo[0,3][1,4][2,5] == @foo[[0,1,2], [3,4,5]];   # True

Although currently

  @foo[0,1]

works in this way, this is not true of:

  @foo[0,1][1,2]

which treats the comma operator as evaluating its left operand and
discarding it. Therefore this new syntax would create some incompatibility
with P5, although I doubt such code is common.
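
(Here is a Perl 5 sketch of the intended semantics, using a made-up
chained_slice() helper: the chained index lists are zipped positionally
into coordinate tuples, so @foo[0,3][1,4][2,5] names the elements at
(0,1,2) and (3,4,5).)

  use strict;
  use warnings;

  # Zip several equal-length index lists into coordinate tuples,
  # then fetch each tuple from a nested LRT.
  sub chained_slice {
      my ($lrt, @index_lists) = @_;
      my @coords = map {
          my $n = $_;
          [ map { $_->[$n] } @index_lists ]
      } 0 .. $#{ $index_lists[0] };
      return map {
          my $coords = $_;
          my $node   = $lrt;
          $node = $node->[$_] for @$coords;
          $node;
      } @coords;
  }

  # A small 2x2x3 example just to exercise the helper:
  my @foo = ( [ [0,1,2], [3, 4, 5] ],
              [ [6,7,8], [9,10,11] ] );
  my @got = chained_slice(\@foo, [0,1], [0,1], [2,0]);
  print "@got\n";   # elements at (0,0,2) and (1,1,0): prints "2 9"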

My fifth proposal is to introduce a new variable '@*' which holds the
current loop index in an implied loop, with one element per dimension of
the tensor:

  @trans[[$*[1], $*[0]]] = @mat;          # Transpose a matrix
  @uptriangle = @matrix[0..; $*[0]..];    # Upper triangle of a matrix
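
(Since '@*' doesn't exist, here is what the transpose line expands to in
plain Perl 5, with explicit loop variables standing in for $*[0] and
$*[1].)

  use strict;
  use warnings;

  my @mat = ([0,1,2],
             [3,4,5]);

  # The implied loop of @trans[[$*[1], $*[0]]] = @mat, written out by hand.
  my @trans;
  for my $i (0 .. $#mat) {                  # $i plays the role of $*[0]
      for my $j (0 .. $#{ $mat[$i] }) {     # $j plays the role of $*[1]
          $trans[$j][$i] = $mat[$i][$j];
      }
  }
  # @trans is now ([0,3], [1,4], [2,5])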

My sixth proposal is to extend the RFCs for reduce(), list generation, and
zip/unzip/partition/reshape to work with n-dim matrices (although I'm not
sure how yet).
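
(For reference, the flat one-dimensional versions of zip() and partition()
used in the diagonal example might look like the Perl 5 sketch below,
taking array refs instead of lazy '0..' lists; how to generalise them to n
dimensions is exactly the open question.)

  use strict;
  use warnings;

  # zip(\@a, \@b, ...): interleave elements positionally.
  sub zip {
      my @lists = @_;
      return map { my $n = $_; map { $_->[$n] } @lists } 0 .. $#{ $lists[0] };
  }

  # partition($n, @list): break a flat list into groups of $n.
  sub partition {
      my ($n, @list) = @_;
      my @groups;
      push @groups, [ splice @list, 0, $n ] while @list;
      return @groups;
  }

  my @diag_slice = partition(3, zip([0,1,2], [0,1,2], [0,1,2]));
  # @diag_slice is ([0,0,0], [1,1,1], [2,2,2])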

My seventh proposal is to introduce a new attribute ':sparse', which
specifies that a list should be stored sparsely, by default not storing
undefs. It would take two parameters:

 - The elements to avoid storing (undef by default)
 - The percentage sparsity (50% by default)

Therefore we could say:

  my int :sparse(0,0.01) @array = ((0) x 1000, 1);

and have the list stored in an efficient way.
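
(As a rough illustration of the storage win--this is not proposed syntax--
a Perl 5 hash keyed by index already behaves this way: only values that
differ from the chosen default take any space.)

  use strict;
  use warnings;

  # Emulate ':sparse(0, ...)': keep only the elements that differ from 0.
  my @dense = ((0) x 1000, 1);
  my %sparse;
  for my $i (0 .. $#dense) {
      $sparse{$i} = $dense[$i] if $dense[$i] != 0;
  }

  sub sparse_get {
      my ($s, $i) = @_;
      return exists $s->{$i} ? $s->{$i} : 0;
  }

  print scalar(keys %sparse), "\n";        # 1 -- only the trailing 1 is stored
  print sparse_get(\%sparse, 1000), "\n";  # 1
  print sparse_get(\%sparse, 42), "\n";    # 0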

In addition, I like the suggestion from RFC 169:

> I'd propose that, analogous to the $#array notation for revealing the
> upper index of @array, @#matrix return the analogous list.
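
(Here is a quick Perl 5 sketch of what @#matrix could report for a
rectangular LRT, probing the first element at each level; the
upper_indexes() name is made up.)

  use strict;
  use warnings;

  # Return the $#-style upper index of each dimension of a rectangular LRT.
  sub upper_indexes {
      my ($lrt) = @_;
      my @upper;
      my $node = $lrt;
      while (ref $node eq 'ARRAY') {
          push @upper, $#$node;
          $node = $node->[0];
      }
      return @upper;
  }

  my @matrix = ([0,1,2],
                [3,4,5]);
  print join(',', upper_indexes(\@matrix)), "\n";   # prints "1,2"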

Each of these proposals probably needs its own RFC, but first I'd like to
get some feedback. Are there other ideas around on n-dim tensors? Are there
elements of the ';' approach suggested in RFC 169 that would be better? Are
there incompatibilities that I've missed?

