Re: PDL-P: indexing

Jeremy Howard Fri, 04 Aug 2000 16:40:35 -0700
<Cross-posted from language to internals since we're getting into
implementation issues>

Joshua N Pritikin wrote:
> <...>
> Also, you can specify a non-default step size:
>
>   @pdl(1:9:2, 1:9:2);  # (1,1) (3,1) (5,1) (7,1) (9,1) (1,3) (3,3) ...
>
> Although I'm not sure how frequently custom step sizes are used in PDL
> code...
>
More generally, we need to be able to specify:
- Slices: non-default step sizes here allow n-dimensional tensors to be
handled gracefully
- Indirect indexes: e.g. @a=(3,4,7); @b[@a] works as in perl now
- Masks: e.g. @a=(0,0,1,1,0,0,1); @b->mask[@a] works as above

As you can see, in terms of the semantics, simple slice() and mask()
functions could, on the face of it, just generate index lists to feed the
indirect indexing that perl already supports.

Unless your array is of infinite size of course... maybe we don't want to
get into that right now...
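
To make that equivalence concrete, here is a minimal sketch in plain Python
(not perl, and not PDL's API). The helper names slice_indexes, mask_indexes
and take are hypothetical, and the slice is assumed to include its end point,
as the 1:9:2 example quoted above suggests.

```python
def slice_indexes(start, stop, step=1):
    """Index list for an inclusive start:stop:step slice (assumed semantics)."""
    return list(range(start, stop + 1, step))


def mask_indexes(mask):
    """Index list of the positions where the mask is true."""
    return [i for i, keep in enumerate(mask) if keep]


def take(values, indexes):
    """Indirect index: the equivalent of @b[@a] in perl today."""
    return [values[i] for i in indexes]


b = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

by_index = take(b, [3, 4, 7])                            # @a=(3,4,7); @b[@a]
by_mask = take(b, mask_indexes([0, 0, 1, 1, 0, 0, 1]))   # mask reduced to indexes
by_slice = take(b, slice_indexes(1, 9, 2))               # 1:9:2 reduced to indexes

# Two stepped slices enumerate pairs with the first index varying fastest,
# matching the ordering in the quoted comment: (1,1) (3,1) (5,1) ...
pairs = [(x, y)
         for y in slice_indexes(1, 9, 2)
         for x in slice_indexes(1, 9, 2)]
```

Both the slice and the mask end up as an ordinary finite index list, which is
exactly why this route stops working once the array is unbounded.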

There's a bit to be done in the implementation to make sure that real
numerical programs don't grind to a halt. For instance, it's not unusual to
want to say (this is not real code):
  @a = sumover(@b[$INDEX::j]*@c[$INDEX::j])
where $INDEX::j is a column iterator, and @c and @b are matrices, and '*' is
component-wise multiplication. @a then becomes the sum of the component-wise
multiplications of columns in @b and @c.
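
Spelled out as a plain Python sketch, assuming matrices are stored as lists of
rows (sumover_columns is a hypothetical name, not a PDL function), the
expression above computes:

```python
def sumover_columns(b, c):
    """For each column j, sum the component-wise products b[i][j] * c[i][j]."""
    rows, cols = len(b), len(b[0])
    return [sum(b[i][j] * c[i][j] for i in range(rows)) for j in range(cols)]


b = [[1, 2],
     [3, 4]]
c = [[5, 6],
     [7, 8]]

a = sumover_columns(b, c)  # [1*5 + 3*7, 2*6 + 4*8]
```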

Now what if @c and @b are sparse matrices? That means they're actually:
  @c = @craw[@cindexes]; @b = @braw[@bindexes];
Then in every loop we're following the indirect index, doing a lookup to
find the right spot in the column vector, and in the worst case generating a
temporary matrix the size of @b and doing a redundant loop.

Perl has all the information it needs at compile time though to combine the
indirect index and the slice, and to avoid the redundant memory copy and
loop in the reduce (as I've mentioned previously on perl6-internals).

Sparse arrays/matrices/tensors are not rare. Transposing them is common too,
as is reducing along one dimension. Languages which don't deal with these
issues gracefully at compile time simply can't be used for numerical
programming (because they can be an order of magnitude slower by not doing
the optimisations I've mentioned).

As Karl Glazebrook described so eloquently, PDL currently jumps through all
kinds of hoops to do appropriate optimisations. However, because this all
has to be done in C, it's hard for us Mere Mortals to write our own
functions. It would be just _so_ fun to do all this in pure perl.