On May 26, 2009, at 14:59, aperotte wrote:

> Numpy does things a bit differently, I think in a large part because
> their format is row major.  Based on my understanding of the two
> formats, in the multidimensional generalization of row and column
> major the left most and the right most indices vary fastest,
> respectively.  Therefore, if the way you construct your
> multidimensional array is to flatten the array then index based on one
> of these principles then you're either going to get a 3x2 or a 2x3
> matrix, respectively, for your previous example.

I don't see how row/column-major order makes a difference for the  
shape of the matrix or its indexing. As far as I can see, it makes  
a difference only for reshaping, including flattening.

Using my example again, I start from [[1 2] [3 4] [5 6]], which is a  
vector of length 3 containing vectors of length 2. It seems  
reasonable to me to define the shape of the resulting array as [3 2],  
i.e. map the outermost vector level to the first array index. The  
reason for this choice is simply that for indexing the nested vector  
structure, I need to index the outermost level first (e.g.  
(-> nested-vectors (nth 0) (nth 1))),  
and therefore I want to have the same index order for the matrix. I  
don't really care about the storage order at this point.
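
As a rough sketch of that mapping, using plain nested vectors only: the helper `shape` below is a hypothetical name, not part of any library, and it assumes a rectangular, non-empty structure. It reads the shape off the nesting from the outside in, and the index order matches what `nth` or `get-in` already use.

```clojure
(def nested-vectors [[1 2] [3 4] [5 6]])

(defn shape
  "Hypothetical helper: shape of a rectangular nested vector,
  outermost level first. Only the first element of each level is
  inspected, so ragged input is not detected."
  [x]
  (if (vector? x)
    (vec (cons (count x) (shape (first x))))
    []))

;; The outermost level maps to the first index.
(def nested-shape (shape nested-vectors))

;; Both forms index the outermost level first.
(def via-nth    (-> nested-vectors (nth 0) (nth 1)))
(def via-get-in (get-in nested-vectors [0 1]))
```

Here `nested-shape` is `[3 2]`, and both lookups return the same element.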

The internal flattened representation does of course depend on the  
chosen storage order. With row-major order (C-style), the flattened  
array is [1 2 3 4 5 6], whereas with column-major order  
(Fortran-style), it is [1 3 5 2 4 6].
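
To make the distinction concrete, here is a minimal sketch in plain Clojure. The function `element-at` is a hypothetical name introduced for illustration, and it handles only the two-dimensional case. It shows that the flattened order changes with the storage order, while the index [i j] seen by the user can stay the same.

```clojure
(def nested [[1 2] [3 4] [5 6]])

;; Row-major (C-style): the last index varies fastest.
(def row-major (vec (apply concat nested)))

;; Column-major (Fortran-style): the first index varies fastest.
(def column-major (vec (apply mapcat vector nested)))

(defn element-at
  "Hypothetical helper: looks up element [i j] of a rows x cols array
  stored in the flat vector `flat` using the given storage order."
  [order [rows cols] flat [i j]]
  (nth flat (case order
              :row-major    (+ (* i cols) j)
              :column-major (+ i (* j rows)))))

;; Same shape and same index under either storage order.
(def from-row-major    (element-at :row-major    [3 2] row-major    [0 1]))
(def from-column-major (element-at :column-major [3 2] column-major [0 1]))
```

Both lookups agree with `(get-in nested [0 1])`; only the offset computation differs.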

If I understand your reasoning correctly, your primary criterion is  
that the flattened array should have the elements in the same order  
as a flattened version of the nested vectors, which for column-major  
ordering leads to indices being reversed compared to the nested  
vectors. I don't think that storage order should be the main  
criterion for such a choice. Storage order matters for applications  
that reshape arrays, and also for applications that pass the array  
data to Java code. But for me it is a less fundamental property of  
multidimensional data structures than index order.

> For example, say that you wanted to conj the first matrix with [[7 8]
> [9 10] [11 12]].  I would want the resulting matrix to be a 2x3x2
> matrix,

I wouldn't know how to define conj in this case, given that for other  
collections it adds an element to a collection. But what is an  
"element" for a multidimensional structure?

But let's consider concat, which does make sense for multidimensional  
structures if they are of compatible dimensions. Again, I would  
expect that with

        (def nv1 [[1 2] [3 4] [5 6]])
        (def nv2 [[7 8] [9 10] [11 12]])

I get the same result from

        (concat (matrix nv1) (matrix nv2))

as from

        (matrix (concat nv1 nv2))

So the result should be a 6x2 matrix.
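
Since `matrix` is not defined in this message, the following sketch checks only the nested-vector side of that expectation: concatenating along the outermost level appends rows, so the shape goes from 3x2 to 6x2.

```clojure
(def nv1 [[1 2] [3 4] [5 6]])
(def nv2 [[7 8] [9 10] [11 12]])

;; concat works on the outermost level, so it appends the rows of nv2.
(def joined (vec (concat nv1 nv2)))

;; Shape read from the outside in: number of rows, then row length.
(def joined-shape [(count joined) (count (first joined))])

;; The first row of nv2 is now the row at index 3.
(def first-of-nv2 (get-in joined [3 0]))
```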

> When in the frame of building up structures, I think it makes sense to
> keep the left-most index stable and continually representing a
> particular dimension (rows).  Whereas in Numpy, the only way to keep
> the left-most index stable is to build your data structure from the
> "inside out". Or in other words, the indices representing 1, 2, 3, 4,
> 5, and 6 shift to the right in Numpy as you build up a structure with
> increasing amounts of nesting.

Right. That seems perfectly natural to me - because it works just  
like nested vectors.
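
A small illustration with plain nested vectors (no array library assumed): wrapping two 3x2 structures in an outer vector gives a 2x3x2 structure, and the existing indices move one position to the right.

```clojure
(def nv1 [[1 2] [3 4] [5 6]])
(def nv2 [[7 8] [9 10] [11 12]])

;; Adding a level of nesting puts the new index on the left.
(def stacked [nv1 nv2])

;; The element at [0 1] in nv1 is at [0 0 1] in the stacked structure.
(def in-nv1     (get-in nv1 [0 1]))
(def in-stacked (get-in stacked [0 0 1]))

;; Elements of nv2 are reached through leading index 1.
(def in-second (get-in stacked [1 2 0]))
```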

Konrad.

