On Mon, 23 Sep 2002, Leopold Toetsch wrote:

> Josef Hook wrote:
> 
> > 
> 
> >>Congrats to you too. So, should I start maintaining a birthday
> >>database for the summaries? Probably not.
> >>
> > 
> > 23 on 26th :-)
> 
> 
> Congrates to you too. Let's store these dates in MultiArray ;-)
> 
> The latter is the reason, I reply.
> First I could take it over, if no one else does and when there is some 
> consens, what we actually want to use as the base ENTRY item in our 
> aggregates.
> Second, could you elaborate a bit on this virtual_addr thing and the 
> IMHO ugly lookup code related to this. I really don't understand, what 
> this is good for.
> 


I think some explanation is in order. First I'd like to give some history
of how multiarrays were created. In May I started to work on a matrix
implementation for parrot. Getting matrices into parrot has been my goal
all along. In the early talks about matrices and multiarrays it was
suggested that I first create a good multiarray implementation and later
on a matrix one, since matrices are "just a special case" of multiarrays.
I didn't follow that suggestion but jumped directly to a matrix
implementation. On Aug 1 I had finished a matrix implementation that
could handle dense and sparse matrices WITHOUT changing the internal
representation and without losing speed, pretty generic that is!
It didn't make it into parrot though, mainly because I created "external" C
files outside the pmc file (exactly what the intlist patch did a few weeks
ago). Everything in one file was the consensus back then.
I felt somewhat dejected, but I soon started to put
pieces of the code into pmc files. I found that the matrix code was
impossible to put in one file, since I had matrix.ops files that prevented
it, so I started to work on multiarrays instead.
That's why the code looks like it does:
80% of it comes from matrix.pmc and from multiple other files.



Multiarrays consist of 2 data structure parts:

1. a transformation function (R^n->R) that transforms a position
in many dimensions into a position in a single dimension. It's called
calc_offset_multi()

2. some very weird (just because it consists of partial matrix
code) intarray code.
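
To make part 1 concrete, here is a minimal sketch of such a transformation.
It is NOT the actual calc_offset_multi() code; the name
offset_multi_sketch, its parameters and the row-major ordering are
assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not Parrot's calc_offset_multi(): maps an
 * n-dimensional index onto a single offset, assuming row-major order. */
size_t
offset_multi_sketch(const size_t *dims, const size_t *index, size_t ndims)
{
    size_t offset = 0;
    size_t i;

    for (i = 0; i < ndims; i++) {
        /* each index must lie inside its dimension */
        assert(index[i] < dims[i]);
        /* shift what we have so far by this dimension's size, then add */
        offset = offset * dims[i] + index[i];
    }
    return offset;
}
```

With dims = {3, 3, 3}, the index (1, 2, 0) maps to offset 15 and the last
cell (2, 2, 2) maps to offset 26.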

Now to the other question, what is virtual_addr?

It is of course from matrices, where we have to deal with sparsity.
I'll try to explain how I've been thinking.

Think of a 3 dim cube with 3*3*3 cells. 
A cell in that cube points to some data and it knows its location in that 
cube.

pseudo:

cell {
 void *data
 int location[3][2][3]
}


Instead of storing every location[][][] in every cell I store the
output from calc_offset_multi(), which transforms R^n problems into R.
This value is stored in virtual_addr and it is an offset from
baseaddr. I called it virtual_addr just because it isn't an absolute
address but an offset that is added to baseaddr (maybe not the best name
for such a thing).
But the question remains: why do we need to store our
location in each cell?
Can't we just skip that part and store the cell directly
at the offset from calc_offset_multi()?
For the multiarray case we can.
There we don't need virtual_addr; it is used when matrices become sparse.
When dealing with sparse matrices, a cell may not lie
at the location that we calculate with calc_offset_multi(), so a
somewhat complex search algorithm has to go out and find the cell
that we want. That happens when we don't store everything that we are told
to store, e.g. we don't store '0' in sparse matrices.
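
A rough sketch of why the stored offset helps in the sparse case. This is
not the lookup code from the patch: the struct layout, the names
sketch_cell and sparse_find_sketch, and the use of a plain binary search
over cells kept sorted by virtual_addr are all assumptions chosen to keep
the example small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cell layout: the per-dimension location is replaced by
 * one offset relative to the base address. */
typedef struct sketch_cell {
    void  *data;          /* payload the cell points to */
    size_t virtual_addr;  /* offset the cell would have in a dense layout */
} sketch_cell;

/* Find the cell whose virtual_addr equals `wanted` in an array of stored
 * cells sorted by ascending virtual_addr. Returns NULL when the cell was
 * never stored, which a sparse matrix would read as zero. */
const sketch_cell *
sparse_find_sketch(const sketch_cell *cells, size_t ncells, size_t wanted)
{
    size_t lo = 0;
    size_t hi = ncells;   /* search the half-open range [lo, hi) */

    assert(cells != NULL || ncells == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (cells[mid].virtual_addr == wanted)
            return &cells[mid];
        if (cells[mid].virtual_addr < wanted)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

In the dense multiarray case none of this is needed, because the cell
always sits exactly at the computed offset.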

So you see that much of the code isn't necessary at all (in the multiarray
case) and could be replaced with code from perlarray.pmc or array.pmc.

I hope that I've managed to explain some of my ideas and background.
( with lsquare( numof misspelled words )) 

/Josef

