Hi,
Matrix computation is critical to the efficiency of algorithms such as least 
squares, Kalman filters and so on.
For now, the MLlib module offers only limited linear algebra support for 
matrices, especially for distributed matrices.

We have been working on establishing distributed matrix computation APIs based 
on data structures in MLlib.
The main idea is to partition the matrix into sub-blocks, based on the strategy 
in the following paper.
http://www.cs.berkeley.edu/~odedsc/papers/bfsdfs-mm-ipdps13.pdf
In our experiments, this partitioning strategy is communication-optimal.
However, operations such as factorization may not be suitable for block-wise 
computation.
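
To make the sub-block idea concrete, here is a minimal single-machine sketch 
in plain Python. It only illustrates partitioning a matrix into fixed-size 
blocks and multiplying block by block; it is not the algorithm from the paper 
above and it is not the MLlib API. The names split_blocks, block_matmul and 
the block_size parameter are hypothetical, chosen for illustration only.

```python
def split_blocks(m, block_size):
    """Partition matrix m (a list of rows) into {(block_row, block_col): sub-block}."""
    blocks = {}
    for i in range(0, len(m), block_size):
        for j in range(0, len(m[0]), block_size):
            blocks[(i // block_size, j // block_size)] = [
                row[j:j + block_size] for row in m[i:i + block_size]
            ]
    return blocks


def matmul(a, b):
    """Plain dense product of two small matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block_matmul(a, b, block_size):
    """Multiply a and b block by block: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    a_blocks = split_blocks(a, block_size)
    b_blocks = split_blocks(b, block_size)
    n_i = max(i for i, _ in a_blocks) + 1
    n_k = max(k for _, k in a_blocks) + 1
    n_j = max(j for _, j in b_blocks) + 1

    out = {}
    for i in range(n_i):
        for j in range(n_j):
            acc = None
            for k in range(n_k):
                # In a distributed setting, each of these block products
                # could be computed by a separate task.
                partial = matmul(a_blocks[(i, k)], b_blocks[(k, j)])
                acc = partial if acc is None else add(acc, partial)
            out[(i, j)] = acc

    # Stitch the result blocks back into a single matrix.
    result = []
    for i in range(n_i):
        block_row = [out[(i, j)] for j in range(n_j)]
        for r in range(len(block_row[0])):
            result.append([v for blk in block_row for v in blk[r]])
    return result
```

Each (i, k, j) block product depends only on two input blocks, which is why 
multiplication fits this layout well. A factorization usually has 
dependencies that cut across blocks, which is the concern raised above.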

Any suggestions and guidance are welcome.

Thanks,
Yuxi
