Hi Liquan,

There is some work being done on implementing linear algebra algorithms
on Spark for use in higher-level machine learning algorithms.  That work is
happening in the MLlib project, which has an
org.apache.spark.mllib.linalg package you may find useful.

See
https://github.com/apache/spark/tree/master/mllib/src/main/scala/org/apache/spark/mllib/linalg

From my quick look (I have never read this code before and am not familiar
with MLlib), both IndexedRowMatrix and RowMatrix implement a multiply
operation:

aash@aash-mbp~/git/spark/mllib/src/main/scala/org/apache/spark/mllib/linalg$ git grep 'def multiply'
distributed/IndexedRowMatrix.scala:  def multiply(B: Matrix): IndexedRowMatrix = {
distributed/RowMatrix.scala:  def multiply(B: Matrix): RowMatrix = {
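
Here is a rough, untested-on-a-cluster sketch of how RowMatrix.multiply
might be called starting from an RDD[Array[Double]]. It assumes B is small
enough to hold as a local Matrix, since that is what the signature takes, so
it does not cover the case where both operands are too large for memory.
The names MultiplySketch and multiplyByLocal are made up for illustration
and are not part of MLlib.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.{Matrices, Vectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.rdd.RDD

object MultiplySketch {
  // Hypothetical helper (not part of MLlib): computes A * B where A is
  // distributed (one Array[Double] per row) and B is a local array of rows.
  // Assumption: B fits in memory on the driver and on each executor.
  def multiplyByLocal(a: RDD[Array[Double]],
                      b: Array[Array[Double]]): RDD[Array[Double]] = {
    val numRows = b.length
    val numCols = b.head.length

    // Matrices.dense expects the values in column-major order,
    // so iterate over columns in the outer loop.
    val colMajor =
      (for (j <- 0 until numCols; i <- 0 until numRows) yield b(i)(j)).toArray
    val localB = Matrices.dense(numRows, numCols, colMajor)

    // Wrap each row of A as an MLlib Vector and build the distributed matrix.
    val distA = new RowMatrix(a.map(row => Vectors.dense(row)))

    // multiply returns a new RowMatrix; unwrap it back to plain arrays.
    distA.multiply(localB).rows.map(_.toArray)
  }
}
```

With a SparkContext in hand, something like
MultiplySketch.multiplyByLocal(aRdd, bRows) should give back the rows of C
as an RDD[Array[Double]].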

Can you look into using that code and let us know if it meets your needs?

Thanks!
Andrew


On Sat, May 17, 2014 at 10:28 PM, Liquan Pei <liquan...@gmail.com> wrote:

> Hi
>
> I am currently implementing an algorithm involving matrix multiplication.
> Basically, I have matrices represented as RDD[Array[Double]]. For example,
> if I have A: RDD[Array[Double]] and B: RDD[Array[Double]], what would be
> the most efficient way to get C = A * B?
>
> Both A and B are large, so it would not be possible to save either of them
> in memory.
>
> Thanks a lot for your help!
>
> Liquan
>
