... Should I submit a JIRA issue related to the problem of block matrix shuffling, given the blocks' co-location?

Best regards, Alexander

From: Burak Yavuz [mailto:brk...@gmail.com]
Sent: Wednesday, July 15, 2015 3:29 PM
To: Ulanov, Alexander
Cc: Rakesh Chalasani; dev@spark.apache.org
Subject: Re: BlockMatrix multiplication
Hi Alexander,

I just noticed the error in my logic. There will always be a shuffle due to the `cogroup`. `join` also uses `cogroup` under the hood, so a shuffle is inevitable. However, the reduceByKey will not cause a second shuffle.
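
For reference, the shuffle Burak describes can be seen directly in the lineage of the product's blocks RDD. A minimal sketch, assuming Spark 1.4-era MLlib and an existing SparkContext; the helper name and the tiny 2x2 blocks are made up for illustration:

import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Matrices
import org.apache.spark.mllib.linalg.distributed.BlockMatrix

// Hypothetical helper: build a small block matrix and print the lineage of the
// product's underlying RDD; the CoGroupedRDD / ShuffledRDD entries mark the
// shuffle that multiply() introduces, regardless of how the blocks are placed.
def showMultiplyLineage(sc: SparkContext): Unit = {
  val blockSize = 2
  val blocks = sc.parallelize(Seq(
    ((0, 0), Matrices.dense(blockSize, blockSize, Array(1.0, 2.0, 3.0, 4.0))),
    ((1, 1), Matrices.dense(blockSize, blockSize, Array(5.0, 6.0, 7.0, 8.0)))
  ))
  val bm = new BlockMatrix(blocks, blockSize, blockSize)
  val product = bm.multiply(bm)
  println(product.blocks.toDebugString)
}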
> bm.validate()
>
> val t = System.nanoTime()
>
> // multiply matrix with itself
> val aa = bm.multiply(bm)
> aa.validate()
>
> println(rows + "x" + columns + ", block:" + blockSize + "\t" + (System.nanoTime() - t) / 1e9)
Best regards, Alexander
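
The definitions of bm, rows, columns and blockSize in the quoted benchmark did not survive in the archive. A plausible reconstruction, with made-up sizes chosen only so that bm.multiply(bm) is well defined (the actual dimensions used in the test are unknown):

import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.{Matrices, Matrix}
import org.apache.spark.mllib.linalg.distributed.BlockMatrix

// Made-up example sizes: a square 1000 x 1000 matrix split into 200 x 200
// blocks, i.e. a 5 x 5 grid of dense blocks filled with random values.
val rows = 1000
val columns = 1000
val blockSize = 200

def makeBlockMatrix(sc: SparkContext): BlockMatrix = {
  val gridDim = rows / blockSize
  val coords = for (i <- 0 until gridDim; j <- 0 until gridDim) yield (i, j)
  val blocks = sc.parallelize(coords).map { case (i, j) =>
    val values = Array.fill(blockSize * blockSize)(scala.util.Random.nextDouble())
    ((i, j), Matrices.dense(blockSize, blockSize, values): Matrix)
  }
  new BlockMatrix(blocks, blockSize, blockSize)
}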
From: Ulanov, Alexander
Sent: Tuesday, July 14, 2015 6:24 PM
To: 'Burak Yavuz'
Cc: Rakesh Chalasani; dev@spark.apache.org
Subject: RE: BlockMatrix multiplication
Hi Burak,

Thank you for the explanation! I will try to make a diagonal block matrix and report the results.
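
One way to build the block-diagonal matrix Alexander mentions, as a rough sketch (the helper name and sizes are hypothetical):

import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.{Matrices, Matrix}
import org.apache.spark.mllib.linalg.distributed.BlockMatrix

// Only the (i, i) blocks are populated, so each block of the product depends
// on a single pair of input blocks.
def diagonalBlockMatrix(sc: SparkContext, numDiagBlocks: Int, blockSize: Int): BlockMatrix = {
  val blocks = sc.parallelize(0 until numDiagBlocks).map { i =>
    val values = Array.fill(blockSize * blockSize)(scala.util.Random.nextDouble())
    ((i, i), Matrices.dense(blockSize, blockSize, values): Matrix)
  }
  new BlockMatrix(blocks, blockSize, blockSize)
}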
From: Burak Yavuz [mailto:brk...@gmail.com]
Sent: Tuesday, July 14, 2015 10:14 AM
To: Ulanov, Alexander
Cc: Rakesh Chalasani; dev@spark.apache.org
Subject: Re: BlockMatrix multiplication
Hi Alexander,

From your example code, using the GridPartitioner, you will have 1 column and 5 rows of blocks. When you ...
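
The grid layout Burak refers to can be checked directly on the BlockMatrix; a small sketch, with bm standing for the matrix from the benchmark above:

// numRowBlocks and numColBlocks are public fields on BlockMatrix; for the
// layout Burak describes they would print 5 and 1 respectively.
println(s"grid: ${bm.numRowBlocks} row blocks x ${bm.numColBlocks} column blocks")
println(s"block partitions: ${bm.blocks.partitions.length}")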
... I am missing something or using it wrong.

Best regards, Alexander

From: Rakesh Chalasani [mailto:vnit.rak...@gmail.com]
Sent: Tuesday, July 14, 2015 9:05 AM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: BlockMatrix multiplication
Hi Alexander:
Aw, I missed the 'cogroup' on BlockMatrix multiply! I stand corrected. Check
https://github.com/apache/spark/blob/3c0156899dc1ec1f7dfe6d7c8af47fa6dc7d00bf/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala#L361
BlockMatrix multiply uses a custom partitioner (a GridPartitioner) under the hood.
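
A much-simplified illustration of the pattern at that link (not the actual Spark source): every A-block is sent to each result column it contributes to and every B-block to each result row, the pairs are cogrouped by result coordinate, multiplied per matching inner index, and summed with reduceByKey. Breeze is assumed here for the per-block arithmetic:

import breeze.linalg.DenseMatrix
import org.apache.spark.rdd.RDD

// Simplified sketch of blocked matrix multiplication over RDDs. The cogroup is
// the shuffle discussed in this thread; reduceByKey then sums the partial
// products for each result block.
def naiveBlockMultiply(
    a: RDD[((Int, Int), DenseMatrix[Double])],
    b: RDD[((Int, Int), DenseMatrix[Double])],
    numRowBlocksA: Int,
    numColBlocksB: Int): RDD[((Int, Int), DenseMatrix[Double])] = {
  // send block A(i, k) to every result coordinate (i, j)
  val aDest = a.flatMap { case ((i, k), m) =>
    (0 until numColBlocksB).map(j => ((i, j), (k, m)))
  }
  // send block B(k, j) to every result coordinate (i, j)
  val bDest = b.flatMap { case ((k, j), m) =>
    (0 until numRowBlocksA).map(i => ((i, j), (k, m)))
  }
  aDest.cogroup(bDest).flatMap { case (coord, (as, bs)) =>
    for ((k1, ma) <- as; (k2, mb) <- bs if k1 == k2) yield (coord, ma * mb)
  }.reduceByKey(_ + _)
}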
Hi Rakesh,

Thanks for the suggestion. Each block of the original matrix is in a separate partition. Each block of the transposed matrix is also in a separate partition. The partition numbers are the same for the blocks that undergo multiplication, and each partition is on a separate worker. Basically, I want to multiply the blocks that are already co-located without shuffling them across the network.
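
A rough sketch of a partitioner matching that description. It is hypothetical (BlockMatrix.multiply will still repartition through its own GridPartitioner, which is exactly the issue discussed in this thread), but it shows one way to co-locate a block (i, j) with the transposed block (j, i):

import org.apache.spark.Partitioner

// Hypothetical partitioner: symmetric in (i, j), so block (i, j) of a matrix
// and block (j, i) of its transpose are assigned the same partition.
class CoLocatedBlockPartitioner(parts: Int) extends Partitioner {
  override def numPartitions: Int = parts
  override def getPartition(key: Any): Int = key match {
    case (i: Int, j: Int) =>
      val h = (math.min(i, j), math.max(i, j)).hashCode
      ((h % parts) + parts) % parts  // non-negative modulo
    case other =>
      throw new IllegalArgumentException(s"Unexpected key: $other")
  }
}

One could then call partitionBy with this partitioner on both block RDDs before wrapping them in BlockMatrix, so that the pairs that multiply start out on the same worker.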
BlockMatrix stores the data as key -> Matrix pairs, and multiply does a reduceByKey operation, aggregating matrices per key. Since you said each block resides in a separate partition, reduceByKey might effectively be shuffling all of the data. A better way to go about this is to allow multiple blocks to reside in the same partition.
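
A sketch of that suggestion, assuming bm is the BlockMatrix from the earlier benchmark; the target partition count is an arbitrary example value:

import org.apache.spark.mllib.linalg.distributed.BlockMatrix

// Pack the blocks into fewer partitions before multiplying, so several blocks
// share a partition instead of one block per partition.
val packed = new BlockMatrix(
  bm.blocks.coalesce(4),   // 4 partitions is a made-up example
  bm.rowsPerBlock,
  bm.colsPerBlock)
packed.cache()
val product = packed.multiply(packed)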