> then - implementing from scratch using the coordinate matrix ((i, j), k)
> format?
>
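For concreteness, here is a minimal pure-Python sketch of what a multiply over that ((i, j), value) coordinate format looks like — a join on the inner index followed by a per-cell sum, the same shape as the map/reduceByKey pattern you'd run on an RDD. This is illustrative only, not Hail's or Spark's actual code, and the function name is made up:

```python
from collections import defaultdict

def coord_multiply(A, B):
    """Multiply two matrices stored as {(i, j): value} coordinate entries."""
    # Group A's entries by column index and B's by row index, i.e. the
    # join key k in the product sum_k A[i, k] * B[k, j].
    A_by_col = defaultdict(list)
    for (i, k), v in A.items():
        A_by_col[k].append((i, v))
    B_by_row = defaultdict(list)
    for (k, j), v in B.items():
        B_by_row[k].append((j, v))

    # "reduceByKey" step: accumulate partial products per output cell (i, j).
    C = defaultdict(float)
    for k in A_by_col:
        for i, a in A_by_col[k]:
            for j, b in B_by_row.get(k, []):
                C[(i, j)] += a * b
    return dict(C)
```

For example, with A = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0} and B = {(0, 0): 4.0, (1, 0): 5.0}, coord_multiply(A, B) gives {(0, 0): 14.0, (1, 0): 12.0}.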
>> On Wed, Jun 14, 2017 at 4:29 PM, John Compitello
>> wrote:
Hey Anthony,
You're the first person besides myself I've seen mention this. BlockMatrix
multiply is not the best method. As far as my team and I can tell, the memory
problem stems from the fact that when Spark tries to compute block (i, j) of
the product matrix, it tries to manifest all of block-row i of the left matrix
and all of block-column j of the right in memory at once.
Spark does operations on each partition in parallel. If you decrease the number
of partitions, you're potentially doing less work in parallel, depending on your
cluster setup.
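The "depending on your cluster setup" part comes down to Spark running at most one task per partition, capped by the cores available — a toy model (the numbers below are made up):

```python
def concurrent_tasks(n_partitions, executor_cores):
    # Spark schedules one task per partition per stage, and at most
    # `executor_cores` of them run simultaneously.
    return min(n_partitions, executor_cores)

print(concurrent_tasks(200, 16))  # plenty of partitions: 16 tasks at once
print(concurrent_tasks(8, 16))    # fewer partitions than cores: only 8
```

So dropping the partition count below the total core count is what actually reduces how much runs in parallel at once.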
> On May 23, 2017, at 4:23 PM, Andrii Biletskyi
> wrote:
>
>
> No, I didn't try to use repartition; how exactly it
Hey all,
I’ve found myself in a position where I need to do a relatively large matrix
multiply (at least, compared to what I normally have to do). I’m looking to
multiply a 100k by 500k dense matrix by its transpose to yield a 100k by 100k
matrix. I’m trying to do this on Google Cloud, so I don’
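For scale, a quick back-of-the-envelope on those dimensions, assuming 8-byte double-precision entries and ignoring any replication or shuffle copies:

```python
# Back-of-the-envelope sizing for the matrices described above.
GB = 10**9
input_bytes  = 100_000 * 500_000 * 8   # 100k x 500k dense input
result_bytes = 100_000 * 100_000 * 8   # 100k x 100k product with its transpose

print(input_bytes / GB)    # 400.0 -> ~400 GB for the input matrix alone
print(result_bytes / GB)   # 80.0  -> ~80 GB for the result
```

So the input alone is around 400 GB dense, before counting intermediate shuffle data — which is why the choice of multiply strategy matters so much here.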