Hi,

I’m trying to multiply a large squarish matrix by its transpose. Eventually 
I’d like to work with matrices of size 200,000 by 500,000, but I started 
with 100 by 100, which was fine, and then tried 10,000 by 10,000, which 
failed with an out-of-memory exception.
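
For a rough sense of scale, here is a back-of-envelope sketch of the sizes involved. It assumes dense storage at 8 bytes per double and ignores any JVM, serialisation or shuffle overhead, so the real footprint will be higher. The function names are just for illustration.

```python
# Back-of-envelope memory estimate for A (rows x cols) and A * A^T (rows x rows).
# Assumption: dense storage, 8 bytes per double, no JVM/serialisation overhead.
BYTES_PER_DOUBLE = 8


def dense_gb(rows, cols):
    """Size in gigabytes (10^9 bytes) of a dense rows x cols matrix of doubles."""
    return rows * cols * BYTES_PER_DOUBLE / 1e9


def estimate(rows, cols):
    """Return (input GB, product GB) for A and for A * A^T."""
    return dense_gb(rows, cols), dense_gb(rows, rows)


for rows, cols in [(100, 100), (10_000, 10_000), (200_000, 500_000)]:
    a_gb, aat_gb = estimate(rows, cols)
    print(f"{rows} x {cols}: A = {a_gb:g} GB, A*A^T = {aat_gb:g} GB")
```

On these assumptions a single dense copy of the 10,000 by 10,000 matrix is about 0.8 GB, while the 200,000 by 500,000 case is about 800 GB for the input and 320 GB for the product.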

I used MLlib's BlockMatrix with various block sizes, and also tried 
switching disk serialisation on.
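
To make the block-size question concrete, this is a minimal plain-Python sketch of what I understand a block-wise A * A^T to be doing. It only illustrates the arithmetic on small lists; it is not how MLlib partitions or shuffles blocks internally, and the helper names are hypothetical.

```python
# Block-wise computation of G = A * A^T on plain Python lists.
# Assumption: illustration of the arithmetic only, not MLlib's implementation.


def row_blocks(matrix, block_rows):
    """Split a matrix (list of rows) into consecutive groups of rows."""
    return [matrix[i:i + block_rows] for i in range(0, len(matrix), block_rows)]


def block_times_transpose(left, right):
    """Return left * right^T for two row blocks with the same column count."""
    return [[sum(x * y for x, y in zip(lrow, rrow)) for rrow in right]
            for lrow in left]


def gram_blocked(matrix, block_rows):
    """Assemble A * A^T one strip of output tiles at a time."""
    blocks = row_blocks(matrix, block_rows)
    result = []
    for left in blocks:
        # Each output tile needs only two row blocks of A at once.
        tiles = [block_times_transpose(left, right) for right in blocks]
        # Stitch the tiles of this strip back into full output rows.
        for r in range(len(left)):
            result.append([value for tile in tiles for value in tile[r]])
    return result


A = [[1, 2], [3, 4], [5, 6]]
G = gram_blocked(A, block_rows=2)
print(G)
```

The block size changes how much has to be held for each tile, but not the size of the final result, which is still rows by rows.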

We are running on a small cluster, using a CSV file in HDFS as the input data. 

Would anyone with experience of multiplying large, dense matrices in Spark be 
able to suggest what to try to make this work?

Thanks,
Eilidh


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

