I'm looking at a comparison between Matlab and Spark for SVD with an
input matrix N.
This is the Matlab code - yes, a very small matrix!
N =
2.5903 -0.0416 0.6023
-0.1236 2.5596 0.7629
0.0148 -0.0693 0.2490
U =
-0.3706 -0.9284 0.0273
-0.9287 0.3708 0.0014
-0.0114 -0.0248 -0.9996
------------------------
Spark code
// Imports needed for the snippet below
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Breeze to Spark: flatten the 3x3 Breeze matrix N into a 1D array
val N1D = N.reshape(1, 9).toArray
// Note: I had to transpose the array to get the correct values (with incorrect signs)
val V2D = N1D.grouped(3).toArray.transpose
// Then convert the array into an RDD of row vectors
// val NVecdis = Vectors.dense(N1D.map(x => x.toDouble))
// val V2D = N1D.grouped(3).toArray
val rowlocal = V2D.map { x => Vectors.dense(x) }
val rows = sc.parallelize(rowlocal)
val mat = new RowMatrix(rows)
// Full SVD (k = number of columns), also computing U
val svd = mat.computeSVD(mat.numCols().toInt, computeU = true)
------------------------
Spark output - notice the change in sign on the 2nd and 3rd columns
-0.3158590633523746 0.9220516369164243 -0.22372713505049768
-0.8822050381939436 -0.3721920780944116 -0.28842213436035985
-0.34920956843045253 0.10627246051309004 0.9309988407367168
And finally some Julia code:
N = [2.59031 -0.0416335 0.602295;
-0.123584 2.55964 0.762906;
0.0148463 -0.0693119 0.249017]
svd(N, thin=true) --- same as Matlab
-0.315859 -0.922052 0.223727
-0.882205 0.372192 0.288422
-0.34921 -0.106272 -0.930999
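
One way to check whether the Spark and Julia results differ only by column sign: singular vectors are defined only up to sign, since flipping a column of U together with the matching column of V leaves U*S*V' unchanged. Below is a minimal sketch in plain Scala (no Spark needed) that compares the two outputs above column by column, ignoring sign. The object and helper names are made up for illustration, and the 1e-5 tolerance is an assumption chosen to match the printed precision of the Julia output.

```scala
// Hypothetical helper, not part of Spark or Breeze.
object SvdSignCheck {
  type Matrix = Array[Array[Double]]

  // Values copied from the Spark output above (row-major).
  val sparkU: Matrix = Array(
    Array(-0.3158590633523746, 0.9220516369164243, -0.22372713505049768),
    Array(-0.8822050381939436, -0.3721920780944116, -0.28842213436035985),
    Array(-0.34920956843045253, 0.10627246051309004, 0.9309988407367168)
  )

  // Values copied from the Julia output above (row-major).
  val juliaU: Matrix = Array(
    Array(-0.315859, -0.922052, 0.223727),
    Array(-0.882205, 0.372192, 0.288422),
    Array(-0.34921, -0.106272, -0.930999)
  )

  // Extract column j of a row-major matrix.
  def column(m: Matrix, j: Int): Array[Double] = m.map(_(j))

  // True if a equals b or a equals -b, elementwise within tol.
  def sameUpToSign(a: Array[Double], b: Array[Double], tol: Double): Boolean = {
    val same = a.zip(b).forall { case (x, y) => math.abs(x - y) <= tol }
    val flipped = a.zip(b).forall { case (x, y) => math.abs(x + y) <= tol }
    same || flipped
  }

  // True if every column of a matches the same column of b up to sign.
  def columnsMatchUpToSign(a: Matrix, b: Matrix, tol: Double = 1e-5): Boolean =
    a.head.indices.forall(j => sameUpToSign(column(a, j), column(b, j), tol))

  def main(args: Array[String]): Unit =
    println(columnsMatchUpToSign(sparkU, juliaU)) // prints "true"
}
```

If this prints true, the two factorizations agree up to the usual sign ambiguity of the singular vectors.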
Most likely it's an issue with my implementation rather than a bug
in svd within the Spark environment.
My Spark instance is running locally in a Docker container.
Any suggestions?
Thanks