Re: LabeledPoint with features in matrix form (word2vec matrix)

2016-04-07 Thread jamborta
It depends. If you'd like to multiply matrices for each row in the data, you could use a Breeze matrix and do that locally on the nodes in a map or similar. If you'd like to multiply them across the rows, e.g. a row in your data is a row in the matrix, then you could use a distributed matrix lik
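
For the first case (a local multiplication per row), here is a rough sketch of what the function passed to `map` could look like. It assumes Breeze is on the classpath, which it normally is with Spark MLlib. The `Doc` case class, the `project` helper and the shared `weights` matrix are hypothetical names made up for illustration, not anything from the original question.

```scala
import breeze.linalg.DenseMatrix

object LocalMultiply {
  // Hypothetical record: a label plus a small per-row feature matrix
  // (for example, one word2vec vector per matrix row).
  final case class Doc(label: Int, features: DenseMatrix[Double])

  // Multiply one record's matrix by a shared weight matrix.
  // This runs entirely on local Breeze matrices, so it can be used
  // inside a map over the data.
  def project(doc: Doc, weights: DenseMatrix[Double]): Doc =
    doc.copy(features = doc.features * weights)
}
```

With an `RDD[Doc]` called `docs` (also an assumption), this would be applied as `docs.map(d => LocalMultiply.project(d, weights))`. If `weights` is large, broadcasting it first is probably worth considering.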

Re: LabeledPoint with features in matrix form (word2vec matrix)

2016-04-06 Thread jamborta
You're probably better off defining your own data structure. A LabeledPoint can store a label and a vector, but your case is more like a label, vector, vector. I'd probably use tuples with Breeze sparse arrays: RDD[(label:Int, vector1:SparseArray[Double], vector2:SparseArray[Double])]
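
A minimal sketch of such a structure, assuming Breeze is on the classpath. Note two swaps compared to the suggestion above: it uses a case class instead of a plain tuple so the fields have names, and `breeze.linalg.SparseVector` instead of `SparseArray`. `LabeledPair` and the `sparse` helper are hypothetical names for illustration only.

```scala
import breeze.linalg.SparseVector

object TwoVectorPoint {
  // Hypothetical record type: one label and two sparse feature vectors.
  final case class LabeledPair(
      label: Int,
      vector1: SparseVector[Double],
      vector2: SparseVector[Double])

  // Build a sparse vector of the given size from (index, value) pairs.
  def sparse(size: Int, entries: (Int, Double)*): SparseVector[Double] = {
    val v = SparseVector.zeros[Double](size)
    entries.foreach { case (i, x) => v(i) = x }
    v
  }

  // Example record with two vectors of different sizes.
  val example: LabeledPair =
    LabeledPair(1, sparse(5, 0 -> 1.0, 3 -> 2.0), sparse(4, 2 -> 0.5))
}
```

The data would then be an `RDD[TwoVectorPoint.LabeledPair]` rather than an `RDD[LabeledPoint]`, so anything in MLlib that expects a LabeledPoint would need a conversion step first.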