Re: mllib sparse vector/matrix vs. graphx graph

2014-10-05 Thread Xiangrui Meng
It really depends on the type of computation. For example, if vertices and edges are associated with properties and you want to operate on (vertex-edge-vertex) triplets or use the Pregel API, GraphX is the way to go.
-Xiangrui
On Sat, Oct 4, 2014 at 9:39 PM, ll wrote:
> hi. i am working on a

Re: MLLib sparse vector

2014-09-15 Thread Chris Gore
Probably worth noting that the factory methods in mllib create an object of type org.apache.spark.mllib.linalg.Vector, which stores its data in a format similar to Breeze vectors.
Chris
On Sep 15, 2014, at 3:24 PM, Xiangrui Meng wrote:
> Or you can use the factory method `Vectors.sparse`:
>
> val
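
To illustrate the point about the factory methods, here is a minimal sketch, assuming spark-mllib is on the classpath (for example inside spark-shell). The helper `describe` and the sample values are hypothetical and not from the thread. Both factory calls are typed as `Vector`; the concrete class can be recovered with a pattern match.

```scala
// Assumption: spark-mllib is available, e.g. this is pasted into spark-shell.
import org.apache.spark.mllib.linalg.{DenseVector, SparseVector, Vector, Vectors}

// Both factory methods return the common Vector type.
val dense: Vector = Vectors.dense(1.0, 0.0, 3.0)
// Sparse form: size 3, non-zero entries at indices 0 and 2.
val sparse: Vector = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))

// Hypothetical helper: report which concrete representation backs a Vector.
def describe(v: Vector): String = v match {
  case s: SparseVector =>
    s"sparse, size=${s.size}, indices=${s.indices.mkString(",")}, values=${s.values.mkString(",")}"
  case d: DenseVector =>
    s"dense, values=${d.values.mkString(",")}"
}

println(describe(dense))
println(describe(sparse))
```

The two vectors hold the same logical contents; only the storage differs, which is why code written against `Vector` does not need to care which one it receives.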

Re: MLLib sparse vector

2014-09-15 Thread Xiangrui Meng
Or you can use the factory method `Vectors.sparse`:
  val sv = Vectors.sparse(numProducts, productIds.map(x => (x, 1.0)))
where numProducts should be the largest product id plus one.
Best,
Xiangrui
On Mon, Sep 15, 2014 at 12:46 PM, Chris Gore wrote:
> Hi Sameer,
>
> MLLib uses Breeze’s vector fo
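
A fuller version of the `Vectors.sparse` call might look like the sketch below. It assumes spark-mllib is on the classpath (for example inside spark-shell); `maxProductId` and the sample `productIds` are made-up values for illustration, and the ids are assumed to be zero-based and distinct.

```scala
// Assumption: spark-mllib is available, e.g. this is pasted into spark-shell.
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical data: the largest product id in the whole catalogue,
// and the ids of the products one user bought (zero-based, distinct).
val maxProductId: Int = 9
val productIds: Seq[Int] = Seq(0, 3, 7)

// The vector size must exceed every index, hence "largest id plus one".
val numProducts: Int = maxProductId + 1

// Each (index, value) pair marks one purchased product with 1.0;
// every other position is an implicit 0.0.
val sv: Vector = Vectors.sparse(numProducts, productIds.map(x => (x, 1.0)))

println(sv)                        // sparse representation
println(sv.toArray.mkString(", ")) // expanded to a dense array for inspection
```

Taking `numProducts` from the whole catalogue rather than from one user's purchases keeps every user's vector the same length, which downstream MLlib code generally expects.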

Re: MLLib sparse vector

2014-09-15 Thread Chris Gore
Hi Sameer,
MLLib uses Breeze’s vector format under the hood. You can use that.
http://www.scalanlp.org/api/breeze/index.html#breeze.linalg.SparseVector
For example:
  import breeze.linalg.{DenseVector => BDV, SparseVector => BSV, Vector => BV}
  val numClasses = classes.distinct.count.toInt
  val