KNN for large data set
Hi all, Please help me find the best way to compute K-nearest neighbors using Spark for large data sets.
Re: KNN for large data set
Thanks Xiangrui Meng, I will try this. I also found https://github.com/kaushikranjan/knnJoin. Will this work with double data? Can we find the z-value of *Vector(10.3, 4.5, 3, 5)*?

On Thu, Jan 22, 2015 at 12:25 AM, Xiangrui Meng wrote:
> For large datasets, you need hashing in order to compute k-nearest
> neighbors locally. You can start with LSH + k-nearest in Google
> Scholar: http://scholar.google.com/scholar?q=lsh+k+nearest
> -Xiangrui
>
> On Tue, Jan 20, 2015 at 9:55 PM, DEVAN M.S. wrote:
> > Hi all,
> >
> > Please help me find the best way to compute K-nearest neighbors
> > using Spark for large data sets.
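A minimal sketch of the "LSH + local k-nearest" approach Xiangrui suggests, assuming the BucketedRandomProjectionLSH estimator that later Spark releases (2.1+) ship in spark.ml; the bucket length, number of hash tables, and toy vectors below are placeholders, not tuned values:

import org.apache.spark.ml.feature.BucketedRandomProjectionLSH
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("approx-knn").getOrCreate()
import spark.implicits._

// Toy dataset of dense feature vectors.
val df = Seq(
  (0, Vectors.dense(10.3, 4.5, 3.0, 5.0)),
  (1, Vectors.dense(9.8, 4.4, 2.9, 5.1)),
  (2, Vectors.dense(1.0, 0.5, 0.3, 0.2))
).toDF("id", "features")

// Hash the vectors into buckets so that only points landing in
// nearby buckets need to be compared exactly.
val brp = new BucketedRandomProjectionLSH()
  .setBucketLength(2.0)
  .setNumHashTables(3)
  .setInputCol("features")
  .setOutputCol("hashes")

val model = brp.fit(df)

// Approximate k nearest neighbors of a query vector (k = 2 here);
// the result includes a distance column.
val key = Vectors.dense(10.3, 4.5, 3.0, 5.0)
model.approxNearestNeighbors(df, key, 2).show(false)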
Re: How to create a Row from a List or Array in Spark using Scala
In the Scala API it is there: Row.fromSeq(array). I don't know much more about the Java API.

Devan M.S. | Research Associate | Cyber Security | AMRITA VISHWA VIDYAPEETHAM | Amritapuri | Cell +919946535290

On Sat, Feb 28, 2015 at 1:28 PM, r7raul1...@163.com wrote:
> import org.apache.spark.sql.catalyst.expressions._
>
> val values: JavaArrayList[Any] = new JavaArrayList()
> computedValues = Row(values.get(0), values.get(1)) // It is not good to use
> get(index). How to create a Row from a List or Array in Spark using Scala?
>
> r7raul1...@163.com
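A quick sketch of what that looks like in Scala, assuming the values arrive in a java.util.ArrayList as in the quoted snippet; the two example values are placeholders:

import scala.collection.JavaConverters._
import org.apache.spark.sql.Row

val values: java.util.ArrayList[Any] = new java.util.ArrayList()
values.add("alice")
values.add(42)

// Row.fromSeq builds the Row from the whole sequence at once,
// so there is no need to call get(index) for every column.
val row: Row = Row.fromSeq(values.asScala.toSeq)
// row: [alice,42]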