Re: Re: Problem with the Item-Based Collaborative Filtering Recommendation Algorithms in spark

2014-04-27 Thread Qin Wei
ally want to do this and can tolerate some approximation, I think you want to do some kind of locality-sensitive hashing to bucket the vectors and then evaluate similarity only against the other items in the same bucket. On Fri, Apr 25, 2014 at 5:55 AM, Qin Wei <[hidden email]> wrote: > Hi All
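
For readers following the thread, the bucketing idea in the reply above can be sketched roughly as follows. This is only an illustration in plain Scala collections, not code from the thread and not a Spark API: the object `LshSketch`, its method names, and the choice of random-hyperplane signatures for cosine similarity are all assumptions made for the example.

```scala
import scala.util.Random

// Hypothetical helper, not part of Spark or of the original thread.
object LshSketch {
  type Vec = Array[Double]

  // Draw numPlanes random hyperplanes (Gaussian normal vectors) of dimension dim.
  def randomPlanes(numPlanes: Int, dim: Int, seed: Long): Array[Vec] = {
    val rng = new Random(seed)
    Array.fill(numPlanes)(Array.fill(dim)(rng.nextGaussian()))
  }

  def dot(a: Vec, b: Vec): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // One bit per hyperplane: which side of the plane the vector falls on.
  def signature(v: Vec, planes: Array[Vec]): String =
    planes.map(p => if (dot(v, p) >= 0) '1' else '0').mkString

  def cosine(a: Vec, b: Vec): Double =
    dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

  // Bucket items by signature, then score only the pairs inside each bucket
  // instead of every pair in the full cartesian product.
  def candidateSimilarities(
      items: Map[String, Vec],
      planes: Array[Vec]): Seq[(String, String, Double)] =
    items.toSeq
      .groupBy { case (_, v) => signature(v, planes) }
      .values
      .flatMap { bucket =>
        for {
          (idA, vecA) <- bucket
          (idB, vecB) <- bucket
          if idA < idB // emit each unordered pair once
        } yield (idA, idB, cosine(vecA, vecB))
      }
      .toSeq
}
```

More hyperplanes give smaller buckets (fewer comparisons, but more similar pairs missed), so the number of planes is the knob that trades accuracy for work. In a Spark job the same shape would presumably be a key-by-signature step followed by a per-key pairwise comparison, but that is left out here to keep the sketch self-contained.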

Problem with the Item-Based Collaborative Filtering Recommendation Algorithms in spark

2014-04-24 Thread Qin Wei
Hi All, I have a problem with the Item-Based Collaborative Filtering Recommendation Algorithms in Spark. The basic flow is as below: (Item1, (User1, Score1)) RDD1 ==> (Item2, (User2, Score2))

what is the best way to do cartesian

2014-04-24 Thread Qin Wei
Hi All, I have a problem with the Item-Based Collaborative Filtering Recommendation Algorithms in Spark. The basic flow is as below: (Item1, (User1, Score1)) RDD1 ==> (Item2, (User2, Score2))

Re: Re: Spark program throws OutOfMemoryError

2014-04-17 Thread Qin Wei
eAsTextFile("/home/deployer/sim")} I ran the program through "java -jar myjar.jar"; it crashed quickly, but it succeeds when the data file is small. Thanks for your help! qinwei From: Andre Bois-Crettez [via Apache Spark User List] Date: 2014-04-16 17:50 To: Qin Wei Subj

Spark program throws OutOfMemoryError

2014-04-15 Thread Qin Wei
Hi all, my Spark program always fails with the error "java.lang.OutOfMemoryError: Java heap space" on my standalone cluster. Here is my code: object SimCalcuTotal { def main(args: Array[String]) { val sc = new SparkContext("spark://192.168.2.184:7077", "Sim Calcu Total", "/usr/local/spark