Hi, I'm exploring DIMSUM (Dimension Independent Matrix Square using MapReduce) to find similarities between users based on the products they have purchased.
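To make the goal concrete, here is a minimal sketch in plain Python (not Spark) of the exact quantity I'm after: cosine similarity between users over binary purchase vectors, which, as far as I understand, is what DIMSUM approximates for the columns of a matrix. The toy data and the function name `cosine_similarities` are made up for illustration only.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each user maps to the set of products purchased.
purchases = {
    "user1": {"product1", "product2"},
    "user2": {"product2"},
    "user3": {"product1", "product2", "product3", "product4"},
}

def cosine_similarities(purchases):
    """Exact cosine similarity between users over binary purchase vectors."""
    # Inverted index: product -> set of users who bought it.
    buyers = defaultdict(set)
    for user, products in purchases.items():
        for product in products:
            buyers[product].add(user)

    # Count shared products, visiting only pairs that co-occur on a product.
    shared = defaultdict(int)
    for users in buyers.values():
        for a, b in combinations(sorted(users), 2):
            shared[(a, b)] += 1

    # For binary vectors, cosine = |A & B| / sqrt(|A| * |B|).
    return {
        (a, b): count / sqrt(len(purchases[a]) * len(purchases[b]))
        for (a, b), count in shared.items()
    }

sims = cosine_similarities(purchases)
for pair, value in sorted(sims.items()):
    print(pair, round(value, 4))
```

This is exact and only skips pairs with no product in common, so a very popular product still generates a quadratic number of pairs. That is the part I'm hoping DIMSUM's sampling can help with.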
I've modeled the matrix as:

user1, product1, product2, 0, 0
user2, product2, 0, 0, 0
user3, product1, product3, product4, product2
...

Naturally, I tried the Jaccard similarity measure first, but it ran into performance problems. What is the best way to find similarities between users in terms of the products they have bought?

Regards,
Naveen

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/how-does-dimsum-works-for-categorical-data-tp26449.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.