It looks like your code is creating one row per item, which means the columns
are users and columnSimilarities will compute similarities between users.  If
you transpose the matrix (or construct it as the transpose, with one row per
user and one column per item), then columnSimilarities should do what you
want, and the entry indices will correspond to items.
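
For example, a construction along these lines (just a sketch; "ratings" is the
Rating RDD from your snippet, and numItems is a placeholder for the total
number of items) gives one row per user, so the columns are items:

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// One row per user; column index = item id - 1.
val rowsByUser = ratings
  .map(r => (r.user, (r.product, r.rating)))
  .groupByKey()
  .map { case (_, itemRatings) =>
    Vectors.sparse(numItems, itemRatings.map { case (item, rate) => (item - 1, rate) }.toSeq)
  }

val itemMat = new RowMatrix(rowsByUser)
val itemSimilarities = itemMat.columnSimilarities(0.5)

columnSimilarities returns only the upper triangle, so each MatrixEntry
(i, j, value) with i < j is the similarity between the items at column
indices i and j, i.e. item ids i + 1 and j + 1 under the indexing above.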
Joseph

On Fri, Apr 24, 2015 at 11:20 PM, amghost <zhengweita...@outlook.com> wrote:

> I have encountered the "all-pairs similarity" problem in my recommendation
> system. Thanks to the Databricks blog on this topic, it seems RowMatrix may
> help.
>
> However, RowMatrix is a matrix type without meaningful row indices, so I
> don't know how to retrieve the similarity result for specific items i and j
> after invoking columnSimilarities(threshold).
>
> Below are some details about what I am doing:
>
> 1) My data file comes from Movielens with format like this:
>
> user::item::rating
> 2) I build a RowMatrix in which each sparse vector i represents the ratings
> of all users for item i:
>
> val dataPath = ...
> val ratings: RDD[Rating] = sc.textFile(dataPath).map(_.split("::") match {
>   case Array(user, item, rate) => Rating(user.toInt, item.toInt, rate.toDouble)
> })
>
> val rows = ratings
>   .map(rating => (rating.product, (rating.user, rating.rating)))
>   .groupByKey()
>   .map(p => Vectors.sparse(userAmount, p._2.map(r => (r._1 - 1, r._2)).toSeq))
>
> val mat = new RowMatrix(rows)
>
> val similarities = mat.columnSimilarities(0.5)
> Now I get a CoordinateMatrix of similarities. How can I get the similarity
> of two specific items i and j? Although I can retrieve an RDD[MatrixEntry]
> from it, I am not sure whether row i and column j correspond to item i and
> item j.
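>
> For example, I imagine looking it up with something like this (i and j here
> stand for the two item ids of interest, assuming the entry indices really do
> correspond to them, which is exactly what I am unsure about):
>
> val simOfIJ = similarities.entries
>   .filter(e => (e.i == i && e.j == j) || (e.i == j && e.j == i))
>   .map(_.value)
>   .collect()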
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-retrieve-item-pair-after-calculating-similarity-using-RowMatrix-tp22654.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
