ratings is an RDD of (Long, Rating) tuples, so the Rating objects are the second element of each tuple; you can see them created that way in the map. Rating is a simple case class:

https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala#L66

There is no project-level "_2" or "user" function to find: _2 is the standard accessor for the second element of a Scala tuple, and user and product are just fields of the Rating case class. So _._2.user is shorthand for "take each tuple, grab its second element (the Rating), and read its user field".
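To make the desugaring concrete, here is a minimal sketch of equivalent ways to write that map. It uses a plain Scala Seq instead of an RDD so it runs without Spark, and a local Rating case class standing in for org.apache.spark.mllib.recommendation.Rating:

    // Stand-in for org.apache.spark.mllib.recommendation.Rating (same field names).
    case class Rating(user: Int, product: Int, rating: Double)

    // A few (timestamp % 10, Rating) pairs, shaped like the tuples in the tutorial.
    val pairs: Seq[(Long, Rating)] = Seq(
      (4L, Rating(1, 101, 5.0)),
      (7L, Rating(2, 102, 3.0)),
      (4L, Rating(1, 103, 4.0))
    )

    // These three lines are equivalent; `_._2.user` is just the most compact form.
    val users1 = pairs.map(_._2.user)                               // placeholder syntax
    val users2 = pairs.map(pair => pair._2.user)                    // named parameter
    val users3 = pairs.map { case (bucket, rating) => rating.user } // pattern match

    // users1.distinct.size == 2 here, analogous to ratings.map(_._2.user).distinct.count

The only difference in the tutorial code is that the receiver is an RDD rather than a Seq, so the final count comes from .distinct.count instead of .distinct.size.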
On Mon, Aug 4, 2014 at 11:17 PM, Steve Nunez <snu...@hortonworks.com> wrote:

> Can one of the Scala experts please explain this bit of pattern magic from
> the Spark ML tutorial: _._2.user ?
>
> As near as I can tell, this is applying the _2 function to the wildcard, and
> then applying the ‘user’ function to that. In a similar way the ‘product’
> function is applied in the next line, yet these functions don’t seem to
> exist anywhere in the project, nor are they used anywhere else in the code.
> It almost makes sense, but not quite. Code below:
>
>     val ratings = sc.textFile(new File(movieLensHomeDir, "ratings.dat").toString).map { line =>
>       val fields = line.split("::")
>       // format: (timestamp % 10, Rating(userId, movieId, rating))
>       (fields(3).toLong % 10, Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble))
>     }
>     …
>     val numRatings = ratings.count
>     val numUsers = ratings.map(_._2.user).distinct.count
>     val numMovies = ratings.map(_._2.product).distinct.count
>
> Cheers,
> - Steve Nunez