GitHub user BenFradet commented on a diff in the pull request:
https://github.com/apache/spark/pull/10411#discussion_r52830580
--- Diff: docs/mllib-collaborative-filtering.md ---
@@ -32,16 +32,16 @@ following parameters:
The standard approach to matrix factorization based collaborative
filtering treats
the entries in the user-item matrix as *explicit* preferences given by the
user to the item.
+For example, users giving ratings to movies.
It is common in many real-world use cases to only have access to *implicit
feedback* (e.g. views,
clicks, purchases, likes, shares etc.). The approach used in `spark.mllib`
to deal with such data is taken
-from
-[Collaborative Filtering for Implicit Feedback
Datasets](http://dx.doi.org/10.1109/ICDM.2008.22).
-Essentially instead of trying to model the matrix of ratings directly,
this approach treats the data
-as a combination of binary preferences and *confidence values*. The
ratings are then related to the
-level of confidence in observed user preferences, rather than explicit
ratings given to items. The
-model then tries to find latent factors that can be used to predict the
expected preference of a
-user for an item.
+from [Collaborative Filtering for Implicit Feedback
Datasets](http://dx.doi.org/10.1109/ICDM.2008.22).
+Essentially, instead of trying to model the matrix of ratings directly,
this approach treats the data
--- End diff --
@srowen I tried to take your remarks into account, but I'm not sure whether
it's clearer now.