Let's narrow the context from matrix factorization to recommendation
via ALS. Treating it as a multi-class classification problem adds extra
complexity: ALS outputs only a single value for each prediction, which
is hard to convert to a probability distribution over the 5 rating
levels. Treating it as a binary classification problem or a ranking
problem does make sense. RankingMetrics is in master. Feel free to add
prec@k and ndcg@k to examples.MovielensALS. ROC should be good to add
as well. -Xiangrui
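
For reference, a minimal sketch of what prec@k and ndcg@k measure for a single user, written in plain Python rather than against the RankingMetrics API. The function names, the sample scores, and the "held-out rating of 4 or more counts as relevant" rule are assumptions made for illustration only; nothing here is taken from MLlib.

```python
import math


def precision_at_k(predicted, relevant, k):
    """Fraction of the top-k predicted items that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant)
    hits = sum(1 for item in predicted[:k] if item in relevant)
    return hits / k


def ndcg_at_k(predicted, relevant, k):
    """Binary-relevance NDCG: DCG of the top-k list divided by the ideal DCG."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant)
    if not relevant:
        return 0.0
    # Position i (0-based) contributes 1 / log2(i + 2) when the item is relevant.
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(predicted[:k]) if item in relevant)
    # The ideal ranking places all relevant items first.
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg


# Hypothetical predicted scores for one user (assumed data, not real output).
scores = {"m1": 4.6, "m2": 2.1, "m3": 3.9, "m4": 4.2}
# Assumed relevance rule: items the user rated >= 4 in the held-out set.
relevant = {"m1", "m3"}

# Rank items by predicted score, highest first.
ranked = sorted(scores, key=scores.get, reverse=True)

print(precision_at_k(ranked, relevant, 2))
print(ndcg_at_k(ranked, relevant, 3))
```

Averaging these per-user values over all test users gives the dataset-level numbers. Only the ordering of the predicted scores matters here, which is why the single value ALS produces per (user, product) pair is enough for ranking metrics.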


On Wed, Oct 29, 2014 at 11:23 AM, Debasish Das <debasish.da...@gmail.com> wrote:
> Hi,
>
> In the current factorization flow, we cross validate on the test dataset
> using the RMSE number but there are some other measures which are worth
> looking into.
>
> If we consider the problem as a regression problem and the ratings 1-5 are
> considered as 5 classes, it is possible to generate a confusion matrix
> using MultiClassMetrics.scala
>
> If the ratings are only 0/1 (like in the Spotify demo from Spark Summit)
> then it is possible to use the binary classification metrics to come up
> with the ROC curve...
>
> For top-K users/products we should also look into prec@k and ndcg@k as
> metrics.
>
> Does it make sense to add the multiclass metric and prec@k, ndcg@k in
> examples.MovielensALS along with RMSE?
>
> Thanks.
> Deb
