Is there an example of how to use RankingMetrics? Take the user/document example: the model gives us user x topic and document x topic matrices.
Now, for each user, we can generate the top-K documents by computing (1 x topic) dot (topic x document), sorting the scores, and picking the top K. Is it possible to validate such a top-K algorithm using RankingMetrics?

On Wed, Oct 29, 2014 at 12:14 PM, Xiangrui Meng <men...@gmail.com> wrote:
> Let's narrow the context from matrix factorization to recommendation
> via ALS. It adds extra complexity if we treat it as a multi-class
> classification problem. ALS only outputs a single value for each
> prediction, which is hard to convert to a probability distribution over
> the 5 rating levels. Treating it as a binary classification problem or
> a ranking problem does make sense. RankingMetrics is in master.
> Feel free to add prec@k and ndcg@k to examples.MovieLensALS. ROC
> should be good to add as well. -Xiangrui
>
>
> On Wed, Oct 29, 2014 at 11:23 AM, Debasish Das <debasish.da...@gmail.com>
> wrote:
> > Hi,
> >
> > In the current factorization flow, we cross validate on the test dataset
> > using the RMSE number, but there are some other measures which are worth
> > looking into.
> >
> > If we consider the problem as a classification problem and the ratings 1-5
> > are considered as 5 classes, it is possible to generate a confusion matrix
> > using MulticlassMetrics.scala
> >
> > If the ratings are only 0/1 (like in the Spotify demo from Spark Summit)
> > then it is possible to use BinaryClassificationMetrics to come up with
> > the ROC curve...
> >
> > For topK user/products we should also look into prec@k and ndcg@k as the
> > metric..
> >
> > Does it make sense to add the multiclass metric and prec@k, ndcg@k in
> > examples.MovieLensALS along with RMSE?
> >
> > Thanks.
> > Deb
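
To make the top-K question at the top of this message concrete, here is a rough sketch in plain Python (not the Spark API). It ranks documents for each user by the dot product of the factor vectors and then scores the ranking against a held-out set of relevant documents using prec@k and a binary-relevance ndcg@k. The toy factors, the held-out sets, and the helper names (top_k_documents, precision_at_k, ndcg_at_k) are all made up for illustration, and the ndcg@k here is one common definition that may differ in details from what RankingMetrics computes.

```python
import math

def top_k_documents(user_vec, doc_factors, k):
    """Score each document as dot(user_vec, doc_vec) and return the k best ids."""
    scores = {
        doc_id: sum(u * d for u, d in zip(user_vec, doc_vec))
        for doc_id, doc_vec in doc_factors.items()
    }
    # Descending score; ties broken by document id so the order is deterministic.
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ranked[:k]

def precision_at_k(predicted, relevant, k):
    """Fraction of the first k predicted documents that are relevant."""
    hits = sum(1 for doc_id in predicted[:k] if doc_id in relevant)
    return hits / k

def ndcg_at_k(predicted, relevant, k):
    """Binary-relevance NDCG: a hit at 0-based rank i contributes 1 / log2(i + 2)."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, doc_id in enumerate(predicted[:k])
        if doc_id in relevant
    )
    # Ideal ranking places all relevant documents first.
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

# Hypothetical toy model: 2 topics, 2 users, 4 documents.
user_factors = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
doc_factors = {
    "d1": [0.9, 0.1],
    "d2": [0.2, 0.8],
    "d3": [0.5, 0.5],
    "d4": [0.1, 0.3],
}
# Hypothetical held-out ground truth: documents each user actually liked.
held_out = {"u1": {"d1", "d4"}, "u2": {"d2", "d3"}}

k = 2
rankings = {u: top_k_documents(vec, doc_factors, k) for u, vec in user_factors.items()}
mean_precision = sum(
    precision_at_k(rankings[u], held_out[u], k) for u in rankings
) / len(rankings)
mean_ndcg = sum(
    ndcg_at_k(rankings[u], held_out[u], k) for u in rankings
) / len(rankings)
print(rankings, mean_precision, mean_ndcg)
```

In Spark itself, the per-user (predicted ranking, ground truth) pairs would be handed to RankingMetrics as an RDD of array pairs, and precisionAt(k), ndcgAt(k), and meanAveragePrecision would be read off the result instead of computing them by hand as above.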