[ https://issues.apache.org/jira/browse/FLINK-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Domokos Miklós Kelen updated FLINK-4712:
----------------------------------------
    Description: 
We started working on implementing ranking predictions for recommender systems. Ranking prediction means that besides predicting scores for user-item pairs, the recommender system is able to recommend a top-K list for each user.
Details:
In practice, this would mean finding the K items with the highest predicted rating for a particular user. It should also be possible to specify whether to exclude already seen items from a particular user's top-K list. (See for example the 'exclude_known' setting of [Graphlab Create's ranking factorization recommender|https://turi.com/products/create/docs/generated/graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend.html#graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend] ).
The output of the topK recommendation function could be in the form of {{DataSet[(Int,Int,Int)]}}, meaning (user, item, rank), similar to Graphlab Create's output. However, this is arguable: follow-up work includes implementing ranking recommendation evaluation metrics (such as precision@k, recall@k, ndcg@k), similar to [Spark's implementations|https://spark.apache.org/docs/1.5.0/mllib-evaluation-metrics.html#ranking-systems].
It would be beneficial if we were able to design the API such that it could be included in the proposed evaluation framework (see [FLINK-2157|https://issues.apache.org/jira/browse/FLINK-2157]), which makes it necessary to consider the possible output type {{DataSet[(Int, Array[Int])]}} or {{DataSet[(Int, Array[(Int,Double)])]}}, meaning (user, array of items), possibly including the predicted scores as well. See [issue todo] for details.
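To make the two candidate output shapes concrete, here is a minimal sketch in plain Scala. It is an illustration only, not the proposed Flink API: it uses in-memory collections instead of {{DataSet}}, and the names {{TopKSketch}}, {{topKPerUser}}, {{toRankTriples}}, {{seen}} and {{excludeKnown}} are hypothetical. It assumes the predicted scores are already available as (user, item, score) triples.

```scala
object TopKSketch {
  // (user, item, predicted score). Assumption: in the real feature these
  // predictions would come from the trained ALS model.
  type Prediction = (Int, Int, Double)

  /** Top-k items per user as (user -> Array[(item, score)]), best first. */
  def topKPerUser(
      predictions: Seq[Prediction],
      seen: Set[(Int, Int)],
      k: Int,
      excludeKnown: Boolean): Map[Int, Array[(Int, Double)]] =
    predictions
      // Optionally drop (user, item) pairs the user has already interacted with.
      .filter { case (user, item, _) => !(excludeKnown && seen.contains((user, item))) }
      .groupBy(_._1)
      .map { case (user, ps) =>
        // Sort by descending score; ties are broken by ascending item id
        // so that the result is deterministic.
        val ranked = ps
          .map(p => (p._2, p._3))
          .sortWith((a, b) => a._2 > b._2 || (a._2 == b._2 && a._1 < b._1))
        user -> ranked.take(k).toArray
      }

  /** Flatten the per-user arrays into (user, item, rank) triples, ranks starting at 1. */
  def toRankTriples(topK: Map[Int, Array[(Int, Double)]]): Seq[(Int, Int, Int)] =
    topK.toSeq.sortBy(_._1).flatMap { case (user, items) =>
      items.toSeq.zipWithIndex.map { case ((item, _), idx) => (user, item, idx + 1) }
    }
}
```

In this sketch the (user, array of items) form is the primary result and the (user, item, rank) form is derived from it, which suggests the array form loses no information if the evaluation framework needs it.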
Another question arising is whether to provide this function as a member of the ALS class, as a switch-like parameter of the ALS implementation (meaning the model is either a rating or a ranking recommender model), or in some other way.

  was:
We started working on implementing ranking predictions for recommender systems. Ranking prediction means that besides predicting scores for user-item pairs, the recommender system is able to recommend a top-K list for each user.
Details:
In practice, this would mean finding the K items with the highest predicted rating for a particular user. It should also be possible to specify whether to exclude already seen items from a particular user's top-K list. (See for example the 'exclude_known' setting of [Graphlab Create's ranking factorization recommender|https://turi.com/products/create/docs/generated/graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend.html#graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend].
The output of the topK recommendation function could be in the form of {{DataSet[(Int,Int,Int)]}}, meaning (user, item, rank), similar to Graphlab Create's output. However, this is arguable: follow-up work includes implementing ranking recommendation evaluation metrics (such as precision@k, recall@k, ndcg@k), similar to [Spark's implementations|https://spark.apache.org/docs/1.5.0/mllib-evaluation-metrics.html#ranking-systems].
It would be beneficial if we were able to design the API such that it could be included in the proposed evaluation framework (see [FLINK-2157|https://issues.apache.org/jira/browse/FLINK-2157]), which makes it necessary to consider the possible output type {{DataSet[(Int, Array[Int])]}} or {{DataSet[(Int, Array[(Int,Double)])]}}, meaning (user, array of items), possibly including the predicted scores as well. See [issue todo] for details.
Another question arising is whether to provide this function as a member of the ALS class, as a switch-like parameter of the ALS implementation (meaning the model is either a rating or a ranking recommender model), or in some other way.

> Implementing ranking predictions for ALS
> ----------------------------------------
>
>                 Key: FLINK-4712
>                 URL: https://issues.apache.org/jira/browse/FLINK-4712
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Domokos Miklós Kelen
>
> We started working on implementing ranking predictions for recommender
> systems. Ranking prediction means that besides predicting scores for user-item
> pairs, the recommender system is able to recommend a top-K list for each user.
> Details:
> In practice, this would mean finding the K items with the highest predicted
> rating for a particular user. It should also be possible to specify whether
> to exclude already seen items from a particular user's top-K list. (See for
> example the 'exclude_known' setting of [Graphlab Create's ranking
> factorization
> recommender|https://turi.com/products/create/docs/generated/graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend.html#graphlab.recommender.ranking_factorization_recommender.RankingFactorizationRecommender.recommend]
> ).
> The output of the topK recommendation function could be in the form of
> {{DataSet[(Int,Int,Int)]}}, meaning (user, item, rank), similar to Graphlab
> Create's output. However, this is arguable: follow-up work includes
> implementing ranking recommendation evaluation metrics (such as precision@k,
> recall@k, ndcg@k), similar to [Spark's
> implementations|https://spark.apache.org/docs/1.5.0/mllib-evaluation-metrics.html#ranking-systems].
> It would be beneficial if we were able to design the API such that it could
> be included in the proposed evaluation framework (see
> [FLINK-2157|https://issues.apache.org/jira/browse/FLINK-2157]), which makes it
> necessary to consider the possible output type {{DataSet[(Int,
> Array[Int])]}} or {{DataSet[(Int, Array[(Int,Double)])]}}, meaning (user,
> array of items), possibly including the predicted scores as well. See [issue
> todo] for details.
> Another question arising is whether to provide this function as a member of
> the ALS class, as a switch-like parameter of the ALS implementation
> (meaning the model is either a rating or a ranking recommender model), or in
> some other way.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)