implementation of context-aware recommender in Mahout
Hi all,

I am trying to implement a context-aware recommender in Mahout. As I haven't used the library before, I don't have much experience, so I would really appreciate your responses!

What I want to do is implement the two context-aware approaches that have been proposed: pre-filtering and post-filtering. The former filters the dataset based on the value of a contextual factor before the collaborative filtering step, while the latter rescores the recommendations after collaborative filtering. I have already read older, similar questions regarding context-aware recommender implementations in Mahout, and I know that the post-filtering method can be implemented using the IDRescorer. For the pre-filtering approach there is the option to use a CandidateItemsStrategy in the case of the item-based recommender; for the user-based recommender, however, no such option is available.

In order to implement pre-filtering with the user-based recommender, I was thinking of filtering out the unrelated (user, item) pairs from the dataset before the creation of the data model. This means that the data model will take as input a subset of the initial dataset. Does this approach sound correct? I also have some concerns regarding the evaluation of the recommender: does this have any impact on it?

Thank you in advance!

Regards,
Efi
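For what it's worth, the pre-filtering idea described above can be sketched language-agnostically. The snippet below is plain Python, not Mahout API; the records, the context field, and the "weekend" value are all made up for illustration. It just filters raw preference tuples on a contextual factor before they would be fed into a data model:

```python
# Pre-filtering sketch: keep only preferences whose context matches
# the target context, then use the subset as the data-model input.
# All names and values here are illustrative, not Mahout classes.

raw_prefs = [
    # (user_id, item_id, rating, context)
    (1, 101, 4.0, "weekend"),
    (1, 102, 3.0, "weekday"),
    (2, 101, 5.0, "weekend"),
    (2, 103, 2.0, "weekday"),
]

def pre_filter(prefs, context_value):
    """Drop all (user, item) pairs whose context differs from the target."""
    return [(u, i, r) for (u, i, r, c) in prefs if c == context_value]

subset = pre_filter(raw_prefs, "weekend")
# In Mahout, these surviving tuples would then be written to a file and
# loaded with a FileDataModel, or wrapped in a GenericDataModel.
```

The point of the sketch is that the filtering happens strictly before model construction, which is exactly why evaluation is affected: held-out data must be drawn from the same context-filtered subset, or the evaluator will ask the model about preferences it was never allowed to see.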
spark-itemsimilarity question: what's the difference between indicator-matrix and cross-indicator-matrix
May I say that the indicator-matrix is for the main action, for example purchase, and the cross-indicator-matrix is for the secondary action?

Thanks a lot,
Kevin
Re: spark-itemsimilarity question: what's the difference between indicator-matrix and cross-indicator-matrix
The terms main and secondary are a bit confusing. The easiest definition is that cooccurrence analyzes the record of the actions you want to recommend, while cross-occurrence tries to transfer from one behavior to another.

In practice, it has been common to conflate many behaviors into one precisely because cross-occurrence analysis was not feasible. Now that it is available, standard practice is moving toward retaining the distinction where possible.

Sent from my iPhone

> On Mar 6, 2015, at 11:08, Kevin Zhang wrote:
>
> May I say indicator-matrix is for the main action for example purchase [...]
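To make the distinction concrete, here is a small self-contained Python sketch (not Mahout code; the tiny matrices are made up). Cooccurrence is AᵀA over the action you want to recommend; cross-occurrence is AᵀB between that action and a second behavior:

```python
# Cooccurrence vs. cross-occurrence on toy binary user-action matrices.
# Rows = users, columns = items; all data here is illustrative.

A = [  # primary action, e.g. "purchase"
    [1, 0, 1],
    [0, 1, 1],
]
B = [  # secondary action, e.g. "detail-view"
    [1, 1, 0],
    [1, 0, 1],
]

def transpose_times(X, Y):
    """Compute X^T * Y for row-major matrices of 0/1 interactions."""
    n_items_x, n_items_y = len(X[0]), len(Y[0])
    return [[sum(X[u][i] * Y[u][j] for u in range(len(X)))
             for j in range(n_items_y)]
            for i in range(n_items_x)]

cooccurrence = transpose_times(A, A)      # A^T A: purchase-with-purchase
cross_occurrence = transpose_times(A, B)  # A^T B: purchase-with-detail-view
```

In Mahout the raw counts would then be downsampled and scored with LLR to keep only anomalously strong correlations; the sketch stops at the counting step to show where the two matrices differ.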
Re: spark-itemsimilarity question: what's the difference between indicator-matrix and cross-indicator-matrix
Yes, you have it right. The user’s history of the primary action (purchase) is used as a query against the indicator-matrix, and the user’s history of the secondary action (detail-view, for instance) is used against the “cross-indicator”. But the terminology is being changed to reflect what Ted is saying:

1) The new (current master) names for the outputs are “similarity-matrix” and “cross-similarity-matrix”, which are LLR-scored cooccurrence and cross-cooccurrence. A “cross-indicator” is not really a thing and is a confusing name.

2) The secondary actions may be many. The CLI job only supports 1 primary and 1 secondary action, but you can run it in pairs with 1 primary and many secondaries. The internal code can also calculate correlation between the action you want to recommend and many other actions, all of which create indicators that you query with the corresponding history.

On Mar 6, 2015, at 3:08 PM, Ted Dunning wrote:

> The terms main and secondary are a bit confusing. [...]
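The querying step described above can also be sketched in a few lines of plain Python (again illustrative toy data, not Mahout code; in practice the matrices live in a search engine and the histories become a text query). Each matrix is queried with the history of its own action:

```python
# Sketch: score items by the user's primary-action history against the
# similarity matrix and the secondary-action history against the
# cross-similarity matrix. All scores below are made-up toy values.

similarity = [        # item-item scores from the primary action
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.5],
    [0.9, 0.5, 0.0],
]
cross_similarity = [  # item-item scores linking secondary to primary
    [0.0, 0.7, 0.2],
    [0.3, 0.0, 0.4],
    [0.1, 0.6, 0.0],
]

def recommend(primary_history, secondary_history, k=2):
    """Sum matrix rows for the items in each history, rank unseen items."""
    n = len(similarity)
    scores = [0.0] * n
    for i in primary_history:
        for j in range(n):
            scores[j] += similarity[i][j]
    for i in secondary_history:
        for j in range(n):
            scores[j] += cross_similarity[i][j]
    seen = set(primary_history)
    ranked = sorted((j for j in range(n) if j not in seen),
                    key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

Note that each history only ever touches its own matrix: mixing them up (querying the cross-similarity matrix with purchase history) is exactly the confusion the renaming is meant to prevent.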
Re: implementation of context-aware recommender in Mahout
The new Spark-based recommender can easily handle context in many forms. See the top references section here:

http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html

It does not use the IDRescorer approach at all, so perhaps you should describe what you want to use as context.

In the demo site for the new stuff (a guide to online video), https://guide.finderbots.com, you’ll see a couple of examples of “context”. For instance, when you are viewing a video that has several genre tags, you’ll see at least 3 lists of recommendations:

1) people who like the video you are looking at also like these other videos—non-personalized recs
2) people who like this video liked these, from similar genres
3) personalized recs from all genres based on your “liking” history

Many other things can be used as context, like time of day, location, mobile or desktop, user profile attributes, etc. The way it does this is through the search engine, which can take filters and boost certain item attributes. So I could show only recommendations made in the same year as the viewed movie, or use the year to bias recommendations by boosting the “release-date” field in the recommender query. The recommender is also multimodal, and so can use many user actions to better the quality of recs.

Removing some of your data, in what you call pre-filtering, may not get you what you want. Removing data that is actual user behavior can reduce the quality of recommendations, so please give an example.

On Mar 6, 2015, at 4:45 AM, Efi Koulouri wrote:

> Hi all, I am trying to implement a context-aware recommender in Mahout. [...]
Random Forest on old mapred API
Hi All:

For some reasons we need to re-implement the Random Forest in Mahout on the old MapRed API in order to run it on our Hadoop deployment. We know that the old MapRed API is different from the new MapReduce API; could you please give me some hints on how to do this?

Many thanks.

Best,
Wei