The Mahout + search engine recommender seems like the best fit for the data I have.
Kindly get back to me at your earliest convenience.

Best Regards,
Yash Patel

On Thu, Nov 27, 2014 at 9:58 PM, Pat Ferrel <[email protected]> wrote:

> Mahout has several recommenders so no need to create one from components.
> They all make use of the similarity of preferences between users, which is
> why they are in the category of collaborative filtering.
>
> Primary Mahout Recommenders:
> 1) Hadoop mapreduce item-based cooccurrence recommender. Creates all recs
> for all users. Uses "Mahout IDs".
> 2) ALS-WR hadoop mapreduce, uses matrix factorization to reduce noise in
> the data. Sometimes better for small data sets than #1. Uses "Mahout IDs".
> 3) Mahout + search engine: cooccurrence type. Extremely flexible, works
> with multiple actions (multi-modal), works for new users that have some
> history, has a scalable server (from the search engine) but is more
> difficult to integrate than #1 or #2. Uses your own ids and reads csv files.
>
> The rest of the data seems to apply either to the user or the item and so
> would be used in different ways. #1 and #2 can only use user id and item id
> but some post recommendation weighting or filtering can be applied. #3 can
> use multiple attributes in different ways. For instance if category is an
> item attribute you can create two actions, user-pref-for-an-item, and
> user-pref-for-a-category. Assuming you want to recommend an item (not
> category) you can create a cross-cooccurrence indicator for the second
> action and use the data to make your item recs better. #3 is the only
> method that supports this.
>
> Pick a recommender and we can help more with data prep.
>
>
> On Nov 26, 2014, at 1:34 PM, Yash Patel <[email protected]> wrote:
>
> Hello everyone,
>
> Wow, I am quite happy to see so many inputs from people.
>
> I apologize for not providing more details.
>
> Although this is not my complete dataset, the fields I have chosen to use
> are:
>
> customer id - numeric
> item id - text
> postal code - text
> item category - text
> potential growth - text
> territory - text
>
>
> Basically I was thinking of finding similar users and recommending them
> items that users like them have bought but they haven't.
>
> I would very much like to hear your opinions, though, as I am not so
> familiar with clustering, classifiers etc.
>
> I found that Mahout takes sequence files converted into vectors but I
> couldn't understand how I would do it on my data specifically and, more
> importantly, make a recommender system out of it.
>
> Also I am wondering how to factor in the importance of a specific customer
> through the potential growth attribute.
>
>
> Best Regards,
> Yash Patel
>
> On Wed, Nov 26, 2014 at 9:03 PM, Pat Ferrel <[email protected]> wrote:
>
> > All very good points but note that spark-itemsimilarity may take the
> > input directly since you specify column numbers for
> > <UID><ITEMID><PREF_VALUE>
> >
> > On Nov 26, 2014, at 11:43 AM, parnab kumar <[email protected]> wrote:
> >
> > Kindly elaborate... your requirements... your dataset fields... and what
> > you want to recommend to a user... Usually a set of items is recommended
> > to a user. In your case what are your items?
> >
> > The standard input is <UID><ITEMID><PREF_VALUE>. Clearly your data is not
> > in this format, which would let you use the algorithms in Mahout directly.
> >
> > A little more info from your side will help us to give you the right
> > pointers.
> >
> > On Wed, Nov 26, 2014 at 7:16 PM, Yash Patel <[email protected]>
> > wrote:
> >
> >> Dear Mahout Team,
> >>
> >> I am a student new to machine learning and I am trying to build a user
> >> based recommender using Mahout.
> >>
> >> My dataset is a csv file as an input but it has many fields as text and I
> >> understand Mahout needs numeric values.
> >>
> >> Can you give me a head start as to where I should start and what kind of
> >> tools I need to parse the text columns?
> >>
> >> Also an idea on which classifiers or clustering methods I should use
> >> would be highly appreciated.
> >>
> >>
> >> Best Regards,
> >> Yash Patel
> >>
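
For anyone following the thread, the data prep discussed above could be sketched roughly as follows. This is only an illustration in plain Python, not Mahout code: the column names (customer_id, item_id, item_category), the sample rows, and the helper names split_actions and to_numeric_ids are all made up for the example. It splits the CSV into the two actions Pat describes (user-pref-for-an-item and user-pref-for-a-category) and, separately, shows one way to map text ids to integers for the recommenders that need numeric ids.

```python
import csv
import io

# Made-up sample rows using the fields listed in the thread (header names are assumptions).
SAMPLE = """customer_id,item_id,postal_code,item_category,potential_growth,territory
101,A-7,10115,tools,high,north
101,B-2,10115,paint,high,north
102,A-7,20095,tools,low,south
101,A-7,10115,tools,high,north
"""


def split_actions(rows):
    """Return de-duplicated (user, item) and (user, category) pairs in first-seen order."""
    item_prefs, category_prefs = [], []
    seen_items, seen_categories = set(), set()
    for row in rows:
        user = row["customer_id"].strip()
        item_pair = (user, row["item_id"].strip())
        category_pair = (user, row["item_category"].strip())
        if item_pair not in seen_items:
            seen_items.add(item_pair)
            item_prefs.append(item_pair)
        if category_pair not in seen_categories:
            seen_categories.add(category_pair)
            category_prefs.append(category_pair)
    return item_prefs, category_prefs


def to_numeric_ids(values):
    """Assign each distinct text value an integer index in first-seen order."""
    mapping = {}
    for value in values:
        mapping.setdefault(value, len(mapping))
    return mapping


if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(SAMPLE)))
    item_prefs, category_prefs = split_actions(rows)
    item_index = to_numeric_ids(item for _, item in item_prefs)
    # One action per line: user id, item id (as text or as mapped integer).
    for user, item in item_prefs:
        print(f"{user},{item},{item_index[item]}")
    for user, category in category_prefs:
        print(f"{user},{category}")
```

Each list would be written to its own file, one per action. The exact column layout and delimiter the chosen recommender expects should be checked against its documentation; nothing here is specific to a particular Mahout job.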
