All very good points, but note that spark-itemsimilarity may take the input directly, since you specify the column numbers for <UID><ITEMID><PREF_VALUE>.
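
If you do end up needing the numeric <UID><ITEMID><PREF_VALUE> form, here is a rough sketch in Python of one way to map text user and item fields to integer IDs. Everything in it is an assumption for illustration: the helper name to_triples, the column positions, and the three sample rows are made up and are not taken from your dataset or from Mahout.

```python
import csv
import io


def to_triples(rows, user_col, item_col, pref_col=None):
    """Map text user/item fields to integer IDs.

    Returns (triples, user_ids, item_ids), where triples is a list of
    (uid, itemid, pref) tuples and the two dicts record the text -> ID mapping.
    """
    user_ids, item_ids = {}, {}
    triples = []
    for row in rows:
        # Assign the next unused integer the first time a value is seen.
        uid = user_ids.setdefault(row[user_col], len(user_ids))
        iid = item_ids.setdefault(row[item_col], len(item_ids))
        # With no preference column, treat every row as an implicit 1.0.
        pref = float(row[pref_col]) if pref_col is not None else 1.0
        triples.append((uid, iid, pref))
    return triples, user_ids, item_ids


# Hypothetical CSV: user name, date, item name, rating.
sample = (
    "alice,2014-11-20,book-17,4\n"
    "bob,2014-11-21,book-17,5\n"
    "alice,2014-11-22,dvd-3,2\n"
)
rows = list(csv.reader(io.StringIO(sample)))
triples, users, items = to_triples(rows, user_col=0, item_col=2, pref_col=3)

# Write the triples back out as comma-separated lines.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(triples)
converted = out.getvalue()
```

Keep the users and items dictionaries around, since you need them to translate recommended item IDs back to the original text values.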

On Nov 26, 2014, at 11:43 AM, parnab kumar <[email protected]> wrote:

Kindly elaborate on your requirements, your dataset fields, and what you want to recommend to a user. Usually a set of items is recommended to a user. In your case, what are your items? The standard input is <UID><ITEMID><PREF_VALUE>. Clearly your data is not in this format, which would let you use the algorithms in Mahout directly. A little more info from your side will help us give you the right pointers.

On Wed, Nov 26, 2014 at 7:16 PM, Yash Patel <[email protected]> wrote:

> Dear Mahout Team,
>
> I am a student new to machine learning and I am trying to build a user
> based recommender using Mahout.
>
> My dataset is a CSV file as input, but it has many fields as text, and I
> understand Mahout needs numeric values.
>
> Can you give me a head start as to where I should begin and what kind of
> tools I need to parse the text columns?
>
> Also, an idea of which classifiers or clustering methods I should use would
> be highly appreciated.
>
>
> Best Regards,
> Yash Patel
