Mahout has several recommenders, so there is no need to build one from 
components. They all make use of the similarity of preferences between users, 
which is why they fall into the category of collaborative filtering.

Primary Mahout Recommenders:
1) Hadoop MapReduce item-based cooccurrence recommender. Creates all recs for 
all users. Uses "Mahout IDs".
2) ALS-WR on Hadoop MapReduce. Uses matrix factorization to reduce noise in the 
data. Sometimes better for small data sets than #1. Uses "Mahout IDs".
3) Mahout + search engine: cooccurrence type. Extremely flexible, works with 
multiple actions (multi-modal), works for new users that have some history, and 
has a scalable server (from the search engine), but is more difficult to 
integrate than #1 or #2. Uses your own IDs and reads CSV files.
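
For #1 and #2 the text IDs have to be translated to numeric IDs before the job 
runs and translated back afterwards. A minimal sketch of that step in Python, 
assuming "Mahout IDs" here means consecutive non-negative integers; the sample 
rows and the names build_id_map, purchases, and item_names are made up for 
illustration and are not part of Mahout:

```python
def build_id_map(values):
    """Assign each distinct external ID a consecutive integer, in first-seen order."""
    mapping = {}
    for value in values:
        # setdefault leaves an existing entry untouched, so repeats keep their ID
        mapping.setdefault(value, len(mapping))
    return mapping


# Hypothetical (customer id, item id) pairs taken from the CSV
purchases = [
    ("1001", "widget-a"),
    ("1002", "widget-b"),
    ("1001", "widget-b"),
]

user_ids = build_id_map(user for user, _ in purchases)
item_ids = build_id_map(item for _, item in purchases)

# Numeric pairs that could be written out as the recommender's input
encoded = [(user_ids[user], item_ids[item]) for user, item in purchases]

# Reverse lookup to turn recommended numeric IDs back into your own item IDs
item_names = {number: name for name, number in item_ids.items()}
```

Keep both maps around, since the recs come back in numeric form and need the 
reverse lookup before they mean anything to your application.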

The rest of the data seems to apply either to the user or the item and so would 
be used in different ways. #1 and #2 can only use user id and item id, but some 
post-recommendation weighting or filtering can be applied. #3 can use multiple 
attributes in different ways. For instance, if category is an item attribute you 
can create two actions, user-pref-for-an-item and user-pref-for-a-category. 
Assuming you want to recommend an item (not a category), you can create a 
cross-cooccurrence indicator for the second action and use the data to make 
your item recs better. #3 is the only method that supports this.
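
To make the two actions concrete, here is a rough data prep sketch in Python 
that derives both from one CSV. The column names, the sample rows, and the 
split_actions helper are assumptions for illustration only; the exact file 
layout the recommender expects is not shown here:

```python
import csv
import io

# Hypothetical sample with only the columns needed for the two actions
SAMPLE_CSV = """customer_id,item_id,item_category
1001,widget-a,tools
1001,widget-b,tools
1002,gadget-c,toys
"""


def split_actions(csv_text):
    """Return (user, item) pairs and de-duplicated (user, category) pairs."""
    item_prefs = []
    category_prefs = []
    seen = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        user = row["customer_id"]
        # Primary action: user-pref-for-an-item
        item_prefs.append((user, row["item_id"]))
        # Secondary action: user-pref-for-a-category, recorded once per pair
        pair = (user, row["item_category"])
        if pair not in seen:
            seen.add(pair)
            category_prefs.append(pair)
    return item_prefs, category_prefs


item_prefs, category_prefs = split_actions(SAMPLE_CSV)
```

The first list is the action you want to recommend from; the second is the 
input for the cross-cooccurrence indicator.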

Pick a recommender and we can help more with data prep.


On Nov 26, 2014, at 1:34 PM, Yash Patel <[email protected]> wrote:

Hello everyone,

wow i am quite happy to see so many inputs from people.

I apologize for not providing more details.

Although this is not my complete dataset the fields i have chosen to use
are:

customer id - numeric
item id - text
postal code - text
item category - text
potential growth - text
territory - text


Basically i was thinking of finding similar users and recommending them
items that users like them have bought but they haven't.

Although i would very much like to hear your opinions as i am not so
familiar with clustering,classifiers etc.

I found that mahout takes sequence files converted into vectors but i
couldn't understand how would i do it on my data specifically and more
importantly make a recommender system out of it.

Also i am wondering how to combine the importance of a specific customer
through the potential growth attribute.






Best Regards,
Yash Patel

On Wed, Nov 26, 2014 at 9:03 PM, Pat Ferrel <[email protected]> wrote:

> All very good points but note that spark-itemsimilarity may take the input
> directly since you specify column numbers for <UID><ITEMID><PREF_VALUE>
> 
> On Nov 26, 2014, at 11:43 AM, parnab kumar <[email protected]> wrote:
> 
> Kindly elaborate on your requirements, your dataset fields, and what
> you want to recommend to a user. Usually a set of items is recommended to
> a user. In your case, what are your items?
> 
> The standard input is <UID><ITEMID><PREF_VALUE>. Clearly your data is not
> in this format, which is what would let you use the algorithms in Mahout
> directly.
> 
> A little more info from your side will help us to give you the right
> pointers.
> 
> On Wed, Nov 26, 2014 at 7:16 PM, Yash Patel <[email protected]>
> wrote:
> 
>> Dear Mahout Team,
>> 
>> I am a student new to machine learning and i am trying to build a user
>> based recommender using mahout.
>> 
>> My dataset is a csv file as an input but it has many fields as text and i
>> understand mahout needs numeric values.
>> 
>> Can you give me a head start as to where i should start and what kind of
>> tools i need to parse the text columns?
>> 
>> Also an idea on which classifiers or clustering methods i should use would
>> be highly appreciated.
>> 
>> 
>> Best Regards;
>> Yash Patel
>> 
> 
> 
