Now I got it, thank you so much again.

But is it possible to encode it like that in Mahout? As far as I
understand, I can only use one field for the ratings. In that case, I
would have to run, for example, the user-based recommender twice: once with
the data represented as (like 1, dislike 0, no rating 0) and once more
with this representation (like 1, dislike 1, no rating 0).
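
A minimal sketch of the two-run idea, in plain Python rather than Mahout's API: it splits the like/dislike/no-rating rows into two boolean-preference sets, one per question ("liked?" and "rated?"). The names `ratings`, `to_boolean_sets` and `to_csv_lines` are hypothetical, and the rows are the example data quoted below.

```python
# Hypothetical input: (customer_id, product_id, rating) rows from the thread's example.
ratings = [
    (1, 1, "No Rating"), (2, 1, "Like"), (3, 2, "Dislike"),
    (4, 1, "Like"), (5, 2, "Like"), (6, 2, "Dislike"),
    (7, 1, "No Rating"), (8, 1, "No Rating"), (9, 2, "Dislike"),
]

def to_boolean_sets(rows):
    """Split categorical ratings into two sets of (user, item) pairs."""
    # "Liked" run: only explicit likes count as a preference.
    liked = {(u, i) for u, i, r in rows if r == "Like"}
    # "Rated" run: any explicit rating (like or dislike) counts.
    rated = {(u, i) for u, i, r in rows if r in ("Like", "Dislike")}
    return liked, rated

def to_csv_lines(pairs):
    """Render pairs as 'userID,itemID' lines, one boolean preference per line."""
    return [f"{u},{i}" for u, i in sorted(pairs)]

liked, rated = to_boolean_sets(ratings)
liked_lines = to_csv_lines(liked)
rated_lines = to_csv_lines(rated)
```

Each list could then be written to its own file and loaded as a separate boolean-preference data set, assuming the loader accepts `userID,itemID` lines with no preference value.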

Thank you,
Regards,
Shady




On Sun, Oct 11, 2015 at 10:50 PM, Angus Macnab <[email protected]>
wrote:

> I just meant that you can encode the single categorical value that you
> have as two separate binary values.  You can accomplish this by asking two
> questions.  First you can ask: did user like the product?
>  (liked=1,dislike=0,no rating=0).  Next you can ask did the user rate the
> product? (liked=1,dislike=1,no rating=0).
>
> If this is your original data:
>
> Customer ID          Product ID           Rating
> 1                           1                        No Rating
> 2                           1                        Like
> 3                           2                        Dislike
> 4                           1                        Like
> 5                           2                        Like
> 6                           2                        Dislike
> 7                           1                        No Rating
> 8                           1                        No Rating
> 9                           2                        Dislike
>
> You could encode it like this:
>
> Customer ID    Product ID    Rating       Liked    Rated
> 1              1             No Rating    0        0
> 2              1             Like         1        1
> 3              2             Dislike      0        1
> 4              1             Like         1        1
> 5              2             Like         1        1
> 6              2             Dislike      0        1
> 7              1             No Rating    0        0
> 8              1             No Rating    0        0
> 9              2             Dislike      0        1
>
> And train on this dataset:
>
> Customer ID    Product ID    Liked    Rated
> 1              1             0        0
> 2              1             1        1
> 3              2             0        1
> 4              1             1        1
> 5              2             1        1
> 6              2             0        1
> 7              1             0        0
> 8              1             0        0
> 9              2             0        1
>
> There is no need to ask a third question (did the user dislike the product?),
> since its answer can be derived linearly from the other two fields
> (disliked = rated - liked), i.e. it is linearly dependent on them.
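
A small sketch of the two-question encoding and of the linear dependence mentioned above, in plain Python and independent of Mahout. The function names `encode` and `disliked` are hypothetical; the mapping follows the rule stated earlier (liked: like=1, dislike=0, no rating=0; rated: like=1, dislike=1, no rating=0).

```python
def encode(rating):
    """Map a categorical rating to the (liked, rated) pair of binary features."""
    liked = 1 if rating == "Like" else 0
    rated = 1 if rating in ("Like", "Dislike") else 0
    return liked, rated

def disliked(liked, rated):
    """Recover the third indicator from the other two: disliked = rated - liked."""
    return rated - liked

# The three categories map to three distinct (liked, rated) pairs.
table = {r: encode(r) for r in ("Like", "Dislike", "No Rating")}
```

Because `disliked` is fully determined by the other two columns, adding it as a third feature would carry no new information.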
>
> Hope this helps to clarify things.
>
> Thanks,
>
> Angus
>
> On Sun, Oct 11, 2015 at 1:24 PM, Shady Hanna <[email protected]>
> wrote:
>
>> Thank you so much Angus for your help.
>>
>> I did not quite get it, so if I have the following data:
>>
>> Customer ID          Product ID           Rating
>> 1                           1                        No Rating (0,1)
>> 2                           1                        Like (1,0)
>> 3                           2                        Dislike (0,0)
>>
>> If what I understood is correct, how can I represent it in Mahout, and
>> would it be a boolean preference data model?
>>
>> Thank you so much again,
>> Best Regards,
>> Shady
>>
>> On Sat, Oct 10, 2015 at 3:18 AM, Angus Macnab <[email protected]>
>> wrote:
>>
>>> Rather than trying to impose ordinality on your data, you can think of "like",
>>> "dislike", "did not rate" as a categorical feature with a cardinality of
>>> three, which can be encoded using two binary features.  All possibilities
>>> are fine, but the most logical is probably: rated=(0,1) and liked=(0,1).
>>>
>>> So you just need to come up with the routine to encode these features.
>>> Hope this helps!
>>>
>>> Best,
>>>
>>> Angus
>>> --------------------------------------
>>> Angus Macnab
>>>
>>> On Fri, Oct 9, 2015 at 3:54 PM, Shady Hanna <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have data in which each entry is one of: like, dislike, or not rated
>>>> by the user. I am not sure how to represent this data for Mahout's User
>>>> Based/Item Based Recommender System, or which user similarity can be
>>>> used for such a dataset.
>>>>
>>>> Would you please advise?
>>>>
>>>> Thank you,
>>>> Best Regards,
>>>> Shady
>>>>
>>>
>>>
>>
>
