Thank you very much for your opinion:)

In our case, it may be dangerous to treat unobserved items as negative 
interactions (although we could give them a small confidence, I think they 
are still not credible...)

I will do more experiments and give you feedback:)

Thank you;)


> On Feb 26, 2015, at 23:16, Sean Owen <so...@cloudera.com> wrote:
> 
> I believe that's right, and is what I was getting at. yes the implicit
> formulation ends up implicitly including every possible interaction in
> its loss function, even unobserved ones. That could be the difference.
> 
> This is mostly an academic question though. In practice, you have
> click-like data and should be using the implicit version for sure.
> 
> However you can give negative implicit feedback to the model. You
> could consider no-click as a mild, observed, negative interaction.
> That is: supply a small negative value for these cases. Unobserved
> pairs are not part of the data set. I'd be careful about assuming the
> lack of an action carries signal.
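Sean's mapping of the three cases can be sketched as follows. This is a minimal, hand-rolled illustration of the Hu et al. (2008) preference/confidence transform, not MLlib code; the alpha value and the -0.25 no-click score are arbitrary placeholders, not recommendations:

```python
# Sketch of the implicit-feedback transform (Hu, Koren & Volinsky, 2008):
#   preference p_ui = 1 if r_ui > 0 else 0
#   confidence c_ui = 1 + alpha * |r_ui|
# ALPHA and the -0.25 no-click score below are illustrative guesses.
ALPHA = 40.0

def preference_confidence(r, alpha=ALPHA):
    p = 1.0 if r > 0 else 0.0
    c = 1.0 + alpha * abs(r)
    return p, c

click = preference_confidence(1.0)       # displayed and clicked
no_click = preference_confidence(-0.25)  # displayed, not clicked
# Unobserved (never displayed) pairs are simply left out of the input RDD.
print(click, no_click)
```

The point is that a mild negative value leaves the preference at 0 but raises the confidence slightly above 1, distinguishing "shown but not clicked" from "never shown".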
> 
>> On Thu, Feb 26, 2015 at 3:07 PM, 163 <lisend...@163.com> wrote:
>> oh my god, I think I understood...
>> In my case, there are three kinds of user-item pairs:
>> 
>> Display and click pair(positive pair)
>> Display but no-click pair(negative pair)
>> No-display pair(unobserved pair)
>> 
>> Explicit ALS only considers the first and second kinds.
>> But implicit ALS considers all three kinds of pairs (and treats the third
>> kind the same as the second, because their preference values are all zero
>> and their confidences are all 1).
>> 
>> So the results are different, right?
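A toy numeric check of that reading (a hand-rolled sketch with arbitrary factor values, not MLlib code): with confidence 1 everywhere, the implicit loss sums over all user-item cells, while the explicit loss sums only over the supplied pairs.

```python
# Toy check: 2 users x 3 items. r[u][i] = 1.0 (click), 0.0 (shown, no
# click), None for a pair that was never displayed (unobserved).
r = [[1.0, 0.0, None],
     [None, 1.0, 0.0]]

# Hypothetical fixed rank-1 factors, just to evaluate the two losses.
x = [0.5, 0.8]        # user factors
y = [0.9, 0.4, 0.7]   # item factors
pred = [[xu * yi for yi in y] for xu in x]

# Explicit squared loss: only pairs actually present in the data set.
explicit_loss = sum((r[u][i] - pred[u][i]) ** 2
                    for u in range(2) for i in range(3)
                    if r[u][i] is not None)

# Implicit loss with alpha = 0: confidence 1 everywhere; preference is
# 1 for clicks, 0 otherwise -- so unobserved cells contribute terms too.
pref = [[1.0 if (r[u][i] or 0) > 0 else 0.0 for i in range(3)]
        for u in range(2)]
implicit_loss = sum((pref[u][i] - pred[u][i]) ** 2
                    for u in range(2) for i in range(3))

print(explicit_loss, implicit_loss)
```

The implicit loss is strictly larger here, by exactly the squared predictions at the two unobserved cells, which is the penalty on unobserved pairs being discussed.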
>> 
>> Could you please give me some advice on which ALS I should use?
>> If I use implicit ALS, how do I distinguish the second and third kinds
>> of pairs? :)
>> 
>> My opinion is that in my case, I should use explicit ALS...
>> 
>> Thank you so much
>> 
>> On Feb 26, 2015, at 22:41, Xiangrui Meng <m...@databricks.com> wrote:
>> 
>> Lisen, did you use all m-by-n pairs during training? Implicit model
>> penalizes unobserved ratings, while explicit model doesn't. -Xiangrui
>> 
>>> On Feb 26, 2015 6:26 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>> 
>>> +user
>>> 
>>>> On Thu, Feb 26, 2015 at 2:26 PM, Sean Owen <so...@cloudera.com> wrote:
>>>> 
>>>> I think I may have it backwards, and that you are correct to keep the 0
>>>> elements in train() in order to try to reproduce the same result.
>>>> 
>>>> The second formulation is called 'weighted regularization' and is used
>>>> for both implicit and explicit feedback, as far as I can see in the code.
>>>> 
>>>> Hm, I'm actually not clear why these would produce different results.
>>>> Different code paths are used to be sure, but I'm not yet sure why they
>>>> would give different results.
>>>> 
>>>> In general you wouldn't use train() for data like this though, and would
>>>> never set alpha=0.
>>>> 
>>>>> On Thu, Feb 26, 2015 at 2:15 PM, lisendong <lisend...@163.com> wrote:
>>>>> 
>>>>> I want to confirm the loss function you use (sorry, I'm not so familiar
>>>>> with Scala, so I did not fully understand the MLlib source code)
>>>>> 
>>>>> According to the papers :
>>>>> 
>>>>> 
>>>>> in your implicit feedback ALS, the loss function is (ICDM 2008):
>>>>> 
>>>>> in the explicit feedback ALS, the loss function is (Netflix 2008):
>>>>> 
>>>>> note that besides the difference in the confidence parameter c_ui, the
>>>>> regularization is also different. Does your code also have this
>>>>> difference?
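The formula images did not survive in this archive. From memory of those two papers, the losses being compared are, up to notation (treat this reconstruction as a paraphrase, not a quote):

```latex
% Implicit ALS (Hu, Koren & Volinsky, ICDM 2008), with
% c_{ui} = 1 + \alpha r_{ui} and p_{ui} = [r_{ui} > 0]:
\min_{X,Y} \sum_{u,i} c_{ui}\left(p_{ui} - x_u^{\top} y_i\right)^2
  + \lambda\left(\sum_u \lVert x_u\rVert^2 + \sum_i \lVert y_i\rVert^2\right)

% Explicit ALS with weighted-\lambda regularization (Zhou et al., 2008),
% where n_u, n_i count the observed ratings for user u / item i and
% \mathcal{O} is the set of observed pairs:
\min_{X,Y} \sum_{(u,i)\in\mathcal{O}} \left(r_{ui} - x_u^{\top} y_i\right)^2
  + \lambda\left(\sum_u n_u \lVert x_u\rVert^2
  + \sum_i n_i \lVert y_i\rVert^2\right)
```

The implicit sum runs over all user-item pairs, and the explicit regularizer weights each factor by its rating count, which are the two differences being asked about.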
>>>>> 
>>>>> Best Regards,
>>>>> Sendong Li
>>>>> 
>>>>> 
>>>>>> On Feb 26, 2015, at 9:42 PM, lisendong <lisend...@163.com> wrote:
>>>>>> 
>>>>>> Hi meng, fotero, sowen:
>>>>>> 
>>>>>> I’m using ALS with spark 1.0.0, the code should be:
>>>>>> 
>>>>>> https://github.com/apache/spark/blob/branch-1.0/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
>>>>>> 
>>>>>> I think the following two methods should produce the same (or nearly
>>>>>> the same) result:
>>>>>> 
>>>>>> // rank=30, iterations=30, lambda=0.01, blocks=-1 (auto), seed=1
>>>>>> MatrixFactorizationModel model = ALS.train(ratings.rdd(), 30, 30, 0.01,
>>>>>> -1, 1);
>>>>>> 
>>>>>> // same settings, with alpha=0 inserted before the seed
>>>>>> MatrixFactorizationModel model = ALS.trainImplicit(ratings.rdd(), 30,
>>>>>> 30, 0.01, -1, 0, 1);
>>>>>> 
>>>>>> the data I used is a display log, in the following format:
>>>>>> 
>>>>>> user  item  if-click
>>>>>> 
>>>>>> I use 1.0 as the score for click pairs, and 0 as the score for
>>>>>> non-click pairs.
>>>>>> 
>>>>>> in the second method, alpha is set to zero, so the confidences for
>>>>>> positive and negative pairs are both 1.0 (right?)
>>>>>> 
>>>>>> I think the two methods should produce similar results, but they do
>>>>>> not: the second method's result is very bad (the AUC of the first is
>>>>>> 0.7, but the AUC of the second is only 0.61)
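For reference, the AUC being quoted can be computed with a simple rank comparison. A minimal hand-rolled sketch (not the evaluation code actually used here):

```python
# AUC = probability that a uniformly random positive (clicked) example
# scores higher than a uniformly random negative one; ties count half.
def auc(scores, labels):
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

An AUC of 0.5 means the ranking is no better than random, so a drop from 0.7 to 0.61 is a substantial loss of ranking quality.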
>>>>>> 
>>>>>> 
>>>>>> I cannot understand why; could you help me?
>>>>>> 
>>>>>> 
>>>>>> Thank you very much!
>>>>>> 
>>>>>> Best Regards,
>>>>>> Sendong Li
>>>>> 
>>>>> 
>>>> 
>>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
