Thank you very much for your opinion :) In our case, it may be dangerous to treat unobserved items as negative interactions (even if we give them a small confidence, I think they are still not credible...).
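Here is roughly the experiment I have in mind, following your suggestion of treating no-click as a mild, observed negative. This is only a sketch: the display.log path, the -0.1 score for non-clicks, the local master, and alpha = 40.0 are placeholders I picked for illustration, not tuned choices.

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.recommendation.ALS;
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel;
import org.apache.spark.mllib.recommendation.Rating;

public class ImplicitAlsSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext("local[*]", "implicit-als-sketch");

    // Each line of the display log is "user item if-click". User-item pairs
    // that were never displayed do not appear in the file at all, so they
    // stay out of the training RDD entirely.
    JavaRDD<Rating> ratings = sc.textFile("display.log").map(line -> {
      String[] f = line.split(" ");
      int user = Integer.parseInt(f[0]);
      int item = Integer.parseInt(f[1]);
      // clicked -> positive feedback; displayed but not clicked -> small
      // negative value, i.e. a mild, observed, negative interaction
      return f[2].equals("1") ? new Rating(user, item, 1.0)
                              : new Rating(user, item, -0.1);
    });

    // alpha > 0 so that observed pairs get more confidence than the implicit
    // zeros; 40.0 is just the value mentioned in the ICDM 2008 paper.
    MatrixFactorizationModel model =
        ALS.trainImplicit(ratings.rdd(), 30, 30, 0.01, 40.0);

    System.out.println("users factorized: " + model.userFeatures().count());
    sc.stop();
  }
}

If I read the 1.0 code correctly, a negative score is treated as preference 0 with confidence 1 + alpha * |score|, so a displayed-but-not-clicked pair gets slightly more weight than a pair that was never displayed, which seems to match what you describe.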
I will do more experiments and give you feedback :) Thank you ;)

> On Feb 26, 2015, at 23:16, Sean Owen <so...@cloudera.com> wrote:
>
> I believe that's right, and is what I was getting at. Yes, the implicit
> formulation ends up implicitly including every possible interaction in
> its loss function, even unobserved ones. That could be the difference.
>
> This is mostly an academic question, though. In practice, you have
> click-like data and should be using the implicit version for sure.
>
> However, you can give negative implicit feedback to the model. You
> could consider no-click as a mild, observed, negative interaction.
> That is: supply a small negative value for these cases. Unobserved
> pairs are not part of the data set. I'd be careful about assuming the
> lack of an action carries signal.
>
>> On Thu, Feb 26, 2015 at 3:07 PM, 163 <lisend...@163.com> wrote:
>>
>> Oh my god, I think I understand now...
>> In my case, there are three kinds of user-item pairs:
>>
>> displayed and clicked pairs (positive pairs)
>> displayed but not clicked pairs (negative pairs)
>> not displayed pairs (unobserved pairs)
>>
>> Explicit ALS only considers the first and the second kinds,
>> but implicit ALS considers all three kinds (and treats the third kind
>> like the second, because their preference values are all zero and
>> their confidences are all 1).
>>
>> So the results are different, right?
>>
>> Could you please give me some advice on which ALS I should use?
>> If I use implicit ALS, how do I distinguish the second and the third
>> kinds of pairs? :)
>>
>> My opinion is that in my case I should use explicit ALS...
>>
>> Thank you so much.
>>
>> On Feb 26, 2015, at 22:41, Xiangrui Meng <m...@databricks.com> wrote:
>>
>> Lisen, did you use all m-by-n pairs during training? The implicit model
>> penalizes unobserved ratings, while the explicit model doesn't. -Xiangrui
>>
>>> On Feb 26, 2015 6:26 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>>
>>> +user
>>>
>>>> On Thu, Feb 26, 2015 at 2:26 PM, Sean Owen <so...@cloudera.com> wrote:
>>>>
>>>> I think I may have it backwards, and that you are correct to keep the 0
>>>> elements in train() in order to try to reproduce the same result.
>>>>
>>>> The second formulation is called 'weighted regularization' and is used
>>>> for both implicit and explicit feedback, as far as I can see in the code.
>>>>
>>>> Hm, I'm actually not clear why these would produce different results.
>>>> Different code paths are used, to be sure, but I'm not yet sure why they
>>>> would give different results.
>>>>
>>>> In general you wouldn't use train() for data like this, though, and you
>>>> would never set alpha=0.
>>>>
>>>>> On Thu, Feb 26, 2015 at 2:15 PM, lisendong <lisend...@163.com> wrote:
>>>>>
>>>>> I want to confirm the loss function you use (sorry, I'm not so familiar
>>>>> with Scala, so I did not understand the MLlib source code).
>>>>>
>>>>> According to the papers:
>>>>>
>>>>> in your implicit-feedback ALS, the loss function is (ICDM 2008):
>>>>>
>>>>>   min_{x,y} sum_{u,i} c_ui * (p_ui - x_u' * y_i)^2
>>>>>             + lambda * (sum_u ||x_u||^2 + sum_i ||y_i||^2)
>>>>>
>>>>> where c_ui = 1 + alpha * r_ui and p_ui = 1 if r_ui > 0, else 0;
>>>>>
>>>>> in the explicit-feedback ALS, the loss function is (Netflix 2008):
>>>>>
>>>>>   min_{x,y} sum_{observed (u,i)} (r_ui - x_u' * y_i)^2
>>>>>             + lambda * (sum_u n_u * ||x_u||^2 + sum_i m_i * ||y_i||^2)
>>>>>
>>>>> where n_u and m_i are the numbers of observed ratings for user u and
>>>>> item i.
>>>>>
>>>>> Note that besides the difference in the confidence parameter c_ui, the
>>>>> regularization is also different. Does your code also have this
>>>>> difference?
>>>>>
>>>>> Best Regards,
>>>>> Sendong Li
>>>>>
>>>>>> On Feb 26, 2015, at 9:42 PM, lisendong <lisend...@163.com> wrote:
>>>>>>
>>>>>> Hi Meng, Fotero, Sowen:
>>>>>>
>>>>>> I'm using ALS with Spark 1.0.0; the code should be:
>>>>>>
>>>>>> https://github.com/apache/spark/blob/branch-1.0/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
>>>>>>
>>>>>> I think the following two methods should produce the same (or nearly
>>>>>> the same) result:
>>>>>>
>>>>>> MatrixFactorizationModel model = ALS.train(ratings.rdd(), 30, 30, 0.01, -1, 1);
>>>>>>
>>>>>> MatrixFactorizationModel model = ALS.trainImplicit(ratings.rdd(), 30, 30, 0.01, -1, 0, 1);
>>>>>>
>>>>>> The data I used is a display log; the format of the log is:
>>>>>>
>>>>>> user item if-click
>>>>>>
>>>>>> I use 1.0 as the score for a click pair, and 0 as the score for a
>>>>>> non-click pair.
>>>>>>
>>>>>> In the second method, alpha is set to zero, so the confidences for
>>>>>> positive and negative pairs are both 1.0 (right?).
>>>>>>
>>>>>> I think the two methods should produce similar results, but the second
>>>>>> method's result is very bad (the AUC of the first result is 0.7, while
>>>>>> the AUC of the second result is only 0.61).
>>>>>>
>>>>>> I could not understand why. Could you help me?
>>>>>>
>>>>>> Thank you very much!
>>>>>>
>>>>>> Best Regards,
>>>>>> Sendong Li