No, the inputs are weights or strengths. The output is a factorization of the
binarization of that input to 0/1, not probabilities, and not a factorization
of the input itself. This explains the range of the output.
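
For concreteness, here is a minimal NumPy sketch (toy counts, illustrative
alpha) of what the implicit-feedback formulation factors: the binarized
preference matrix P, with the raw counts entering only through the confidence
weights C = 1 + alpha * R:

import numpy as np

# Toy user-item count matrix R (e.g. view counts); 0 means no observed interaction.
R = np.array([[5., 0., 2.],
              [0., 1., 0.]])

alpha = 40.0                  # illustrative value, matching the thread below
P = (R > 0).astype(float)     # binarized preference matrix: what ALS reconstructs
C = 1.0 + alpha * R           # confidence weights: how strongly each cell counts

# The implicit-feedback objective (Hu/Koren/Volinsky) minimizes, over user
# factors X and item factors Y:
#     sum over (u, i) of C[u, i] * (P[u, i] - dot(X[u], Y[i]))**2  + regularization
# so dot(X[u], Y[i]) approximates P's 0/1 entries, weighted by C. It is not a
# probability and is not constrained to [0, 1].

The counts themselves are never reconstructed; they only decide how much each
0/1 entry is trusted.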

On Thu, Dec 15, 2016, 23:43 Manish Tripathi <tr.man...@gmail.com> wrote:

> When you say *implicit ALS is factoring the 0/1 matrix*, are you saying that
> for the implicit feedback algorithm we need to pass the input data as the
> preference matrix, i.e. a matrix of 0s and 1s?
>
> Then how would it calculate the confidence matrix, which is basically
> 1 + alpha * count_matrix? If we don't pass the actual counts (views etc.),
> how does Spark calculate the confidence matrix?
>
> I was under the impression that the input data for als.fit(implicitPrefs=True)
> is the actual count matrix of the views/purchases. Am I going wrong here? If
> so, how is Spark calculating the confidence matrix if it doesn't have the
> actual count data?
>
> The original paper on which the Spark algorithm is based needs the actual
> count data to create the confidence matrix, and also needs the 0/1 matrix,
> since the objective function uses both to find the user and item factors.
>
> On Thu, Dec 15, 2016 at 3:38 PM, Sean Owen <so...@cloudera.com> wrote:
>
> No, you can't interpret the output as probabilities at all. In particular,
> the values may be negative. It is not predicting a rating but an interaction:
> a negative value means the user is very strongly predicted not to interact.
> And no, implicit ALS *is* factoring the 0/1 matrix.
>
> On Thu, Dec 15, 2016, 23:31 Manish Tripathi <tr.man...@gmail.com> wrote:
>
> OK. So we can kind of interpret the output as probabilities even though it
> is not modeling probabilities. This is so it can be used with the
> BinaryClassificationEvaluator.
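>
> For reference, a rough sketch of that evaluation step (the column names and
> the cast are assumptions; ALS emits a float prediction column, while the
> evaluator expects doubles):
>
> from pyspark.ml.evaluation import BinaryClassificationEvaluator
> from pyspark.sql.functions import col
>
> # predictions = model.transform(validation), joined with a binarized 0/1 "label" column
> scored = predictions.withColumn("prediction", col("prediction").cast("double"))
> evaluator = BinaryClassificationEvaluator(rawPredictionCol="prediction",
>                                           labelCol="label",
>                                           metricName="areaUnderROC")
> auc = evaluator.evaluate(scored)
>
> AUC only needs a ranking, so the raw reconstruction scores work here even
> though they are not probabilities.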
>
> So as I understand it, and as per the algorithm, the predicted matrix is
> basically the dot product of the user factor and item factor matrices.
>
> But under what circumstances can the predicted ratings be negative? I can
> understand that if the individual user factor and item factor vectors have
> negative terms, the dot product can be negative. But practically, does a
> negative value make any sense? As per the algorithm the dot product is the
> predicted rating, so the rating shouldn't be negative for it to make sense.
> Also, is a rating between 0 and 1 a normalised rating? Typically we expect
> ratings to be real values like 2.3, 4.5, etc.
>
> Also please note that for implicit feedback ALS, we don't feed in a 0/1
> matrix. We feed in the count matrix (discrete count values), and I am
> assuming Spark internally converts it into a preference matrix (0/1) and a
> confidence matrix = 1 + alpha * count_matrix.
>
>
>
> On Thu, Dec 15, 2016 at 2:56 PM, Sean Owen <so...@cloudera.com> wrote:
>
> No, ALS is not modeling probabilities. The outputs are reconstructions of a
> 0/1 matrix. Most values will be in [0, 1], but it's possible to get values
> outside that range.
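>
> A tiny illustration of why (the factor values below are made up): the
> prediction is just a dot product of latent factors, and nothing constrains
> that to [0, 1]:
>
> import numpy as np
> user = np.array([0.8, -0.6])   # hypothetical user factors
> item = np.array([-0.5, 0.7])   # hypothetical item factors
> print(user.dot(item))          # -0.82: a perfectly valid, strongly negative score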
>
> On Thu, Dec 15, 2016 at 10:21 PM Manish Tripathi <tr.man...@gmail.com>
> wrote:
>
> Hi
>
> I ran the ALS model for the implicit feedback case. Then I used the
> .transform method of the model to predict the ratings for the original
> dataset. My dataset is of the form (user, item, rating).
>
> I see something like below:
>
> predictions.show(5, truncate=False)
>
>
> Why is the last prediction value negative? Isn't the transform method giving
> the prediction (probability) of seeing the rating as 1? I had count data for
> the rating (implicit feedback), and for the validation dataset I binarized
> the rating (1 if > 0, else 0). My training data has positive ratings only
> (it's basically the count of views of a video).
>
> I used following to train:
>
> als = ALS(rank=x, maxIter=15, regParam=y, implicitPrefs=True, alpha=40.0)
> model = als.fit(self.train)
>
> What does a negative prediction mean here, and is it OK to have that?
>
