Hello Sean,

Thank you very much for the quick response.
That helps me understand it much better!

Best regards,
Hiro

On Thu, Feb 25, 2016 at 6:59 PM, Sean Owen <so...@cloudera.com> wrote:

> This isn't specific to Spark; it's from the original paper.
>
> alpha doesn't do a whole lot, and it is a global hyperparameter. It
> controls the relative weight of observed versus unobserved
> user-product interactions in the factorization. Higher alpha means
> it's much more important to faithfully reproduce the interactions that
> *did* happen as a "1" than to reproduce the interactions that *didn't*
> happen as a "0".
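>
> To make that concrete (a rough sketch of the scheme from the paper,
> not Spark's internal code): each observed count r_ui gets a
> confidence c_ui = 1 + alpha * r_ui, and the factorization minimizes
> the confidence-weighted squared error against the 0/1 preference
> matrix. For example:
>
>     # Illustrative only -- how alpha turns a raw implicit count into a
>     # confidence weight (Hu/Koren/Volinsky); not Spark's actual code.
>     def confidence(r, alpha):
>         # unobserved entries (r == 0) keep the baseline weight of 1
>         return 1.0 + alpha * r
>
>     for alpha in (1.0, 10.0, 40.0):
>         # an interaction seen 5 times vs. one never seen at all
>         print(alpha, confidence(5, alpha), confidence(0, alpha))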
>
> I don't think there's a good rule of thumb about what value to pick;
> it can't be less than 0 (less than 1 doesn't make much sense either),
> and you might just try values between 1 and 100 to see what gives the
> best result.
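>
> For example, one straightforward way to scan that range (a sketch
> using the pyspark.mllib ALS API, with illustrative rank/iteration/
> lambda values; use whatever held-out ranking metric you trust for
> the comparison):
>
>     from pyspark.mllib.recommendation import ALS
>
>     # `training` is assumed to be an RDD of Rating(user, product, count)
>     # built from the implicit feedback data.
>     for alpha in [1.0, 10.0, 40.0, 100.0]:
>         model = ALS.trainImplicit(training, rank=10, iterations=10,
>                                   lambda_=0.01, alpha=alpha)
>         # evaluate `model` on held-out interactions and keep the alpha
>         # that scores best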
>
> I think that generally sparser input needs a higher alpha; perhaps
> alpha really ought to be a function of the sparsity, but I've never
> seen that done.
>
>
>
> On Thu, Feb 25, 2016 at 6:33 AM, Hiroyuki Yamada <mogwa...@gmail.com>
> wrote:
> > Hi, I've been doing a POC for collaborative filtering in MLlib.
> > In my environment, the ratings are all implicit, so I am trying to use
> > the trainImplicit method (in Python).
> >
> > The trainImplicit method takes alpha as one of its arguments to specify a
> > confidence for the ratings, as described in
> > <http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html>,
> > but the alpha value is global for all the ratings, so I am not sure why we
> > need it.
> > (If it were per rating, it would make sense to me, though.)
> >
> > What difference does setting different alpha values make on exactly the
> > same data set?
> >
> > I would very much appreciate it if someone could give me a reasonable
> > explanation for this.
> >
> > Best regards,
> > Hiro
>