Implicit matrix factorization returning different results between spark 1.2.0 and 1.3.0

Ravi Mody Thu, 26 Mar 2015 08:00:04 -0700

After upgrading to 1.3.0, ALS.trainImplicit() has been returning vastly
smaller factors (and hence scores). For example, the first few product
factor values in 1.2.0 are (0.04821, -0.00674, -0.0325). In 1.3.0, the
first few factor values are (2.535456E-8, 1.690301E-8, 6.99245E-8). This
difference of several orders of magnitude is consistent across both the
user and product factors. The recommendations from 1.2.0 are subjectively
much better than those from 1.3.0. On the other hand, 1.3.0 trains
significantly faster than 1.2.0 and uses less memory.
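
One way to put a number on the gap is to compare the mean absolute factor
value of the two models. The following is only a minimal sketch:
meanAbsFactor is a hypothetical helper, not part of MLlib, and the sample
vectors are just the three values quoted above for each version.

```scala
// Hypothetical helper (not an MLlib API): mean absolute value over a
// collection of factor vectors. Returns 0.0 for an empty collection.
def meanAbsFactor(factors: Iterable[Array[Double]]): Double = {
  val values = factors.flatMap(_.toSeq)
  if (values.isEmpty) 0.0
  else values.map(v => math.abs(v)).sum / values.size
}

// The factor values quoted above, one vector per Spark version.
val sample120 = Seq(Array(0.04821, -0.00674, -0.0325))
val sample130 = Seq(Array(2.535456E-8, 1.690301E-8, 6.99245E-8))

// Ratio of the typical factor magnitude in 1.2.0 to that in 1.3.0.
val scaleRatio = meanAbsFactor(sample120) / meanAbsFactor(sample130)
println(s"1.2.0 / 1.3.0 mean |factor| ratio: $scaleRatio")
```

On a trained model, the same helper could be applied to a sample of the
factors, for example meanAbsFactor(model.productFeatures.map(_._2).take(1000)),
where 1000 is an arbitrary sample size.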


My first thought is that there is too much regularization in the 1.3.0
results, but I'm using the same lambda parameter value. This is a snippet
of my Scala code:
.....
val rank = 75
val numIterations = 15
val alpha = 10
val lambda = 0.01
val model = ALS.trainImplicit(train_data, rank, numIterations,
lambda=lambda, alpha=alpha)
.....

The code and input data are identical across both versions. Did anything
change between the two versions that I'm not aware of? I'd appreciate any
help!
